July 2008 - Posts

I'm on vacation this week in northern Michigan, enjoying the good weather and time with my family, and also reading quite a bit, including a book called Beautiful Code published by O'Reilly (more about that in another post). One chapter in particular got me fired up to revisit Euler Problem 14, which I started to work on back in May, but got stuck and have been thinking about it ever since.

The problem goes like this:

The following iterative sequence is defined for the set of positive integers:
 
n -> n/2 (n is even)
n -> 3n + 1 (n is odd)

Using the rule above and starting with 13, we generate the following sequence:
13 40 20 10 5 16 8 4 2 1

It can be seen that this sequence (starting at 13 and finishing at 1) contains 10 terms. Although it has not been proved yet (Collatz Problem), it is thought that all starting numbers finish at 1.

Which starting number, under one million, produces the longest chain?

I'm solving the Euler problems in Matlab, and I'm working on a 867 MHz Mac PowerBook G4. Sometimes I also limit myself to the time it takes to drain the battery, just so that I don't end up spending days and days solving a problem (I have to work once in awhile, and my kids get upset if I ignore them too much :). So my goal is generally to come up with a fast solution, which in most cases rules out brute force.

With these goals in mind, I thought and thought and thought and thought about this problem, trying to come up with a trick that would allow me to rule out most numbers between 1 and 1,000,000. The most obvious thing to try is only the odd numbers. But I wanted to winnow down my search set even further. Not coming up with anything, I gave up and abandoned Euler problems altogether. I had plenty of other things to work on. But this problem kept gnawing at the edges of my brain.

With so many problems in life, sometimes it's best to stop thinking and start doing. So finally this week I threw in the towel and wrote some code to do a brute force search on all odd numbers. It definitely occured to me that once a sequence has been generated for a number, when that number shows up in a subsequent sequence I should reuse what has already been computed. But I got hung up in figuring out how big I should preallocate my array to store the sequence count for each integer, so I even left off that. I just wrote a loop that computes the sequence for every number.

Imagine my surprise when it took only about 15 or 20 minutes to run!

Sometimes brute force isn't so bad. Sometimes the only way to know is to try it.

Here's my code:

function euler14

max_count = 0;
max_nn = 1;
for nn = 3:2:1000000

    new_nn = nn;
    count = 0;

    while new_nn ~= 1
        if mod(new_nn,2) == 0
            new_nn = new_nn/2;
        else
            new_nn = new_nn*3+1;
        end
        count = count + 1;
    end

    if count > max_count
        max_count = count;
        max_nn = nn;
    end

    disp([num2str(nn), ': ', num2str(count)]);
end

disp(['Max: ', num2str(max_count), ' for nn = ', num2str(max_nn)]);

In the past year I've worked on two projects that involved piecing through someone else's Matlab code, making sense of it, making it work, and figuring out how to link to it from a Windows desktop application. In both cases, it was clear that the Matlab had been writen by someone who does not normally spend a lot of time writing code. All of the tenets that we developers hold sacred had been broken: global variables abounded; code was copied over and over; large sections of code were commented out for no apparent reason; and in one of the projects, functions were dispensed with all together in favor of scripts, which often relied on variables that had been declared and set elsewhere in another script. In both cases the original author of the code was not available to explain how things were supposed to work, so fixing the code was a lengthy, tedious process.

It got me wondering...why do engineers write bad Matlab code? Most engineers have had at least one programming course in college, and presumably in that course they are taught how to organize code, declare functions, and pass arguements. Now granted, some people have a harder time than others in figuring out a good architecture for a project. But my observation is that engineers writing Matlab seem to have an aversion to declaring functions. They would rather copy and paste code 20 times than make that code into a function.

And then, as I was working on my most recent Matlab project, it occurred to me what the problem is. When I write C# or C++ code with Microsoft Visual Studio 2005 or 2008, I have IntelliSense, which is a godsend because I can never remember the order in which arguements are supposed to be passed to a function. Matlab does not have anything like IntelliSense, and it has lots and lots of functions, most of them overloaded or with a variable number of arguments. I can't be the only scatter-brained engineer out there who has trouble keeping track of all of my options.

So this blog post is really a plea to The MathWorks folks - give us IntelliSense!

At a minimum, I'd be happy if there were a window that I could have open that would automatically show me the help file for a function as I type its name into an M-file. This would be similar to the context-sensitive help that's available in Labview, another engineering tool that has lots and lots of functions that are hard (for me) to keep track of.