Wednesday, October 1, 2008

Score one for induction!

One of my favorite textbooks on algorithms—in spite of the fact that it's twenty years old—is Udi Manber's Introduction to Algorithms: A Creative Approach. However, I have never been tempted to actually use it in class. You see, the book's greatest strength is also its greatest weakness.

Most textbooks on algorithms show you a lot of polished algorithms, but provide little or no insight on how to go about designing such algorithms. Manber's book does a very good job providing such insight, by making an analogy with inductive proofs. If you have any skill with induction, then you can use Manber's approach to help you design algorithms. (Of course, this should come as no surprise since writing an inductive proof and writing a recursive function are essentially the same activity.)

I'm sure you see the flaw here—most students are even less comfortable with induction than they are with recursion. It's like trying to help somebody learn to drive a car by making an analogy with riding a unicycle. For a certain segment of the audience, that analogy might be just what they need, but for the most of the audience, the analogy will only produce puzzled looks.

I was recently helping a student who was struggling to get a handle on recursion. He was making the common beginner mistake of trying to think about what happens during the recursive call instead of trusting that the recursive call works (sometimes called the recursive leap of faith). By sheer coincidence, I had happened to notice the day before that this student had done very well in our discrete math course the previous year. Time to give it a try...

“You know when you're writing a proof by induction and at some point you use the inductive hypothesis? That's just like a recursive call. You don't worry about what happens inside the inductive hypothesis, you just say ‘Assume that it works for N-1...’ It's the same with recursion. You just assume that the recursive call works.”

That comparison actually seemed to help. Maybe Manber was onto something after all!

Thursday, September 25, 2008

Less than vs Greater than

From math, we know that X < Y is the same as Y > X. But in programming, they're not the same.

Oh, I'm not talking about situations where X and Y have side effects that interact badly. Instead, I'm talking about situations where one may be more error prone than the other.

Although the two expressions are mathematically equivalent, they're not linguistically equivalent. One focuses attention on something being smaller, and the other focuses attention on something being bigger. For example, it is a relatively common mistake to go the wrong way when searching a binary search tree. However, almost every time I run into this error, the student has also written the comparisons backward from the direction I think of as “natural”.

Consider the following function for determining whether a given key x appears in a binary search tree t:

  function member(x,t) is
     if t = null then return false
     if t.key = x then return true
     if t.key < x then return member(x,t.left)
     if t.key > x then return member(x,t.right)
The fact that I've written this using recursion instead of iteration is irrelevant. What matters is that the third case goes left when it should go right, and vice versa for the fourth case. But looking more carefully at that third case
     if t.key < x then return member(x,t.left)
I'm not particularly surprised by the error. The comparison t.key < x focuses attention on something being smaller, and the left side of the tree is where the smaller keys go. By asking whether x is smaller than t.key, rather than whether t.key is smaller than x, I think we're much less likely to fall into this trap.

Saturday, September 13, 2008

Hey, you got your loop in my recursion!

I've written before about trying to diagnose students' broken or ineffective mental models from the mistakes they make. Here's a mistake that I see frequently from students who are not yet comfortable with recursion.

Say you wanted to write a function to calculate the sum of a list using a loop. Many students could easily write something like

   function sum(list) is
      variable total := 0
      while list /= null do
         total := total + list.item
         list :=
      return total
But ask them to write the sum function using recursion, and you might get something like
   function sum(list) is
      variable total := 0
      if list = null then
         return total
         total := total + list.item
         return sum(
Of course, this code always returns 0. It's pretty clear that the writer had the iterative algorithm in mind, and doesn't understand that each recursive call to sum creates a new instance of the total variable.

When I watch such a student writing code like this, he often declares the variable immediately, before even beginning to think about what the recursive decomposition is going to look like, an almost spinal reflex conditioned by several semesters of writing loops. I can explain recursion until I'm hoarse and draw pictures until my hand cramps up, but I can't compete with the will-o'-the-wisp allure of that variable. Once it's there, it will almost inevitably lead the student to his doom in the bogs of iterative thinking.

One trick to help such a student is to break the cycle where it begins, by getting rid of that variable. Tell him to write the function without using any local or global variables. Or, if he really thinks he needs a variable, to declare it as a constant instead. Of course, there are times when a variable is perfectly appropriate inside a recursive function, but such examples can often be avoided until the student has a better grasp of recursion.

Thursday, July 31, 2008

In search of epiphany

I watched the movie Proof last night. I highly recommend it. It's not often that mathematicians get so much screen time.

The nature of mathematical creativity plays a large role in the movie. One comment was especially intriguing. (Minor spoiler ahead.) In describing the process of solving a particularly difficult problem, one character says, “It was like...connecting dots. Some nights I...I could connect three or four of them. And some nights they'd be really far apart. I'd have no idea how to get to the next one, if there was the next one.”

This description rings both true and false for me. The true part is I'd have no idea how to get to the next one, if there was the next one. This nicely summarizes what it's like to work on a hard problem. You don't know what to do next, and you're not sure that a solution even exists.

But the connecting dots part doesn't feel right. It makes the process sound linear, when the reality is anything but. In my experience, the really hard problems—the ones that require the most creativity—are most often solved by epiphany, by having the whole solution burst into your head, seemingly in an instant. That's when the connecting dots part happens, but it all happens at once, not a few a time.

Of course, the catch is that epiphany can't be forced. I'm reminded of the old joke about the lottery. A man prays fervently every night, “Please, Lord, let me win the lottery tomorrow.” After twenty years, a heavenly voice responds, “Would you buy a ticket already!”

No, you can't just sit around waiting for an epiphany to happen. Epiphanies are hard work! TANSTAAFL. So, how do you go about searching for epiphany? Every epiphany I've ever had has been the result of a two-pronged effort.

The first prong is building up sweat equity. I thoroughly explore the space around the problem, playing with the problem from many different angles. Of course, I'm actively trying to solve the problem during this time, because I might get lucky and stumble across a solution. But I don't worry too much if I don't seem to be getting anywhere. This part of the process is not so much about connecting the dots as about building lots and lots (and lots) of potential connections.

The second prong is staying alert. That's easy to say, but really hard to do over long periods of time. Eventually, I hope, some stray thought or some environmental trigger will start a cascade, where some subset of those potential connections will become actual connections, creating a path to the solution. Sometimes this happens when you're relaxed and your mind is drifting, such as in the shower or while driving. I remember once in grad school trying all afternoon to track down a bug. That night my wife and I went over to a friend's house for dinner. Everybody was shocked when, in the middle of beef stroganoff, I suddenly shouted out, “That's it!”

Other times the epiphany can happen while you're actively working on something else. For example, the key insight behind TradeMaximizer came while I was trying to put together a homework for my Algorithms class. I had been working off and on for weeks trying to find cycles in the trade graph. But I had set that aside and was coming up with homework problems, when I stumbled across a stray comment that talked about converting a directed graph into a bipartite graph. That sentence was enough to set off the chain reaction of connections and, within seconds, TradeMaximizer was born.

Probably the most famous story about scientific epiphany is that of Archimedes' bath. Funny story: Shortly after announcing TradeMaximizer to the math trade community on BoardGameGeek, I received a peevish email from another BoardGameGeek user who thought he also had a solution. He described how he had come to BoardGameGeek, eager to share his discovery, only to find my announcement first. He complained “Argh! Way to shout ‘Eureka!’ while Archimedes is drying himself off!” (The other solution turned out to be bogus.)

Wednesday, July 16, 2008

Games for Programmers: Black Vienna

The next in an irregular series of board game recommendations for programmers. Of course, you don't have to be a programmer to enjoy these games, but if you are a programmer, then I think there's a pretty good chance that you'll like them.

O frabjous day!

My favorite deduction game, Black Vienna, has long been out of print, but now you can give it a try online, thanks to the efforts of Greg Aleknevicus. Go check out Black Vienna Online. It's free and easy to use. The user interface isn't going to win any awards, but it gets the job done. I'm thrilled that this game can now reach a wider audience.

Most deduction games are for two players, but Black Vienna handles three to six players. Your goal is to figure out the members of the infamous “Black Vienna” gang. There are 27 suspect cards, labeled A-Z and Ö. At the beginning of the game, three of these cards are chosen randomly and set aside. Those are the members of the gang.

The remaining 24 suspect cards are dealt out evenly to the players. (With 5 players, one of the players gets only 4 cards.) These are the cards of suspects for whom you can provide alibis.

On your turn you conduct an investigation. There is a pool of available investigation cards, each showing three suspects. You choose one of the available investigation cards and place it in front of another player. That player marks the card with plastic chips, according to how many of the suspects are in her hand. For example, if I have the cards BGMTXY and somebody plays the BQT card on me, I would mark it with two chips (for the B and the T). Other players would then know that I had two of the three suspects, but not necessarily which ones.

The turn then passes to the player who was just investigated.

One nice thing about Black Vienna as a game is that every player is involved on every turn. Even if you are not actively part of the investigation, you still need to update your information with the results of the investigation, and follow any chains of deductions to their conclusions.

Example Deductions

Some deductions are obvious. For example, if I played two chips on the BQT card, and you have the Q card in your hand, then you know that I have the B and the T. Other deductions are less obvious. Here's an actual example from a recent game. Call the other players Caitlin, Monica, Drew, and Dom.

Monica played one chip on GPX, but I had the X, so I knew that she had either G or P (but not both). Similarly, she also played one chip on DHR, but I knew Dom had the R, so Monica had either D or H. Finally, she played two chips on LNV.

Drew played one chip on DVY and zero chips on FÖY, so he had either D or V. Between Drew and Monica, they had at least 4 of the 5 cards DHLNV and at least 5 of the 7 cards DGHLNPV (possibly more if Drew had any of GHLNP).

Meanwhile, I knew all but two of Caitlin's cards. The only possibilities for those last two cards were DGHLNPV. Monica and Drew together had at least 5 of those 7 cards, so Monica, Drew, and Caitlin together must have all 7.

This tells me that none of these 7 cards are in the gang, and none of these 7 cards are in Dom's hand. It also tells me that Drew does not have any of GHLNP.

Finally, I can be more specific about Caitlin's cards. One of her two cards is G or P, and the other is one of DHLNV.

Online Play

The online implementation is asynchronous. You do not need to be logged in at the same time as other players. Instead, the system emails you whenever a turn is taken with the results of the investigation. When it is your turn, you can log on and take your turn at your leisure. This means that a 45-minute face-to-face game can take days or even weeks when played online, but it also means that you can play the game with odd minutes here and there, instead of having to carve out a solid chunk of time. An additional benefit is that it is easy to play with friends in different time zones. For example, I frequently play with friends on the other side of the country.

However, the greatest benefit of the online implementation is that it does away with the greatest weakness of face-to-face play—player errors. Not errors in deductions, which would be your own fault and would only affect you, but errors in answering investigations. If a player carelessly puts out, say, one chip when he should have put out two chips, it ruins the game for everyone. Even if the player notices the error a few turns later and tries to correct it, it's far too late. By then, the other players have already updated their notes with deductions made from the faulty answer, and typically have no way of backing out of those deductions.

If you want to give Black Vienna a try, see the detailed rules of the game and the instructions for the online interface.

Addendum: I've played in several three-player games recently. It's ok with that number, but I think it's significantly better with four or five players. (I haven't tried six yet.) With three players, the deductions are less interesting, and there is a greater chance for one player to get lucky.

Tuesday, July 8, 2008

Breadth-First Numbering: An Algorithm in Pictures

I've always enjoyed the mathematical tradition of “proofs without words”, so I wanted to see if something similar could be done with an algorithm. Of course, an algorithm is typically more complicated than a mathematical identity, so it's not surprising that an algorithm typically can't be captured in a single picture. Instead, I've found the idea of a commutative diagram to be a useful pictorial shorthand, where the short way around the diagram gives a high-level view of the operation, and the long way around gives more detail. Here is an attempt to describe my breadth-first numbering algorithm from ICFP 2000 using this idea.




Wednesday, July 2, 2008

Functional Programming in...Ada?

Ada is not the first language that comes to mind when you want to do functional programming. (You in the back, stop laughing!) Why, with excellent functional programming languages like Haskell or ML easily available, would I choose to do a functional programming project in Ada? (I CAN STILL HEAR YOU LAUGHING!) Partly because we use Ada in some courses at our institution, and I wanted a library that could help in teaching recursion to novices. But, honestly, mostly I just wanted to see if it could be done.

“It's like this fellow I knew in El Paso. One day, he just took all his clothes off and jumped in a mess of cactus. I asked him that same question, ‘Why?’...He said, ‘It seemed like a good idea at the time.’”
– Steve McQueen, The Magnificent Seven

Somewhat to my surprise, functional programming actually works reasonably well in Ada. You wouldn't want to program a large system this way, but small-scale projects are quite possible, albeit much more verbose than functional programmers are used to.

Instrumenting recursive functions

My application is going to be a library for instrumenting recursive functions. I want to be able to take a recursive function and run it with different combinations of monitors. Each monitor executes a little bit of extra code every time the recursive function is called, including recursive calls. Simple examples of useful monitors include counting the number of calls, or tracking the maximum call depth, or printing trace information on every call or return. More complicated examples include adding memoization to a recursive function or automatically detecting when a function is stuck in an infinite recursion.

Ideally, the library should be easy enough for students to use, but I'll settle for it being easy enough for instructors to use.

Of course, there are many ways to design such a library. I'm going to use a design based on ideas from functional programming. The basic ideas have been folklore in the functional programming community for a long time. For example, I remember doing something similar in SML in the early '90s. Bruce McAdam wrote a nice tech report describing these techniques in great detail. The trick here is going to be figuring out how to implement this design in Ada.

(Incidentally, if anybody has a truly object-oriented approach to solving the same problem, I'd be interested in seeing it.)

Preparing the recursive function

Without some kind of support for reflection, it seems too much to hope that I'll be able to observe the recursive calls inside a recursive function. Therefore, I'm willing to require some modifications to the recursive function, as long as those modifications are fairly straightforward. However, I should only need to modify the function once, rather than every time I run it with a different monitor.

Stealing an idea from functional programming called the Y combinator, I'll modify each recursive function to take an extra argument that is itself a function (or, more specifically, a pointer to a function). The recursive function will call this passed-in function wherever it would normally call itself. The passed-in function will eventually call the recursive function, but may run some other code first.

Here's an example using the ever popular Fibonacci function. The original recursive function might be written

 function Fib(X : Integer) return integer is
   if X <= 1 then
     return 1;
     return Fib(X-1) + Fib(X-2);
   end if;
 end Fib;
Adding the extra argument and replacing the recursive calls with calls to the passed-in function yields
 function Fib(Rec : access function(X : Integer)
                    return Integer;
              X : Integer) return integer is
   if X <= 1 then
     return 1;
     return Rec(X-1) + Rec(X-2);
   end if;
 end Fib;
where the changes are shown in red. Note that access function(X : Integer) return Integer is the type of a pointer to a function that takes an integer and returns an integer. In the rest of this post, I'll assume that the functions being instrumented always take an integer and return an integer. If I can get this to work at all, then generalizing this to arbitrary argument and return types will be trivial using Ada's notion of generics.

Running the recursive function

Now, to simply run this function recursively without attaching a monitor, I'll call it with Run, which is responsible for “tying the knot” by somehow making the function call itself. The Run function is written

 function Run(Fun : ...type to be filled in 
              X : Integer) return Integer is
   function Rec(X : Integer) return Integer is
     Fun(Rec'Access, X);
   end Rec;
   return Rec(X);
 end Run;
Here, Fun is a function like the modified Fib above. Run defines a local helper functon, Rec, that merely passes itself to Fun.

I would call this function by writing something like

   ...Run(Fib'Access, 10)...
where Fib'Access is Ada-speak for a pointer to the Fib function. This calls Rec(10), which calls Fib(Rec'Access,10), which in turn calls Rec(9), which in turn calls Fib(Rec'Access,9), and so on. Eventually, the whole thing returns 89, as expected.

I left out the type of Fun above, because it starts to get unwieldy. I can figure out what this type should be by looking at the type of Fib. The full signature of Run is

 function Run(Fun : access function
                (Rec : access function(X:Integer)
                       return Integer;
                 X : Integer) return Integer;
              X : Integer) return Integer;

A small lie

What I'd really like to be able to do is define a type abbreviation

 type Fun_Type is access function
   (Rec : access function(X:Integer)
          return Integer;
    X : Integer) return Integer;
and then write the signature of Run as
 function Run(Fun : Fun_Type; 
              X : Integer) return Integer;
Unfortunately, I can't. Or rather, I can, but it turns out to be useless. The problem is that Ada is really paranoid about what you do with pointers to functions. In particular, it won't let a pointer to a function match a named type if that function is defined in a deeper lexical scope than the type definition. This restriction does not apply to so-called anonymous access types, so I'm forced to rewrite the entire type everywhere it's used rather than giving it a name and referring to the name.

However, to simplify the presentation, I'm going to pretend that I can use Fun_Type as defined above. Just remember that, in the real implementation, every occurrence of Fun_Type needs to be replaced with the more verbose anonymous type. I'll write Fun_Type in italics as a reminder that it's not the real code.

A simple monitor

Now, here is a simple monitor that counts the number of calls to the recursive function, and prints out that number at the end.

 function Count(Fun : Fun_Type;
                X : Integer) return Integer is

   Num_Calls : Integer := 0;

   function My_Fun
       (Rec : access function(X:Integer) 
              return Integer;
        X : Integer) return Integer is
     Num_Calls := Num_Calls + 1;
     return Fun(Rec,X);
   end My_Fun;

   Result : Integer := Run(My_Fun'Access,X);
   Put("Number of calls = ");
   return Result;
 end Count;

Now, when I call

   Put( Count(Fib'Access,10) );
I get
Number of calls = 177
where the first line is printed by the monitor and the second line is printed by the external Put.

The Count monitor creates a new function My_Fun to give to Run. My_Fun simply increments Num_Calls and calls Fun. Now, every time Fun makes a recursive call to Rec, Rec will call My_Fun, which calls Fun.

So basically, Count “wraps” Fun with a little bit of code to increment the Num_Calls variable, and this code gets executed every time the Rec function is called.

In addition, Count does a little bit of work around the top-level recursive call, namely initializing the Num_Calls variable to 0, and printing the number of calls at the end.

A memoization monitor

Here's a more complicated monitor. This one modifies the recursive function to do memoization (first cousin to dynamic programming). The idea is that, when a function is called multiple times on the same arguments, you can save time by remembering those arguments and the corresponding results. Then when the function is called with arguments you have seen before, you can simply return the stored result, instead of recomputing it.

 function Memoize(Fun : Fun_Type; 
                  X : Integer) return Integer is

   Results : array (0..X) OF Integer;
   Ready   : array (0..X) OF Boolean 
               := (others => False);

   function My_Fun
       (Rec : access function(X:Integer) 
              return Integer;
        X : Integer) return Integer is
     if not Ready(X) then
       Results(X) := Fun(Rec,X);
       Ready(X) := True;
     end if;
     return Results(X);
   end My_Fun;

   return Run(My_Fun'Access,X);
 end Memoize;
The difference can be dramatic.
   ...Memoize(Fib'Access, 30)...
executes the body of Fib a total of 31 times, whereas
   ...Run(Fib'Access, 30)...
executes the body of Fib almost 2.7 million times (2692537 times, to be exact).

I made an assumption above that, if the original argument was X, then the range of possible arguments in all the calls is 0..X. This is good enough to illustrate the idea, but a more general implementation might be parameterized by the bounds to use, or by a function that calculates those bounds from the original X.

Multiple monitors

How did I know above that Run executes the body of Fib 2692537 times? By running Fib with the Count monitor. How did I know that Memoize executes the body of Fib 31 times? By running Fib with both the Memoize and Count monitors together.

Except that, as currently written, the Memoize and Count monitors can't be used together. I can run Fib with one or the other, but not both simultaneously.

What I want is a way to layer one monitor on top of another, building arbitrary stacks of monitors. The clues to how to do this are already in the current implementations of Memoize and Count.

Notice that both Memoize and Count have the same interface as Run. In other words, a monitor is an alternative run function that adds some extra machinery to the basic Run. Further, notice that both Memoize and Count build a My_Fun function by wrapping some extra code around Fun and then run My_Fun using the Run function. The key insight is that we can parameterize Memoize and Count by which run function to use for running My_Fun. That might be the basic Run function, but it might also be a run function corresponding to a different monitor.

To achieve this parameterization, I'll use generics instead of passing function pointers. The existing definitions of Memoize and Count will still work, but they need to be preceded by additional code to take a run function as a generic parameter. For example,

   with function Run(Fun : Fun_Type; 
                     X : Integer) return Integer;
 function Count(Fun : Fun_Type; 
                X : Integer) return Integer;

 function Count(Fun : Fun_Type; 
                X : Integer) return Integer is
     ... Run(My_Fun'Access, X);
The new parts are in red. I didn't change the existing code of Count at all, but now the call to Run in the body of Count refers to the run function passed in as a generic parameter.

Using a monitor is now a two-stage process, first instantiating the generic function to create the desired run function, and then calling that run function. For example,

 function C_Run is new Count(Run);
 function MC_Run is new Memoize(C_Run);

 ...MC_Run(Fib'Access, 30)...
Notice that the basic Run function is always used in the first instantiation, where it serves as the base of the stack of monitors.

But wait! Something strange is happening here. This program displays 59 calls, not 31 calls as expected. However, if we layer the monitors in the opposite order

 function M_Run is new Memoize(Run);
 function CM_Run is new Count(M_Run);

 ...MC_Run(Fib'Access, 30)...
then we get the 31 calls.

It's really a question of when the memoization code checks for a repeated argument relative to when the counting code increments the counter. If the memoization check happens before the increment, we get 31 calls. If the memoization check happens after the increment, we get 59 calls. Either way, the body of the original Fib function is only executed 31 times.

And that's it, really. There's still a lot of grunt work involved in fleshing out a complete, robust library, but all the important design decisions have already been made. Implementing a wide variety of monitors is fairly easy using Count and Memoize as templates.

Generics vs function pointers

One aspect of this design deserves a little more discussion. I'm mixing two different styles of passing functions as arguments, using generics in some places and function pointers in other places. Why not just use one style consistently? Because the two styles of passing around functions have different strengths and weaknesses.

One major difference is that generics work much better than function pointers when you want to return a function. For example, the generic Count monitor takes a run function and returns a run function. It's easy to imagine writing Count as

 function Count(Run : Run_Type) 
                return Run_Type is ...
where Run_Type is the appropriate access function type. But, in practice, this doesn't work because Ada lacks closures. The natural way to write this would be
 function Count(Run : Run_Type) return Run_Type is

   function My_Run(Fun : Fun_Type; 
                   X : Integer) return Integer is

   return My_Run'Access;
 end Count;
but you can't return My_Run'Access because My_Run goes away as soon as Count returns.

Note that it is possible to get around this restriction against returning function pointers by re-writing strategic code fragments in continuation-passing style, so that the function pointers in question can be passed to a continuation instead of returned. This works fine on a small scale, but quickly blows the runtime stack when attempted on a large scale.

A second major difference is that it is easier to write third-order (or even higher-order) functions using function pointers than using generics. A first-order function takes an ordinary value as an argument. A second-order function takes a first-order function as an argument. A third-order function takes a second-order function as an argument, and so on. Now, consider a monitor like Count. Count takes a Run function, which takes a Fun function, which takes a Rec function, so Count is fourth-order!

Currently, I pass the Run functions using generics, but the Fun functions using function pointers. (Ignore the Rec functions for now.) Suppose I wanted to pass both Run functions and Fun functions using generics. This would mean that each Run function would take its Fun function as a generic parameter. But that means that, when we pass a Run function to a monitor like Count, we need to pass it as a generic function. In other words, Count would need to be a generic function that takes a generic function as an argument. That's not allowed in Ada. Or, more truthfully, it is sometimes allowed, but it's extremely painful.

In Ada, a generic function can neither take a generic function as an argument nor return a generic function as a result. However, a package can contain generic components, so you can sometimes fake it wrapping the argument or result in a package. For example, you can fake a generic function that returns a generic function by writing a generic package that contains a generic function as a component.

In a limited way, you can do the same thing in the other direction. If you want a generic function to take a generic function as an argument, you can fake it by wrapping the argument in a package, so that the outer generic function takes a package (which contains a generic function as a component). The catch is that you can't instantiate the outer generic function with just any old package meeting a certain specification, but only with packages that are themselves instantiations of a single generic package. Making all that work is an exercise in pain that I leave to the most masochistic of readers.

Wednesday, June 4, 2008

No Applause, Please

My school recently had its graduation. This year's graduating seniors (the computer science majors, anyway) are particularly memorable for me. For one thing, I had more success than usual in luring them into playing board games, especially Race for the Galaxy, Attika, RoboRally, and Ricochet Robots.

However, this group also stood out for a classroom behavior that I'm not used to—applause.

It started in Algorithms. In this course, when a homework is due, I usually have several students present their solutions. If the student solutions have not illustrated a point that I wanted to make, I will then present my solution, which is usually more elegant, and sometimes significantly more efficient as well. My goal in doing this is not to show them up, or to say “hey, dummy, this is what you should have written”, but rather to help them catch a glimpse of the beauty I see in an elegant algorithm, to inspire them with what is possible. (I do worry sometimes that fragile personalities might find the experience demoralizing, rather than inspiring.)

Last year, after presenting such a solution, I was very surprised when one particular student started clapping. He continued to do this in future weeks, and eventually pressured other students into joining him.

At first, I was taken aback, but over time, I got more and more frustrated. It felt like the applause was saying “this is so far beyond me that I could never hope to match it”, whereas I was hoping for a reaction more like “this may be beyond me right now, but, by golly, if I work at it, I'll be able to do that someday”. Eventually, I snapped, “C'mon, I'm not trying to impress you with my brilliance. I want you to impress me with your brilliance!”

Friday, May 23, 2008

A Story About Odd and Even Numbers

This is a story my son and I wrote together when he was four. He came up with the basic story. I helped with the English and with the turtle character.

The Seven Jellybeans

Once upon a time, there was a lizard and a snake who were friends. They played together and had fun everyday until one day, the lizard's daddy gave them a bag of jellybeans to share.

In the bag, there were seven jellybeans. The lizard divided the jellybeans into two piles. His pile had 5 jellybeans, and the snake's pile had 2 jellybeans.

The snake said, “Wait a minute. That's not fair. You have more jellybeans than I do. You'd better give me one more jellybean.”

So the lizard gave the snake one more jellybean. And now the lizard had 4 jellybeans, and the snake had 3 jellybeans. The snake said, “Hey, that's still not fair. You still have more jellybeans than I do. You'd better give me one more jellybean.”

So the lizard gave the snake one more jellybean. And now the lizard had 3 jellybeans, and the snake had 4 jellybeans. Now the snake was happy, but the lizard said, “Hey, that's not fair. Now you have more jellybeans than I do. You'd better give me back one jellybean.”

So the snake gave the lizard back one jellybean. And now the lizard had 4 jellybeans, and the snake had 3 jellybeans. The snake said, “Hey, what happened? Now you have more jellybeans again. You must have cheated!”

And then the lizard and the snake began to fight.

While they were fighting, their friend the turtle walked by. He asked them why they were fighting, and the lizard and the snake told him.

“Hmm,” said the turtle. “If I solve your problem, will you give me one jellybean?”

The lizard and the snake didn't want to share their jellybeans, but they didn't want to fight either, so they agreed.

The turtle slowly ate one jellybean. Then he turned around and began to walk away.

“Wait a minute,"” shouted the lizard and the snake together. “You said that if we gave you one jellybean, you would solve our problem.”

“I just did,” said the turtle as he continued to walk away.

The lizard and the snake looked at their piles of jellybeans. The lizard had 3 jellybeans and the snake had 3 jellybeans. They happily ate their jellybeans, and soon all of the jellybeans disappeared.

The next day, the snake's mommy gave them a box of raisins to share. The lizard and the snake opened the box and counted seven raisins. “Oh no,” they said, and they went to look for the turtle to give him one of their raisins.

Tuesday, May 20, 2008

Designing a Data Structure

Students are rarely given an opportunity to design an algorithmically non-trivial data structure. We might give them a scenario and ask them to design how to represent the data, but that design decision is usually something like choosing between a hash table or an array or a linked-structure. Once that high level decision is made, there is very little left to design.

One reason for this is that many data structures depend on non-obvious invariants. Most students are pretty shaky on the whole idea of invariants to begin with; they're never going to spontaneously come up with something like the rules describing red-black trees or leftist heaps. With these kinds of data structures, we basically present the finished product and possibly ask them to implement and/or analyze the data structure, but we never ask them to design something like that.

For several years, I've been trying to give students a taste of real data structure design, using the sample priority queue below. I usually do this in class, where I can help steer them with gentle questioning, but I try very hard to make sure that they come up with the important parts.

I always do this at some point after the students have studied binary heaps (the kind typically used in heapsort). I briefly review the priority queue operations binary heaps support (insert, findMin, and deleteMin) and the basic idea of heap-ordered trees. Then, I introduce the idea of merging two heaps. With binary heaps, this takes O(N) time, or even O(N log N) time, if done naively. I ask them to design a new data structure to support merging in O(log n) time, while the other priority queue operations run in the same bounds as for binary heaps, O(1) for findMin and O(log N) for insert and deleteMin. I suggest that they use heap-ordered binary trees (real trees, rather than simulating trees using arrays like in binary heaps), but that they can modify these trees in any way that is helpful.

With heap-ordered trees, findMin is trivial—it just returns the root. To get their creative juices flowing, I then ask them to implement insert and deleteMin in terms of merge. This is also pretty easy: insert creates a new singleton tree and calls merge; deleteMin discards the root and calls merge on the two children of the root. So the whole problem boils down to merging two trees efficiently.

At this point, they usually flounder for a while, which is both ok and expected. Some floundering is good. Naturally, they'll start to get a little frustrated. That, too, is ok and expected. I just don't want them to get frustrated to the point where they shut down. If the floundering continues too long without any progress, I'll ask them what they can tell immediately about the result tree by only looking at the roots of the input trees. They'll quickly reply that the smaller of the two input roots becomes the root of the result tree. So I have them draw that much of the result. Depending on how things are going, I may further suggest that they draw two question marks for the children of the root in the result tree.

Then the students will play for a while with how to fill in those question marks. There are two slots for subtrees, but we have three subtrees that need to go in those slots: the two subtrees of the “winning” input root plus the “losing” input root. To fit three subtrees into two slots, naturally we need to merge two of them and leave one unchanged. But which two should be merged?

Usually, the students will begin with some fixed strategy, such as always merge the two subtrees of the winning root. But it's easy to come up with counter-examples where such a strategy will take O(N) time.

Once it becomes clear that some smarter strategy is required, the students focus on the question of which two of the three subtrees to merge. Usually the next thing they try is merging the two subtrees with the smallest roots, which fails on the same kinds of counterexamples as before.

Eventually, based on the intuition that merge should run faster on small trees than on big trees, they try merging the two smallest trees. This is always a good learning point about ambiguity, as we realize that there are three different possible meanings of “smallest”: the smallest root according to the ordering on keys, the smallest subtree according to number of nodes, or the smallest subtree according to the height of the subtrees. We've already shot down the first one, but either of the other two will work, so I just go with whichever one the class prefers (usually smallest in number of nodes).

One catch is that to efficiently compare sizes of subtrees (either number of nodes or height), we need to store the sizes at every node. But that's easy to do by adding an extra field, and maintaining that size information is also easy, albeit slightly easier for number of nodes than for height.

Typically, we don't have a lot of time left in class at this point, so I help them with the analysis, but it's easy to show that this design supports merge in O(log N) time. And there was much rejoicing!

I call this data structure maxiphobic heaps because of the way merge avoids the biggest subtree at every step. But I don't tell the students that name until after they've already come up with the design.

Tuesday, May 13, 2008

On Balanced Trees and Car Insurance

I usually don't teach our Data Structures course, but I often give a guest lecture about balanced binary search trees. This comes right after they've learned about plain (unbalanced) binary search trees. Often the previous lesson has ended with an illustration of what happens when values are inserted into the tree in sorted order.

After reviewing that pathological case and how it can easily arise in practice, I dive in and explain how to keep the trees balanced. In my case, I use red-black trees, but the exact choice of balancing scheme isn't critical to my point here. At the end of class, I ask them how much faster they think the balanced trees are than the unbalanced trees. They're shocked when I tell them that, most of the time, the balanced trees are slower than the unbalanced trees. As long as the data arrives in a reasonably random order, the plain binary search trees will actually end up being fairly balanced, without doing any extra work. No mucking around with colors, no rotations—of course the balanced trees are slower!

So why bother?

I tell them that it's like buying car insurance. When you buy car insurance, you hope that you're just throwing your money away. At the end of the year, if you haven't been in any accidents, you're happy to have wasted that money. Why bother? To protect against the uncommon case, to keep one unlucky event from turning into catastrophe. You pay a little bit extra all the time to keep from paying a lot extra every once in a while.

Friday, May 9, 2008

Barriers to Creativity

Last week, I had lunch with some friends of mine, all college-level instructors. I brought up the subject of creativity, which has been much on my mind lately. Our discussion highlighted some of the barriers to creativity in education, especially from the side of the instructor.

Creativity means artsy, doesn't it?

At one point, I asked the chemistry professor in our group about the role of creativity in the first few college-level chemistry courses. She was confused by the question at first, because she thought I meant something like writing poems about chemical reactions. (That reminds me of a scene in Lois McMaster Bujold's A Civil Campaign, in which a biochemist decides to rewrite the abstract to his dissertation in sonnet form. “Can you think of a word to rhyme with glyoxylate?”)

Actually, I meant creativity in the sense of creative problem solving. You know, the kind of thing MacGyver might do with 2 aspirins, a tube of superglue, and a Diet Coke. This misconception that creativity is the exclusive property of the arts is quite common. If we the instructors of techncial courses don't think of what we do as creative, then we are hardly likely to portray our discipline in a way that encourages creativity.

I'm using the word “arts” here to mean traditional arts like painting or poetry or music. I happen to think an elegant algorithm can be quite beautiful and artistic, but I don't expect a layperson to recognize it as art.

Foundations, Foundations, Foundations

A common theme in instructor comments was that students need to learn the foundations and basic skills of the discipline before they can do much that's creative. There's certainly some truth to this. But how long of an apprenticeship can we expect somebody to serve before finally exposing them to the beauty and wonder, the fun of the creative side? Do we save up all the creative parts for a senior-level capstone course (or even later), or can we intersperse skill development with appropriately-scaled opportunities to use those skills creatively?

A favorite movie of many teachers is The Karate Kid, especially the part where Mr. Miyagi teaches Daniel karate by having him do household chores such as waxing cars or painting fences. Eventually Daniel rebels at what he sees as pointless menial labor, and Mr. Miyagi demonstrates that Daniel has been learning valuable defensive moves all along. The point teachers usually draw from this example is that students can learn valuable lessons without realizing that they are learning, but there's another point here as well. What if, instead of confronting Mr. Miyagi, Daniel simply quit coming to his lessons? That's the situation we face as college teachers. If we give students too much skills development up front without any hint of a payoff, they'll simply stop taking our courses and never come back.

There's another danger as well. I saw this among my fellow students in graduate school. Students who had spent their entire undergraduate careers very successfully learning foundations and practicing skills were often lost when they were finally expected to do creative research. It was completely different from what they were used to, and often they found that they didn't like it.

Think of a piano teacher who makes her students practice nothing but scales for four years. By the time she feels her students are finally ready to play real music, she may find that the only students she has left are the ones who would rather play scales than anything else.'s hard!

Incorporating creativity into a college classroom can be hard, especially in large classes. The lecture format is particularly bad at nurturing creativity in the listeners, but a teacher facing several hundred students often has little choice.

Grading also becomes substantially more difficult if you design assignments that allow students room for creativity. Suddenly, there's no longer a single, “approved” solution that can be easily checked for. Instead, you need judgement to evaluate quite different solutions individually. This can be especially difficult in a course where TAs are expected to do most of the grading.

Fortunately, I suppose, I don't have large class sizes and I don't have any TAs, so I don't have those excuses to fall back on.

Friday, May 2, 2008

Beware Pseudo-Arrays

I'm always fascinated by the kinds of programming mistakes that novices make, and what they say about that student's state of mind. I particularly enjoy mistakes that aren't really mistakes, in the sense that the student has actually gotten the code to work, but where the code clearly demostrates some flaw in that student's mental model of programming. For example, see my previous post on Boolean confusion.

Here's another example involving arrays, or rather the lack of them. I was reminded of this example just yesterday when multiple teams tried to use this idea in a local programming contest.

Suppose you are rolling some standard six-sided dice, and keeping track of how many times each number has appeared. The most natural way to do this would be to use an array indexed from 1-6. In a language with zero-based arrays, you might use indices 0-5 and subtract 1 from each die value, or you might use indices 0-6 and simply ignore index 0.

However, many novices faced with this decision will reject arrays altogether and use six separate variables, perhaps named count1, count2, count3, count4, count5, and count6. (Or, if they are really gluttons for punishment, ones, twos, threes, fours, fives, and sixes.) I call these kinds of variables pseudo-arrays.

Why use pseudo-arrays? Because real arrays are scary. Plain variables and if-statements are much friendlier! After all, why write

when you could write
   if (dieValue == 1) {
   else if (dieValue == 2) {
   else if (dieValue == 3) {
   else if (dieValue == 4) {
   else if (dieValue == 5) {
   else {
Oh, and don't bother suggesting a switch-statement — if-statements are good enough, thank you very much.

Why do so many novices do this? In many cases, I think it is because they were first exposed to if-statements, and found them easy, and then later were exposed to loops, and found them difficult. Arrays are usually processed using loops, and are usually introduced while the beginner is still reeling from the trauma of loops. Thus, arrays become tainted and suspect. In such a novice's mind, the approach with variables and if-statements truly is easier, especially because so many of the if-statements can be programmed using cut-and-paste — the favorite editing commands of most novice programmers!

Every once in a while, I see code by somebody who has gotten over their fear of loops, but has not yet become reconciled with arrays. This can lead to gems where the student tries to loop through a pseudo-array, as in

   for (int d = 1; d <= 6; d++) {
      if (d == 1) {
      else if (d == 2) {
      else if (d == 3) {
      else if (d == 4) {
      else if (d == 5) {
      else {
As Dave Barry would say, I am not making this up.

Thursday, April 17, 2008

Games for Programmers: Jotto

The next in an irregular series of board game recommendations for programmers. Of course, you don't have to be a programmer to enjoy these games, but if you are a programmer, then I think there's a pretty good chance that you'll like them.

However, you might not want to play this particular game with your spouse. Here's why...

“End. E-N-D”, she guessed. “One”, I replied.
“Cab. C-A-B”, I guessed. “One”, she replied.

We were driving to a campground in Massachusetts. My wife and I were in the front seats of the minivan; our daughter was in the middle seat, watching a DVD; and our son and his friend were chatting in the back seat.

We had been on the road for several hours, and I was feeling a bit tired. I needed to do something to help keep me alert. “Jotto?”, I suggested to my wife. “Ok.“

“Icy. I-C-Y”, she guessed. “Zero”, I replied.
“Fed. F-E-D”, I guessed. “One”, she replied.

Jotto is a two-player deduction game, kind of like Mastermind but with words. In fairness, I should say that Mastermind is like Jotto but with colors, since Jotto is older than Mastermind.

Each player secretly chooses a five-letter word with no duplicate letters. The usual restrictions apply—no proper nouns, etc.

The players take turns guessing each other's words. For each guess, you tell the player how many letters were correct. Unlike Mastermind, it doesn't matter whether the letter is in the right position or not. For example, if you guessed BOARD and got one letter right, the answer might be PLAYS (the A matches, in the same place) but it could just as easily be GAMES (the A matches, not in the right place).

“Map. M-A-P”, she guessed. “One”, I replied.
“Jig. J-I-G”, I guessed. “Zero”, she replied.

Jotto is a favorite game for situations where it would not be feasible to set up a board or deal out cards. For example, you can easily play it in a restaurant, or waiting in line, or on a long drive. All you need is two pencils and two pieces of paper.

“Age. A-G-E”, she guessed. “Two”, I replied.
“Out. O-U-T”, I guessed. “Zero”, she replied.

In fact, you don't even need the pencil and paper! Most of the time, we play it in our heads with three-letter words instead of five. This is a great variation that allows the driver to play in the car. It also lets you play in situations where you are moving too much to be able to take notes comfortably, such as jogging or hiking.

For this game, I picked AXE as my word.

“Axe. A-X-E”, she guessed. “Got it. Dang that was fast!”, I replied.
“Men. M-E-N”, I guessed. “One”, she replied.

I started playing Jotto this way with my son when he was 10. We still enjoy it and play it often, but it has become kind of a running joke, because we almost always finish within one guess of each other. This is the first time I have played mental Jotto with my wife, and she's killing me! She got it in five guesses, and I'm not even close!

Ok, with ones for both FED and MEN, I guess that it probably has an E. Next, I want to figure out which letter from CAB was correct. I'll try the A, and I'll also throw one of the other letters from FED to help confirm that it really was the E.

“Far. F-A-R”, I guessed. “One”, she replied.

Good. It probably really was the A. What has an E and an A?

“Sea. S-E-A”, I guessed. “Two”, she replied.

Now I'm getting somewhere. It looks like the E-A is correct. What else has an E-A?

“Pea. P-E-A”, I guessed. “Two”, she replied.

Hmm. I'm nearly certain now that the E-A are correct. What else has an E-A? It can't be TEA because I already got a zero for OUT. What if it's A-E instead of E-A?

Wait a minute. No way. No, she couldn't possibly have picked...

“Axe. A-X-E”, I guessed. “Got it”, she laughed.

Oh, man. We really have been married too long.

Originally written as a session report for BoardGameGeek.

Monday, April 14, 2008


I recently stumbled across the TED video by Ken Robinson (via Joshua Gans). This video is almost two years old, so I'm getting to it rather late.

Robinson describes the increasing importance of creativity in the modern world, and bashes the education system for “educating people out of their creativity”. He contends that “creativity now is as important in education as literacy and we should treat it with the same status”. Robinson is an entertaining speaker, and I quite enjoyed his video. However, as I thought about it later, I become more dissatisfied.

I agree that creativity is important, and that the education system is doing a terrible job nurturing creativity. My problem with his talk is that he equates creativity with the arts: drawing, drama, dance, etc. I would hate to see some school administrator be inspired by the talk and attempt to “solve” the problem by simply adding an extra art class to the curriculum.

I'm not anti-arts. I'm not saying that art classes are bad. I am saying that giving students one art class in which to be creative, while systematically sucking the creativity out of them in every other class, is not going to solve the problem.

Instead, we need to recognize that other disciplines outside the arts also require creativity, and teach them that way. For example, mathematics is often taught in a way that brooks no creativity whatsoever, whereas the practice of actually doing real mathematics is highly creative. See Paul Lockhart's essay (pdf) for a wonderful discussion about creativity in mathematics education.

I think people often view technical subjects as having no place for creativity. You don't want a student getting creative when multiplying two numbers. I certainly don't want my programming students getting creative about indentation. In fact, in many technical subjects, the very word creative is often used as a synonym for wrong or mistaken (or sometimes unethical, as in “creative accounting”).

For example, I just said that you don't want a student getting creative when multiplying two numbers. But that's only because, in that context, “getting creative” would usually be interpreted as getting the wrong answer. In reality, I'd be thrilled if my grade-schooler, when faced with a multiplication problem like 37x999, rejected long multiplication and instead said, “Hmm. 999 is almost 1000. 37x1000 is 37000. Then take away 37 to get 36963.” That's creativity!

Perhaps the biggest difference between creativity in the arts and in technical subjects has to do with constraints. In the arts, you tend to have fewer constraints, and those constraints are often flexible. In technical subjects, you tend to have more constraints, and those constraints are often non-negotiable. Think of the scene from Apollo 13: “We've got to find a way to make into the hole for this...using nothing but that.” Or the t-shirt about the speed of light that says “300,000 kilometers per second: It's not just a good idea, IT'S THE LAW!”

Common perception is that, the more constraints you have, the less creativity is involved. I don't think that's true. In fact, as Marissa Mayer of Google fame says, creativity loves constraints. Of course, sometimes the creativity lies in figuring out which constraints are real, and which aren't.

Tuesday, April 8, 2008


Last week, I wrote about one of my more spectacular failures, which involved a progression that began 1, 2, 4, 8, 16, 32, but then rather than continuing sensibly as 64, 128, ..., instead exploded into 273, 65569, 4294967361, and 1.5x1082. The next term is a little over 265569 and the term after that a little over 24294967361, which has about a billion digits when written out in full.

Joshua Zucker suggested that I submit this to the On-Line Encyclopedia of Integer Sequences, which I did. So I've now been immortalized in Sequence A137181. I can see my tombstone now: “He failed...BIG.”

Several years ago I was giving a guest lecture in a colleague's Algorithms class, when I encountered immortality of a different sort. The students were asking me about how I invented “Okasaki trees”, and were dumbfounded when I didn't know what they were talking about. How could I not know about Okasaki trees when I invented them? “Well,” I told them, “I've invented a lot of different data structures, and most of them are trees of one kind or another. But I've never called any of them Okasaki trees.” After a bit more digging, it turned out that my colleague (who wasn't there that day) had been telling them the previous week about my simplification of red-black trees, and he called them Okasaki trees.

The lasting result of this is that I now have a friend at work who ribs me about Okasaki trees every chance she gets.

Friday, April 4, 2008

A Spectacular Failure

About five years ago, a mathematical puzzle occurred to me. I eventually solved the puzzle, but I don't want to talk about that solution here. Instead, I want to talk about my first attempt at solving the puzzle, which was an utter failure. A glorious, spectacular failure. Perhaps the single most impressive failure of my career. Failures are often much more interesting than successes, but for some unfathomable reason, people are often reluctant to discuss them.

The Puzzle and My “Solution”

You don't need to understand the details of the puzzle to enjoy the failure, so if you're not familiar with SK combinators and their ilk, or if you just can't be bothered, then please skip ahead to the next section.

I was playing around with SK combinators one day when a puzzle occurred to me. Could I introduce new combinators that would allow me to linearize any combinator expression? Combinator expressions are normally binary trees, and are written using parentheses to indicate the tree structure. The convention is that parentheses on the left can be omitted. For example, ((w x) y) z can be written as w x y z, but w (x (y z)) cannot be written without parentheses. I wondered if I could introduce new combinators that would allow me to rewrite any expression into an equivalent expression with no parentheses. I called such a transformation “flattening”.

One of the standard combinators in the SK family is B, which has the rewrite rule

B f g x = f (g x)
Reading this rule backwards works well for flattening small expressions—for example, x (y z) can be rewritten without parentheses as B x y z—but I ran into problems using it to flatten larger expressions, so I quickly turned to a variation of B that swaps the first two arguments
R f g x = g (f x)
This new combinator works well for flattening expressions of the form c (c1 c2 ... cN). For example,
   c (c1 c2 ... c7)
       = R (c1 c2 ... c6) c c7
       = R (c1 c2 ... c5) R c6 c c7
       = R (c1 c2 ... c4) R c5 R c6 c c7
       = R c1 R c2 R c3 R c4 R c5 R c6 c c7

Now, another combinator F works in conjunction with R to flatten more complicated expressions. F is defined as

F f g x y = f x (g y)
For example, we can now rewrite c1 c2 c3 (d1 d2 d3) as follows:
   c1 c2 c3 (d1 d2 d3)
      = F (c1 c2) (d1 d2) c3 d3
      = R c1 F c2 (d1 d2) c3 d3
      = F (R c1 F) d1 c2 d2 c3 d3
      = R (R c1) F F d1 c2 d2 c3 d3
      = R R R c1 F F d1 c2 d2 c3 d3

The complete rules for flattening are

  • flatten(c) = c
  • flatten(e c) = flatten(e) c
  • flatten(c1 (e c2)) = flatten(R e c1 c2)
  • flatten(e1 c1 (e2 c2)) = flatten(F e1 e2 c1 c2)
  • flatten(e1 e2) = flatten(flatten(e1) flatten(e2))
where each c is a simple symbol and each e is an arbitrary expression.

Not So Fast

I run my solution by hand on a few examples and it works great. So I code it up, run it on my examples and get the same answers. Yippee! Next, I run it on a larger example and.........nothing. The computer justs sits there, mocking me. Obviously, I've got a bug somewhere that is causing infinite recursion.

I stare at my program for awhile, but I'm not seeing any bugs. Hmmm. I run it on a different example and it finishes this time. I print out the result and...whoa, that's a lot of output! I wonder how big the transformed expressions are? Ok, calculating the size of the result shouldn't be too hard. I whip up another program and print out a table of the maximum output for an input of size N.

   N=1     max output=1
     2                2
     3                4
     4                8
     5                16
     6                32
Quick! Guess the next value in the table! 64, right? RIGHT?!
     7                273
Huh? 273? What the heck is that?
     8                65,569
65,569?! You're joking, right?
     9                4,294,967,361
4 billion? Well, that's just silly.
     10      15,177,100,720,513,508,366,

What do you even say to a number like that? That's 1.5x1082. To put that in perspective, the number of atoms in the observable universe is estimated to be about 1080. No wonder my program was running slow! At a few billion operations a second, that calculation should only take the 3...dang, where's Art Benjamin when you need him?...let's just say a long time.

If you're curious, here's the function that describes the size of the biggest possible output for an input of size N:

  • Biggest(N) = 2^(N-1), if N <= 6
  • Biggest(N) = 2^Biggest(N-3) + 2*Biggest(N-3) + 1, if N > 6
And here's the input of size 10 that gives the absurd result above:
X (X X) (X (X X) (X (X (X X))))

Once I recovered from the shock of this experience, I did eventually come up with a very nice solution to the puzzle, based on ideas from postfix languages like Forth and Postscript, but nothing quite compares with getting an answer like 1.5x1082.

Friday, March 28, 2008

Wrong Number

Is it time to overhaul the user interface of the phone system?

A few months ago we started getting a lot of calls that were wrong numbers. Before that we used to get maybe one a week, but suddenly we were getting 5 or 6 a day. After an initial period of frustration, we eventually started to apply our debugging skills.

Our first guess was that somebody was distributing our phone number on a flyer or something, either as a prank or by accident. But this didn't make sense because every caller was asking for somebody different. “Hello, is Juan there?” “Hi, Elizabeth/Gary/Tanya/Carlos?” Or the ever popular “<pause> Who is this?!!” No, dude, you called me, so you tell first.

Ok, scratch that idea. Hmm, I suppose we could always ask. “Sorry, wrong number, what number are you trying to reach?” At last, some progress. Our phone number is 987-654-3210 (not our real number). The people who were calling us were uniformly dialing 654-321-0xxx. The 654 area code is for an area maybe 20 miles away, so lots of people in our area code were dialing numbers there. Even worse, there are up to 1000 numbers in that area code that match this pattern. No wonder we were getting a lot of calls!

The reason they were getting us was that they were forgetting to dial a 1 first. Without the 1, the phone system was interpreting this as a local call within the 987 area code to 654-3210, with the extra three digits on the end simply being ignored. On many (most?) cell phones the 1 is not necessary, so people lose the habit of dialing it. Then when they use a land line, where the 1 is required, they get us.

With this insight, we contacted the phone company. It took three tries and many patient explanations to even get somebody to understand what the problem was. I asked what had changed in the system recently, because clearly something had changed to cause the sudden increase in wrong numbers.

“Uh, I don't know. I'll ask our tech guy.” A few minutes later. “Our tech guy says the 321 exchange in the 654 area code is a Cingular exchange. You should call them.” “But how is that supposed to help? The phone calls are originating in your system and never getting out of your system, so even if they made a change over there, it wouldn't change what's happening here.” “Well, you should call Cingular.”

Predictably, Cingular couldn't help. They couldn't even tell us if the 321 exchange was new because “our computers don't tell us that”.

The really frustrating thing about the problem is that it's so easy to fix. It should be easy for the system to detect when somebody has dialed a 10 digits but is being connected to a 7-digit number. Give the caller a warning and ask them to redial! It's possible this might break some systems with extensions, where the leftover 3 digits indicate the extension. So perhaps disable the warning if there is a pause between the 7th and 8th digits.

Has anyone else had this problem?

Wednesday, March 19, 2008

Program Testing For The Sake Of Learning

A number of years ago, I used to compete in the TopCoder on-line programming contests. As an educator, I was fascinated by the TopCoder environment, less by the contests themselves than by the so-called Practice Rooms. These were archives of problems from old contests, where you could submit solutions and run them against the test data used in the actual contest.

I was amazed by how hard many people would work in the Practice Rooms. I wished I could get my students to work half that hard! At first, I thought these people were motivated by the idea of doing better in the contest. And, indeed, that was probably what got most of them to come to the Practice Rooms in the first place. But then I noticed that many of the people working hard in the Practice Rooms competed only rarely, if at all. Something else was going on.

Eventually, it dawned on me that the secret lay in the ease of running system tests. As teachers, we know that feedback is crucial to learning, and that quick feedback is better than slow feedback. These programmers were getting feedback within a few seconds, while they were still “in the moment”. The experience of this tight, nearly instantaneous feedback loop was so powerful that it kept them coming back for more. The burning question then was could I re-create this experience in the classroom?

Why test?

Of course, program testing has been part of CS education forever. However, there are several different reasons for testing, and which reason or reasons the teacher has in mind will strongly affect how testing plays out in any particular class.

Here are several common reasons for testing:

Improved software quality. Ironically, the number one reason for using testing in the real world is the least important reason to use it in the classroom. Most student programs, especially in early courses, will never be used for real. What matters in these programs is not that students complete the program correctly, but what they learn in the process.

Easier grading. Grading programs can be hard, especially if you have a lot of students. Naturally, many CS educators seek to automate this process as much as possible. Many systems exist that will run student programs against a set of instructor tests and record the results. These results may used as a guide for human grading, or the results may entirely determine the student grade. When testing is performed as part of the grading process, it is common for the actual test cases to be hidden from the students.

To learn about testing. Of course, testing is a vital part of real-world software development. Therefore, it is important to expose students to the testing process so they have some idea of how to do it. However, care must be taken here, especially in early courses. Hand-crafting good test cases is hard work. If students are forced to fight through a heavy-weight testing process on toy problems, including making up their own tests, the lesson they frequently take away is that testing is much more trouble than it's worth.

Testing as design. In recent years, test-first and test-driven development have begun to spread from agile programming into the classroom. Advocates view such testing as part of the design process. By writing tests before writing code, students are forced to consider what the behavior of the program should be, rather than diving in without a clear idea of what the program is supposed to do.

Another way

All of these reasons are important, but none of them quite capture what I was seeing in the TopCoder practice rooms. There, people were becoming addicted to quick, easy-to-run tests as a powerful learning experience. This suggests viewing program testing as a way to improve learning. Not learning about testing, but learning about the content being tested.

In an attempt to re-create the relevant parts of the TopCoder experience in the classroom, I adopted the following process. With most programming assignments, I distribute some testing code and test cases. Students then run these tests while they are developing their programs, and they turn in their test results along with their code.

Note that this is an extremely lightweight process. All I'm doing is adding maybe 20 lines of test-driver code to the code I distribute for each assignment, and I can usually cut-and-paste that code from the previous assignment, with only a few tweaks. I then add maybe 50-100 test cases, either hardwired into the code or in a separate text file. I usually generate perhaps 5 of the tests by hand, and the rest randomly. Usually this process adds about 20-30 minutes to the time it takes me to develop an assignment.

I think CS teachers are often scared away by ideas like this because they imagine a fancy system with a server where students submit their programs, and the server runs the tests and records the results into a database. That all sounds like a lot of work, so teachers put it off until “next year”. In contrast, what I'm suggesting could probably be added without much hassle to an assignment that is going out tomorrow.

Similarly, teachers are sometimes scared away by the need to come up with the test suite, thinking it will take too much time to come up with a set of high-quality tests. But that's not what I'm doing. Experience has shown that large numbers of random tests work essentially as well as smaller numbers of hand-crafted tests, but at a fraction of the cost. (See, for example, the wonderful QuickCheck testing tool.) Besides, I'm not trying to guarantee that my test suite will catch all possible bugs, which is a hopeless task anyway. Instead, I'm providing a low-cost test suite that is good enough to gain most of the benefits I'm looking for.


And just what are those benefits? Well, since I started using this kind of testing in my homeworks, the general quality of submissions has gone up substantially. Partly, this is because of a drastic decrease in the number of submissions that are complete nonsense. When I did not require tests, students would write code that seemed correct to them at the time, and turn it in without testing it. Now, they can still turn in complete nonsense, but at least they'll know that they are doing so. Usually, however, when they see that they're failing every test, they'll go back and at least try to fix their code. In addition, many students who would have gotten the problem almost right without testing now get it completely right with testing.

Which leads me to the second benefit—there has been a noticeable improvement in the students' debugging skills. As one student put it, “[The testers] also taught us how to troubleshoot our problems. Without those test cases, many of us would have turned in products that were not even close to complete. This would have meant for worse grades and harder grading for the instructor. We also would not have been able to use and develop our troubleshooting skills. When we don't know that something is wrong, it is very hard to try to test for the failures.

However, the main benefit is that the students are learning more. They are getting needed feedback while they are still “in the moment”, and can immediately apply that feedback, which is all great for learning. Contrast this, for example, to feedback that is received after an assignment has been turned in. Often such feedback comes days or weeks later. Students often don't even look at such feedback, but even if they do, they have often lost the context that would allow them to incorporate the feedback into their learning. Even with an automated system that gives feedback instantly, if students do not have an opportunity to resubmit, then they will usually not bother to apply the feedback, and so will miss out on a substantial portion of the benefit to their learning.

As another student put it, “The automated testers are much more than just a means of checking the block to ensure the program works. They function almost as an instructor themselves, correcting the student when he or she makes a mistake, and reaffirming the student's success when he or she succeeds. Late at night, several hours before the assignment is due, this pseudo-instructor is a welcome substitute.


I have the students turn in the results of their testing, which makes grading significantly easier. Failed tests give me clues about where to look in their programs, while passed tests tell me that I can focus more on issues such as style or efficiency than on correctness.

But this is merely a fringe benefit. Note that I only use the test results to help guide my grading, not to determine grades automatically. I worry that when grading becomes the primary goal of testing, it can interfere with the learning that is my primary goal. For example, testing when used for grading often breaks the tight feedback loop that is where my students learn the most.

Also, when testing is used for grading, the instructor's test suite is often kept hidden from the students. In contrast, I can make my test suites public, which saves me all kinds of headaches. I'm not worried that a student might hardcode all the test cases into their code just to pass all the tests, because I'll see that when I look at their code. (In fact, this happens about once a year.) Students like having the test cases public because, if they don't understand something in the problem statement, they can often answer their own questions by looking at the test cases.

Admittedly, sometimes students take this too far. I occasionally catch them poring over the test cases like trying to read tea leaves, instead of coming to ask me a 30-second question.


I am not advocating that this approach to testing should be used everywhere in the curriculum. Among other things, I agree that students should have the experience of creating their own test cases. The question is when.

I see my approach as being most useful in beginning programming courses, and in courses that are not nominally about programming. For example, it works particularly well in Algorithms.

In other programming courses, however, I believe that a crawl-walk-run approach is appropriate. The “crawl” step is about attitude rather than skills; it is to convince students that testing is valuable. I believe my approach does this, especially if you occasionally leave out the tests and explicitly ask students to compare the experiences of developing with or without tests.

The “walk” step might be to have students generate their own tests, but using instructor-supplied scaffolding. The “run” step might be to have students generate both the tests and the scaffolding. I admit, however, that I have not taught the courses in which those steps would be most appopriate.

Tuesday, March 11, 2008

Games for Programmers: Schotten Totten/Battle Line

The next in an irregular series of board game (or, in this case, card game) recommendations for programmers. Of course, you don't have to be a programmer to enjoy these games, but if you are a programmer, then I think there's a pretty good chance that you'll like them.

In programming, the specifications you thought you were implementing are practically guaranteed to change by the time you're done. Unless you have the power to “just say no” to the changes, the best approach is often to stay flexible by delaying commitments for as long as possible. You try to keep your options open by putting off irreversible decisions to the last possble moment. Schotten Totten (by noted game designer Reiner Knizia) elegantly captures this feeling in a short, two-player card game.

box cover

Schotten Totten is nominally about two Scottish clans squabbling over territory. Nine boundary stones are placed between you and your opponent, in a line from left to right. You win by being the first to claim either three adjacent stones or five stones altogether. To claim a stone, you must play a better three-card group on your side of the stone than your opponent does.

game in progress
(photo by Scott Alden)

There are a total of 54 cards, numbered 1-9 in six different colors. Three-card groups are ranked as follows, from best to worst: straight flush (three consecutive numbers of the same color), three-of-a-kind (three of the same number), flush (three of the same color), straight (three consecutive numbers), “wild rabble” (anything else). If both players play the same kind of group, then each player adds their cards together and the player with the higher sum wins. If there is still a tie, the player who completed their group first wins.

You start the game with six cards. Each turn you play a single card on your side of one of the stones, and then draw a replacement card. (For beginners, the play-then-draw order can be a little confusing because most players are more accustomed to draw-then-play.) Towards the end of the game, if there are no more cards to draw, then you simply skip that step.

Once you have played a card on a stone, you're stuck with it. There is no way to take it back or replace it or cover it up. Most of the time you will not have all the cards that you want when you start playing a particular three-card group. The tension in the game comes from trying to avoid commiting to a particular kind of group for as long as possible. For example, suppose you've played the blue 6 on a stone, and now you have the blue 5 and the red 6 in your hand. If you play the blue 5 next, then you are commiting to trying for a straight flush. If you play the red 6 next, then you are commiting to trying for three-of-a-kind. The straight flush is stronger, but there are only two other cards in the deck that will complete the straight flush (the blue 4 or the blue 7). On the other hand, there are four other 6's in the deck, so you have a better chance of completing the three-of-a-kind. What you will probably do is delay playing on this stone by playing somewhere else, but that just means making a commitment there instead of here. If you do play on this stone, be prepared for the frustration of drawing exactly the card you needed for the group that you just ruled out!

(Technically, if you play the blue 5 on the blue 6, you are not commiting to trying for a straight flush. Any other blue will complete a flush, and any 4 or 7 will complete a straight. However, you almost never want a plain flush or a plain straight until close to the end of the game, because they are just too easy to beat.)

Each stone is awarded when both three-card groups are complete. However, there is a twist here that may appeal to programmers, who tend to excel at logical thinking. You can also claim a stone early if you can prove that your opponent can't beat your hand. For example, suppose you have three 7's on your side of a stone, and your opponent has the green 1 and the green 2 on his side. If the green 3 has already been played somewhere else, then there is no way your opponent can complete the straight flush. The best he could do would be a plain flush, which would lose to your 7's. Therefore, you can claim the stone without waiting for him to complete his group.

Note that the proof must be accomplished with cards that have already been played. In the above example, if you were holding the green 3, you could not claim the stone until you played it. Sometimes this means that you will play a card in a place where you would prefer not to, just to be able to claim a stone elsewhere.

Claiming stones early is important for two reasons. First, there are situations where both players can fulfill the victory conditions, so it is just a matter of who claims the stones first. For example, both players might be able to get three adjacent stones, or one might be able to get three adjacent stones while the other can get five stones altogether.

Second, and getting back to the theme of this recommendation, claiming a stone early helps you keep the pressure on your opponent. Once a stone has been claimed, neither player can play cards on that stone anymore, even if they have room to do so. You never want to give your opponent an easy place to dump a card. Not only does that let them rid themselves of a useless card—when you can only hold six cards at a time, freeing up an otherwise useless slot is huge—but more importantly it lets them avoid making a commitment someplace else.

How to get it

Schotten Totten can be hard to find in stores. Fortunately, its sister game, Battle Line is much more widely available. Battle Line is almost the same game, but with different artwork, with cards numbered 1-10 instead of 1-9, and with 7 cards per hand instead of 6. Also, Battle Line adds a separate mini-deck of so-called tactics cards, which let you change the rules in various ways. For example, one tactics card forces you to play 4 cards on each side of a particular stone instead of 3. (Actually, newer editions of Schotten Totten include the tactics cards as well.)

Personally, I like Schotten Totten better. I prefer the smaller, more claustrophobic scale. Also, I prefer the game without the tactices cards, which make the game too chaotic. Fortunately, if you get Battle Line you can experiment with or without the tactics cards, with our without the 10's, and with 7-card or 6-card hands, to see which way you like it better.

A good place to look for both games is, which summarizes prices and availability at a host of on-line game retailers.

If you want to try the game out before buying it, there are excellent implementations of Schotten Totten at both flexgames and (The latter site defaults to German but can be set to English by clicking on the flag.)

Thursday, March 6, 2008

Get the job done, but what is the job?

One of my most memorable conversations with a student happened a few years ago. I was grading a programming project, and something in one of the project documents struck me. At our school, students are required to document help from other students in some detail. One pair described getting help from another student, call him K. Apparently, K was telling them how to do something, but they weren't understanding him fast enough. He got impatient, grabbed the keyboard, and starting typing in the code for them.

I was flabbergasted by this description. How could K have possibly believed that this was appropriate? I knew the class that he was just about to get out of, so I met him there and took him into an empty classroom.

For at least five minutes, we talked completely past each other. He didn't understand why I was upset, and I didn't understand why he seemed to believe he had done a genuinely good thing. Obviously, there were some unspoken assumptions on both sides that were preventing us from communicating.

Eventually, I realized what was going on. He had been taught that the bottom line was getting the job done. That was the most important thing, and he believed that what we teachers wanted from students was for them to get the job done. Helping his buddies get the job done was, in his eyes, no less than the responsible thing to have done, a positive act of virtue.

Aha! With this realization, and with his phrase of “getting the job done”, I finally had the wedge I needed to get through to him.

“What was the job on this project?”, I asked him.

“To complete a working program”, he replied.

“No, it wasn't”, I said. Utter confusion covered his face. “Think about it. I already have a solution to this project. Why would I need a dozen student implementations? I'm pretty sure that mine is going to be less buggy, faster, and better documented than any of yours.”

“But...” The first signs of doubt.

“So, if I don't really care about your implementations, then why did I have you all do this project? What do I care about? What was the job?

K frowned in concentration. It's never an easy thing when somebody tries to dismantle your basic assumptions. To his credit, he didn't just shut down, but actively struggled to understand what I was saying. He started to speak and stopped a few times. He was almost there, but couldn't quite articulate it.

“Learning”, I finally said. “The job was learning. The implementation was only a way to trigger that learning. But the implementation doesn't really matter, it's the learning that I care about.”

He got it. He understood why, from my point of view, typing in code for the other students was not helping them get the job done, but instead was actively hindering the learning that was the real job.

As we wound down, he mentioned a different teacher he had had the previous two semesters. The other teacher had a grading policy where you get credit for an assignment only when it is completely correct. If your assignment has flaws, then you redo it until it's correct, losing points with each resubmission. For somebody like K, who strongly believes in getting the job done, this policy just reinforces the idea that finishing the program is the job. I know this wasn't the other teacher's intent, but I can certainly see how it could be taken that way.