Thursday, October 22, 2009

Binomial queues as a nested type

Warning: this is probably only of interest to functional programmers.

Maciej Kotowicz recently asked on the Haskell Cafe mailing list about implementing binomial queues (also known as binomial heaps) using fancy type machinery so that the type-checker can enforce the shape invariants of the data structure. This reminded me of a discussion I had with some colleagues in the summer of 1998 about using nested types to enforce these kinds of invariants. A little digging around in mail archives yielded this email, presented here with some light formatting and a few fixed typos.

I've been playing with binomial queues as a nested type, and I think you'll find the end result interesting.

REVIEW

Let me first quickly review the usual implementation of binomial queues. Recall that a binomial tree has the shape

data Tree a = Node a [Tree a]
A binomial tree of rank k has k children, of ranks k-1 .. 0 (stored in the list in decreasing order of rank). Note that we can combine two heap-ordered binomial trees of rank k to get a heap-ordered binomial tree of rank (k+1) as follows:
combine a@(Node x xs) b@(Node y ys)
   | x <= y    = Node x (b : xs)
   | otherwise = Node y (a : ys)
Now a binomial queue is a list of heap-ordered binomial trees of increasing height (but not necessarily of consecutive ranks). We'll represent the ranks of those trees that are present by their position in the tree. Thus, some ranks will be empty. This could be implemented as
type Binom a = [Maybe (Tree a)]
or somewhat more efficiently as
data Binom a = Nil
             | Zero (Binom a)
             | One (Tree a) (Binom a)
or better still by unboxing the Tree in the One constructor
data Binom a = Nil
             | Zero (Binom a)
             | One a [Tree a] (Binom a)
I won't go over all the operations -- they are amply covered elsewhere. I'll just describe two functions. The first, add, takes a tree and a list (where the tree has the same rank as the first position in the list), and returns a new list. It works basically like the increment function on binary numbers.
add :: Ord a => a -> [Tree a] -> Binom a -> Binom a
add x xs Nil = One x xs Nil
add x xs (Zero h) = One x xs h
add x xs (One y ys h)
  | x <= y    = Zero (add x (Node y ys : xs) h)
  | otherwise = Zero (add y (Node x xs : ys) h)
The merge function is similar and works basically like the addition function on binary numbers.

Finally, the getMin function returns the minimum element of a queue paired with the queue without that element. The helper function getMin_ returns a triple of the minimum element in some suffix of the queue, the list of children associated with that minimum element, and the suffix without that element.

getMin_ :: Ord a => Binom a -> (a, [Tree a], Binom a)
getMin_ (Zero h) = case getMin_ h of
                     (y,ys,h') -> (y,ys,Zero h')
getMin_ (One x xs Nil) = (x,xs,Nil)
getMin_ (One x xs h) = case getMin_ h of
                         (y,ys,h') | x <= y    -> (x,xs,Zero h)
                                   | otherwise -> (y,ys,One x xs h')

getMin :: Ord a => Binom a -> (a, Binom a)
getMin h = let (x,xs,h') = getMin_ h
           in (x,merge (list2binom xs Nil) h')

list2binom [] h = h
list2binom (Node x xs : ys) h = list2binom ys (One x xs h)
Note that when getMin receives a list of children, it converts them to a valid binomial queue by reversing the list (and changing each pair of a Node constructor and a (:) constructor to a One constructor).

NESTED REPRESENTATION -- FIRST ATTEMPT

By following the same approach we have been using to design nested datatypes, we get the following representation of binomial queues.

data Binom_ a b = Nil
                | Zero (Binom_ a (Trees a b))
                | One a b (Binom_ a (Trees a b))

type Trees a b = (a, b, b)

type Binom a = Binom_ a ()
Here the list of children
  (Node x3 xs3 : Node x2 xs2 : Node x1 xs1 : Node x0 xs0 : [])
is being represented by the nested triples
  (xs3,xs3', (x2,xs2', (x1,xs1', (x0,xs0', ()))))
(where xsN' is the appropriate conversion of xsN).

All the functions are perfectly straightforward, except for getMin and getMin_ which must incrementally build up the reversing function to convert

  (xs3,xs3', (x2,xs2', (x1,xs1', (x0,xs0', ()))))
to
  One x0 xs0' (One x1 xs1' (One x2 xs2' (One x3 xs3' Nil)))
As a result of building up this reverse function incrementally, this implementation ends up being roughly 10% slower than the original.

INTERLUDE ON REVERSING LISTS

Suppose we want to support the following ADT. We want sequences with three operations:
empty :: ReversableSeq a
cons  :: a -> ReversableSeq a -> ReversableSeq a
rev   :: ReversableSeq a -> [a]
with the obvious semantics. cons must take O(1) time, but rev may take O(n) time. A list is a reasonable implementation with
type ReversableSeq a = [a]
empty = []
cons = (:)
rev = reverse
but, if you think about it, there's a fair amount of interpretive overhead in the pattern matching that reverse performs at every step. Way back in 1985, John Hughes came up with a representation that avoids this overhead:
type ReversableSeq a = [a] -> [a]
empty = id
cons x xs = xs . (x:)
rev xs = xs []
The result of cons 1 (cons 2 (cons 3 empty)) is
id . (3:) . (2:) . (1:)
which, when applied to [] by the rev function, yields [3,2,1]. One way to think of this representation is as lists with "holes" at the ends. The cons operation fills in this hole with an element and another hole, and the rev operation fills in the hole with []. (This is quite similar to difference lists in logic programming...)

Applying this trick to the regular representation of binomial queues yields

data Binom a = Nil
             | Zero (Binom a)
             | One a (Binom a -> Binom a) (Binom a)
which turns out to be about 10% faster than the original representation.

NESTED REPRESENTATION -- SECOND ATTEMPT

Ok, let's try to make a nested datatype out of this. Conceptually, the nesting should keep track of our position in the queue so that we know what rank the current tree has and can't possibly mix up trees of different ranks. We hypothesize some Base type for the beginning of the list, and some type transformer Succ that modifies the type as we move down the queue.

type Base = ???
type Succ b = ???  -- this might need to be Succ a b

type Binom a = Binom_ a Base

data Binom_ a b = Nil
                | Zero (Binom_ a (Succ b))
                | One a (Binom_ a ??? -> Binom_ a ???) (Binom_ a (Succ b))
Now, what should the missing types in the One constructor be? Well, the function (Binom_ a ??? -> Binom_ a ???) represents the children of the current node. If this node is of rank k, then these children are of ranks 0 .. k-1. The function (Binom_ a ??? -> Binom_ a ???) takes a queue starting at rank k and adds the children to the front of it, yielding a queue starting at rank 0. So the ???'s can be filled in as
                | One a (Binom_ a b -> Binom_ a Base) (Binom_ a (Succ b))
or just
                | One a (Binom_ a b -> Binom a) (Binom_ a (Succ b))
Now, what should the types Base and Succ be? It doesn't matter because we never construct data of those types, we simply use the types to discriminate between positions in the queue. So we can define Base and Succ as
type Base = Void
newtype Succ b = S Void
I think this is a very interesting type for several reasons:
  • It includes an arrow type, which we haven't seen much.
  • The RHS of the datatype definition (in fact, just the One constructor) has occurrences of Binom_ at no fewer than three different types
    Binom_ a b
    Binom_ a Base
    Binom_ a (Succ b)
    
    I've seen two before, but not three.
  • It includes types which are never used, but are merely there to capture certain invariants (in this case, that trees of different ranks should never be confused). [2009 Comment: Today, these are called phantom types.] In some ways this might make this type less interesting, but an interesting consequence is that the code satisifes a certain "type erasure" property: if you erase the types from all the functions, you get code for the optimized regular representation.
Because of this "type erasure" property, you might expect it to be just as fast as the optimized regular representation, but in fact it is roughly 10% slower -- or roughly as fast as the original regular representation. This is because the extra types hinder a certain optimization that the programmer might make (and which I did make in the optimized regular representation).

Recall the type of the getMin_ function from the original regular representation, and the way it was used by getMin.

getMin_ :: Ord a => Binom a -> (a, [Tree a], Binom a)

getMin :: Ord a => Binom a -> (a, Binom a)
getMin h = let (x,xs,h') = getMin_ h
           in (x,merge (list2binom xs Nil) h')
For the optimized regular representation, this becomes
getMin_ :: Ord a => Binom a -> (a, Binom a -> Binom a, Binom a)

getMin :: Ord a => Binom a -> (a, Binom a)
getMin h = let (x,xs,h') = getMin_ h
           in (x,merge (xs Nil) h')
Now, consider the optimized nested representation. We can't just write
getMin_ :: Ord a => Binom_ a b -> (a, Binom_ a b -> Binom a, Binom_ a b)
because we don't know the rank of the tree that the second component of the triple came from. (Note that if we had existential types, we could write
getMin_ :: Ord a => Binom_ a b -> 
                    (a, exists c. Binom_ a c -> Binom a, Binom_ a b)

getMin :: Ord a => Binom a -> (a, Binom a)
getMin h = let (x,xs,h') = getMin_ h
           in (x,merge (xs Nil) h')
Some implementations do in fact support existential types, but in limited ways that would carry efficiency costs of their own...)

Instead of returning the children and then applying them to Nil at the end, we apply the children to Nil first, and then return the result, yielding the type

  
getMin_ :: Ord a => Binom_ a b -> (a, Binom a, Binom_ a b)

getMin :: Ord a => Binom a -> (a, Binom a)
getMin h = let (x,xs,h') = getMin_ h
           in (x,merge xs h')
This depends on laziness to evaluate only the children of the final minimum, and not all the children of the temporary minimums along the way. However, this does mean that we build a lot more thunks of the form (xs Nil) along the way. And building these thunks is apparently quite expensive.

2009 Addendum Ralf Hinze considers a similar representation in Section 6 of Numerical Representations as Higher-Order Nested Datatypes. Today, you would probably use fancier type machinery like dependent types or maybe GADTs, but it's still amazing how far you can get with only nested types.

6 comments:

stick said...

My brain is goo.

So, is there anything (book, website, etc) that you might suggest for someone who hasn't looked at algorithms (or much code, for that matter) in a year and wants to remember all of what they once knew?

Chris Okasaki said...

Stick, you're welcome to come back and sit in on my class any time. :-)

For relearning some algorithms, I would suggest actually writing some instead of just reading about them. Something like Project Euler might be a good start, perhaps eventually moving on to TopCoder or the Facebook programming puzzles. The practice rooms in TopCoder might be a especially helpful because you can work at your own pace, and there's a ton of problems.

David Feuer said...

I'm one of those people who thinks programming these kinds of constraints in the type system is very cool (though I can't say I know how to come up with these data structures, or understand the types as well as I'd like), but on the practical side I don't really see the point. Typically, the types are complicated enough that it seems at least as hard to be sure you've constructed them right as it would be to prove your algorithm's correctness manually. Add to that the complexity of writing functions to manipulate these types and the fact that they seem to tend to give worse performance than non-nested versions, and you have a trifecta of "why bother". Have you any thoughts of whether these approaches are, or likely will be, practical for some applications? [Side note: I think your paper on square matrices was one of the most mind-blowing/expanding ones I've ever read]

Chris Okasaki said...

@David: I agree that they're not particularly practical in most cases, especially today when things like dependent types and theorem provers are much more common. In particular, I wouldn't use nested types like this in production code for something like a priority queue that I expected to hide behind an abstraction barrier. One place where they might be practical (although I can't claim to have ever done this myself) is if you have a central data structure with non-trivial invariants where you expect many people (maybe even users!) to be working with the raw representation, not hidden behind an interface. In such a case, it may be unreasonable to expect users to do anything with dependent types or theorem provers, yet you would still like to generate type errors if they dork up the invariants.

David Feuer said...

Semi-relatedly, I asked a question about the non-nested implementation over at Stack Overflow that no one's answered yet.

David Feuer said...

The pqueue package on Hackage actually uses a nested approach. The author says he was able to use GHC's UNPACK pragma to make the nested representation substantially smaller than the uniform one, and therefore faster. This unpacking is only possible with the nested type, because unpacking requires a single-constructor (tuple-like) type.