Efficiency of Pairing heap

Last time, I introduced the Pairing heap, and showed an example of its structure and how it changes as items are added and removed. At first glance it seems unlikely that this can be faster than the binary heap I spent so much time exploring three years ago. But it can be. Let me show you how.

As I mentioned in the discussion of the binary heap, inserting an item takes time proportional to the base-2 logarithm of the heap size. If there are 100 items in the heap, inserting a new item will take, worst case, O(log2(100)) operations. Or, about 7 swaps. It could be fewer, but it won’t be more. In addition, removing the minimum item in a min-heap will require O(log2(n)) operations. In a binary heap, insertion and removal are O(log n).

As you saw in my introduction to the Pairing heap, inserting an item is an O(1) operation. It never involves more than a comparison and adding an item to a list. Regardless of how many items are already in the heap, adding a new item will take the same amount of time.

Removal, though, is a different story altogether. Removing the smallest item from a Pairing heap can take a very long time, or almost no time at all.

Consider the Pairing heap from the previous example, after we’ve inserted all 10 items:

    0
    |
    6,4,3,5,8,9,7,2,1

It should be obvious that removing the smallest item (0) and re-adjusting the heap will take O(n) operations. After all, we have to look at every item during the pair-and-combine pass, before ending up with:

    1
    |
    2, 8, 3, 4
    |  |  |  |
    7  9  5  6

But we only look at n/2 items the next time we remove the smallest item. That is, we only have to look at 2, 8, 3, and 4 during the pair-and-combine pass. The number of items we look at during successive removals is cut in half with each removal, until we get to two or three items per removal. Things fluctuate a bit, and it’s interesting to write code that displays the heap’s structure after every removal.

By analyzing the mix of operations required in a very large number of different scenarios, researchers have determined that the amortized complexity of removal in a Pairing heap is O(log n). So, asymptotically, removal from a Pairing heap is the same complexity as removal from a binary heap.

It’s rather difficult to get a thorough analysis of the Pairing heap’s performance characteristics, and such a thing is way beyond the scope of this article. If you’re interested, I suggest you start with Towards a Final Analysis of Pairing Heaps.

All other things being equal, Pairing heap should be faster than binary heap, simply because Pairing heap is O(1) on insert, and binary heap is O(log n) on insert. Both are O(log n) on removal. Keep in mind, though, that an asymptotically faster algorithm isn’t always faster in the real world. I made that point some time back in my post, When theory meets practice. On average, Pairing heap does fewer operations than does binary heap when inserting an item, but Pairing heap’s individual operations are somewhat more complex than those performed by binary heap.

You also have to take into account that Pairing heap can do some things that binary heap can’t do, or can’t do well. For example, the merge (or meld) function in Pairing heap is an O(1) operation. It’s simply a matter of inserting the root node of one heap into another. For example, consider these two pairing heaps:

    1                  0
    |                  |
    2, 8               5
    |
    4

To merge the two heaps, we treat the root nodes just as we would two siblings after removing the smallest item. We pair them to create:

    0
    |
    1, 5
    |
    2, 8
    |
    4

With binary heap, we’d have to allocate a new array that’s the size of both heaps combined, copy all items from both heaps to it, and then call the Heapify method to rebuild the heap. That takes time proportional to the combined size of both heaps.

And, as I mentioned previously, although changing the priority of an item in a binary heap isn’t expensive (it’s an O(log n) operation), finding the node to change is an O(n) operation. With Pairing heap, it’s possible to maintain a node reference, and it’s been shown (see the article linked above) that changing the priority of a node in a Pairing heap is O(log log n).

Another difference between Pairing heap and binary heap is the way they consume memory. A binary heap is typically implemented in a single contiguous array. So if you want a heap with two billion integers in it, you have to allocate a single array that’s 8 gigabytes in size. In a Pairing heap, each node is an individual allocation, and each node contains two pointers (object references). A Pairing heap of n nodes will occupy more total memory than a binary heap of the same size, but the memory doesn’t have to be contiguous. As a result, you might be able to create a larger pairing heap than you can a binary heap. In .NET, for example, you can’t create an array that has more than 2^31 (two billion and change) entries. A Pairing heap’s size, however, is limited only by the available memory.