A simple heap of integers

I introduced the binary heap data structure in my previous post by proposing a more efficient way to manage a collection of cubes. If you recall, I said that any child could learn to maintain a binary heap and that implementing one in code is almost as easy. This time I’ll prove it by creating a rudimentary binary heap class that will store integers. In a later post I’ll show a complete heap implementation that supports generics.

A binary heap is a binary tree with two additional constraints:

  1. It satisfies the shape property. That is, it’s a complete binary tree. All levels of the tree, except possibly the lowest level, are fully populated. If the last level is not fully filled, the nodes are filled from left to right.
  2. It satisfies the heap property. Any node in the tree is less than or equal to its children. As a result, the node with the minimum value is always at the root of the tree.

Rule 2 is the rule for what we call a Min-heap. You can also create a Max-heap, in which the root is the node with the maximum value and each node’s value is greater than or equal to its children’s. The difference in code between a Min-heap and a Max-heap is minimal: you just change the sense of a few comparison operators. I’m going to focus on the Min-heap here, with the understanding that everything applies equally to the Max-heap. In a later post I’ll show how to use the same code to implement both.

I’m making a change to the way I show the heaps. In the last post I just showed the items arranged in a pyramid. I did that because I didn’t want to confuse the issue with a technical drawing. From now on I’m going to use graphs generated by Graphviz. It’s much easier on me, and more of what you’d expect from an article about algorithms. The last heap shown in my previous post, then, looks like this.

    [Graphviz image: a tree with 3 at the root; 11 and 5 below it; 38, 23, 41, and 37 on the third level; and 50 alone at the bottom left.]

You can see that this tree satisfies both rules. All levels except the last are filled, and the single node on the last level occupies the leftmost position. In addition, every node is smaller than its children. That is, 3 is smaller than 11 and 5. 11 is smaller than 38 and 23, etc.

A sometimes confusing property of heaps is that, for any given set of more than two items, there are multiple valid heap arrangements. Here are two other valid arrangements of those same eight items.

Both of those are valid heaps because no node is smaller than its parent, and the lowest level is left-filled. If you were to apply the item removal rules repeatedly to each of those heaps, the items would come out in the proper order: 3, 5, 11, 23, 37, 38, 41, 50. If you don’t believe me, sit down with a piece of paper and give it a whirl.

Because each level of the binary heap contains twice as many nodes as the previous level, it’s very easy to represent a binary heap in an array. Consider this image, in which I’ve labeled the heap nodes from 0 to 14.

    [Image: a full four-level binary tree whose nodes are labeled 0 through 14, level by level, left to right.]

Each heap node is labeled with its corresponding index in an array, with the root node at position 0. The children of node 0 are 1 and 2. The children of 1 are 3 and 4. The children of 5 are 11 and 12. Given a node’s index, x, its child node indexes are (2*x)+1 and (2*x)+2.

Similarly, a node’s parent is at trunc((x-1)/2). So the parent of node 12 is 5, and the parent of node 13 is 6.
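This index arithmetic is worth capturing in code. Here’s a minimal sketch of the helpers (the method names are mine; the heap code below just does the arithmetic inline):

    // Index arithmetic for a heap stored in an array or list.
    // Integer division in C# gives us the trunc() behavior for free.
    private static int Parent(int x)     { return (x - 1) / 2; }
    private static int LeftChild(int x)  { return (2 * x) + 1; }
    private static int RightChild(int x) { return (2 * x) + 2; }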

The previous heap shown above can be represented in an array as: [3, 23, 5, 37, 41, 11, 50, 38].

Given that, let’s take a look at how we’d implement a binary heap in C# code. Again, the code we’re developing here is designed to create a Min-heap of integers. We’ll extend it later. For now we want to get the basic idea down. Rather than an array, I’ll use a List<T> as a backing store. That gives us array-like functionality with the addition of automatically resizing when we go off the end. The Insert and Remove methods (and eventually all of the heap methods) operate on this list.

    private List<int> _items = new List<int>();

If you remember the rules from the last post, inserting an item involves placing that item at the end of the heap and bubbling it up through the nodes to its final resting place. Removing the smallest node involves placing the last item in the heap at the newly vacated root position and sifting it down through the heap to its proper position. Those two functions, BubbleUp and SiftDown, are the critical operations in the heap.

The code to insert an item is a direct translation of the rules we outlined last time:

  1. Add the item to the first empty spot in the structure. That is, the lowest, leftmost unoccupied position.
  2. Bubble up. While the item is smaller than its parent, swap the item with its parent. If the node gets to the top of the tree, stop. The root has no parent.

    public void Insert(int newItem)
    {
        // Add item to the end of the list.
        _items.Add(newItem);

        // Get index of the newly added item.
        int ix = _items.Count - 1;

        BubbleUp(ix);
    }

    private void BubbleUp(int ix)
    {
        // While item is smaller than its parent, bubble up.
        while (ix > 0 && _items[ix] < _items[(ix - 1)/2])
        {
            // Swap item with its parent.
            Swap(ix, (ix - 1)/2);

            // Adjust index
            ix = (ix - 1)/2;
        }
    }

    private void Swap(int x, int y)
    {
        int temp = _items[x];
        _items[x] = _items[y];
        _items[y] = temp;
    }

Removing an item is similarly simple. Remember the rules:

  1. Remove the item in the first position. This is the smallest item.
  2. Move the last item in the structure to the first position.
  3. Sift down. While the item is larger than the smallest of its children, swap with the smallest child. If the item has no children, stop.

The code is a direct translation:

    public int Remove()
    {
        // Can't remove an item that isn't there.
        if (_items.Count == 0)
        {
            throw new InvalidOperationException("The heap is empty!");
        }

        // Get the first item.
        int smallest = _items[0];

        // Copy the last item to the first position.
        _items[0] = _items[_items.Count - 1];

        // Remove the last item
        _items.RemoveAt(_items.Count - 1);

        if (_items.Count > 0)
        {
            SiftDown(0);
        }
        return smallest;
    }

    private void SiftDown(int ix)
    {
        // We can stop when ix is past the halfway point
        // because all items beyond this point have no children.
        while (ix < _items.Count / 2)
        {
            // find the smallest child
            int ixChild = (ix * 2) + 1;
            if (ixChild < _items.Count - 1 && _items[ixChild] > _items[ixChild + 1])
            {
                ixChild++;
            }

            // If the item is <= the smallest child, we're done.
            if (_items[ix] <= _items[ixChild])
            {
                break;
            }

            // Swap the item with the smallest child.
            Swap(ix, ixChild);

            // And adjust the index.
            ix = ixChild;
        }
    }

With the addition of a Count property so that you can tell how many items are in the heap, and a Peek method that lets you examine the smallest item, you have a minimally functional binary heap class. Granted, it only works for integers, but you have to start somewhere.
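Those two members are one-liners. Here’s a sketch of one possible implementation, consistent with the code above:

    public int Count
    {
        get { return _items.Count; }
    }

    public int Peek()
    {
        // Can't peek at an item that isn't there.
        if (_items.Count == 0)
        {
            throw new InvalidOperationException("The heap is empty!");
        }
        // The smallest item is always at the root.
        return _items[0];
    }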

There is one other function that a heap really should support: quickly initializing a heap from a collection. Typically that’s done with a constructor overload, like this:

    public BinaryMinHeap(IEnumerable<int> collection)
    {
    }

The simple-minded way to initialize a heap with a collection is to call Insert for each item in the collection. So, for example, the code inside the constructor above might be:

    foreach (int i in collection)
    {
        Insert(i);
    }

That will work, but it’s expensive. Every item that’s added is bubbled up from the bottom. Imagine starting with an empty heap and adding 127 items in reverse order (i.e. 127, 126, 125, 124, etc.). Each new item is smaller than all the other items, so every item will require the maximum number of swaps to bubble up from the last position to the top. Think of it this way. The first item that’s added makes zero swaps. The next two items make one swap each. The next four items make two swaps each. Eight items make three swaps. 16 items make four swaps. 32 items make five swaps, and 64 items make six swaps. It works out to:

    0 + 2*1 + 4*2 + 8*3 + 16*4 + 32*5 + 64*6
    0 + 2 + 8 + 24 + 64 + 160 + 384 = 642 swaps

There is a better way. It involves first copying the array to the _items list. Then, starting in the middle of the list and working backwards to the head of the list, sift items down.

We can start at the midpoint because in any heap, half of the nodes are leaves: they have no children. If a node has no children then it makes no sense to try sifting it down. So right from the start, 64 of the items make no swaps at all. Remember, we’re working backwards, so the next 32 items make one swap each. 16 items make two swaps, eight make three, four make four, two make five, and one makes six. The progression looks like this:

    64*0 + 32*1 + 16*2 + 8*3 + 4*4 + 2*5 + 1*6
    0 + 32 + 32 + 24 + 16 + 10 + 6 = 120 swaps

Note that those are both worst case scenarios. And, yes, the operation works for any arrangement of items.

The resulting heap is not the same as if you had called Insert for every item added, but it’s still a valid heap.

Creating a heap by inserting n items will take time proportional to n*log2(n). Creating a heap by copying all the items into the list and rearranging as described above takes time proportional to n. You can see from the calculations above that the difference is large even with just 128 items. If you’re initializing a heap of one million items, inserting items individually will do on the order of 20 million swaps. Building the heap the other way will do approximately one million swaps; it’s 20 times as fast.

The operation that turns a random list into a heap is typically called Heapify. It’s incredibly simple:

    private void Heapify()
    {
        for (int i = _items.Count/2; i >= 0; --i)
        {
            SiftDown(i);
        }
    }

The constructor, then, creates a new list from the passed collection and then calls Heapify, like this:

    public BinaryMinHeap(IEnumerable<int> collection)
    {
        _items = new List<int>(collection);
        Heapify();
    }

That’s it for now. Next time I’ll show a few examples of how to use the binary heap.

A better way to do it: the heap

Last time, I used a hypothetical work assignment to introduce the concept of a priority queue. Let’s continue with that for a little longer.

You had been working at your new job for a few hours when your friend the computer programmer came by. Seeing you struggling, he asked you to describe what you were doing. He thought for a minute and then proposed a new solution. Rather than laying the cubes out in a line, he suggested that you build a pyramid, like this:

              03
          04      05
        11  23  37  41
      50

All he did was take the sorted list and turn it into a pyramid with the smallest cube at the top and the other cubes filling the levels from left to right, in order. Then, he gave you some simple rules to ensure that, when you remove or insert a cube, the smallest cube ends up at the top.

Here are the rules for removing the smallest cube.

  1. Pull the item from the top of the pyramid. That leaves you an empty space, represented by XX.

              XX
          04      05
        11  23  37  41
      50
  2. Take the last item in the pyramid (the right-most cube on the lowest level) and place it in the top spot, giving you:

              50
          04      05
        11  23  37  41
  3. While the cube is larger than the smaller of the two cubes directly below it (its children), swap the item with its smallest child. Stop if the item moves to the bottom level (the leaf level) because it has no children. In this case, 50 is larger than 04, so we swap.

              04
          50      05
        11  23  37  41

    50 is still larger than its children, so we need to swap again, resulting in this.

              04
          11      05
        50  23  37  41

    And we’re done because the node is now at the leaf level.

If you always follow those rules, the smallest item will always end up on the top. For example, if you removed the smallest now, you’d get this progression:

            41          Move 41 to the top spot
        11      05
      50  23  37
      --------------
            05          Swap 41 with 05
        11      41
      50  23  37
      --------------
            05          Swap 41 with 37
        11      37
      50  23  41

That takes care of removing an item. What about adding an item? That has some simple rules, too.

  1. Add the item to the first empty spot in the structure. That is, the lowest, leftmost unoccupied position. Let’s say we want to add a cube with the number 3 to the structure directly above, resulting in:

              05
          11      37
        50  23  41  03
  2. While the item is smaller than its parent (the cube above it), swap the item with its parent. Obviously we’ll stop if the item gets to the top, because the root has no parent. In this case, 03 is smaller than 37, so we end up with:

              05
          11      03
        50  23  41  37

    03 is smaller than 05 (its new parent), so we swap again, leaving us with:

              03
          11      05
        50  23  41  37

To add a cube with number 38, you’d add it to the structure below and to the left of 50. 38 is less than 50, so you’d swap the two cubes and you’d be left with:

              03
          11      05
        38  23  41  37
     50

If you always follow those rules for adding and removing, the smallest cube will always be on the top and you will always be able to do quick additions and removals.

That structure looks like it’s in crazy order, I know. But if you play with it a little more, adding and removing items, you’ll find that it really does work. It would take you maybe an hour of practice in applying the rules before you’re ripping right along, easily keeping up with the people coming by to drop cubes on your counter or to get the smallest one. It’s a lot faster.

When you remove a cube you take the last cube from the structure, place it at the top, and sift it down. It should be obvious that you can’t sift down more levels than there are in the structure. So sifting the cube down will never take more operations than there are occupied levels. In that first example there were three levels after we moved 50 to the top spot, so it couldn’t have required more than three swaps to move the cube into its final position.

Adding an item is similar. When you add a cube, you bubble it up. You can’t bubble up more levels than there are in the structure, so adding an item will never require more operations than there are occupied levels.

So the number of levels is the deciding factor. If there are n levels, it will never take more than n swaps to add or remove an item.

With each level containing twice as many cubes as the previous level, it takes very little effort to work with a fairly large number of cubes. This table shows how many items fit into a structure of a given height.

    Levels    Total cubes
       1              1
       2              3
       3              7
       4             15
       5             31
       6             63
       7            127
       8            255
       9            511
      10           1023

You still need lots of horizontal space, but only half as much as you would with a sorted list. The last level of a 10-level structure would contain 512 cubes. Still, this type of structure wouldn’t be bad at all for working with, say, 100 cubes. Your bottom row would only have 64 items.

The really nice thing, though, is that with 100 items you never have to make more than seven moves to insert or remove an item. That’s a far cry from the 100 moves you’d have to make if you were to insert something at the head of a sorted list containing 100 items.

What I have described above is a binary heap: a very useful, efficient, and easy to implement type of priority queue. Although slower than a sorted list when removing items, it’s hugely faster on insertion, more than making up the difference. Considering that you can’t remove more than you insert, it turns out that the “expensive” removal of a binary heap is more than offset by the very inexpensive (in comparison to a sorted list) insertion.

I’ve found that too many textbooks and programming tutorials make binary heaps look a lot harder than they are. You can see from the description of the process above that they are very simple, requiring just a few rules that a child can master after maybe an hour of practice. As you’ll see in my next post, it’s almost that easy to represent a binary heap in code. Stay tuned.

Priority queues

Your job today is to sit behind a counter and manage a group of objects (two-inch cubes, each with a unique number) that people will be working with. Specifically, you have to:

  1. Collect cubes that people come by and drop off.
  2. When somebody asks for a cube, you have to give him the lowest-numbered cube you currently have in your possession. For brevity, we’ll call that the “smallest” cube.

That’s all you have to do.

If you only have to keep track of a few items, you can throw cubes haphazardly on your counter for storage, and quickly pick out the smallest one by inspection whenever somebody asks for it. But if you get more than seven or eight cubes, finding the smallest one starts to take a little time. If you had 100 cubes lying on your desk, unordered, you’d have to search very carefully to find the smallest one.

If you’re told that you have to keep track of 10 or 20 cubes, you could line them up on the counter in order, with the smallest on the left and the largest on the right. If somebody comes by asking for the smallest cube, you just pick it off the left side. If somebody drops one off, you make space for it in the list and put it in the proper spot.

Making space for a new cube in the list involves moving all the cubes from the insertion point forward down one space. So, given this initial list:

    03  05  11  23  37  41

If you need to insert cube number 7, first you have to make space by moving everything down one:

    03  05  XX  11  23  37  41

Then you can put cube 7 where it belongs: in the empty spot shown above by the XX.

That’s not a problem when you have a few dozen items. It gets tiresome pretty quickly, though, if you have 100 or 1,000 cubes to track. Imagine that the list above is just the first part of a list of 1,000 entries. If you insert a new cube with the number 4, you would have to move 998 cubes just to make room for the one you’re inserting. Ouch.

If you have to keep track of more than a few dozen items, the technique of keeping them in order like that becomes time consuming because every time somebody drops off a cube you have to move a bunch of others in order to make space for the new one. Getting the smallest item is trivial, of course, but if you had to insert an item near the front of the list, you’d end up moving almost 1,000 items to make room for it. You’d also need a really long counter. You’d spend a whole lot of time making room for new cubes or shifting cubes to the front as items are removed. The people wanting their smallest cube will have to wait while you’re shuffling things around.

The two tasks you’ve been assigned are exactly the tasks that the priority queue abstract data type performs.

Lots of common algorithms use priority queues, including selection algorithms, graph searching, and Huffman coding, to name just a few. You can also sort with a priority queue.
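As a quick illustration of that last point, sorting with a priority queue is just a matter of loading it up and emptying it. A sketch, using the BinaryMinHeap class from the heap posts above (including the Count property mentioned there):

    int[] data = { 41, 3, 23, 50, 5, 37, 11, 38 };
    var heap = new BinaryMinHeap(data);
    while (heap.Count > 0)
    {
        // Remove always returns the smallest remaining item,
        // so this prints 3 5 11 23 37 38 41 50.
        Console.Write("{0} ", heap.Remove());
    }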

The priority queue, as I said, is an abstract data type that describes what the thing does. It says nothing about how it’s done, and in fact there are many ways to implement a priority queue. I described two of them above: an unsorted list, and a sorted list. Those are simple to implement, but they’re terribly inefficient.

For example, adding an item to an unordered list is very fast (just add it to the end of the list). But to get the smallest item you have to:

  1. Search the entire list to find the smallest item.
  2. Move the item from the end of the list to replace the item that you remove.

The code below shows how you would implement the Insert and Remove methods for a priority queue of integers. It assumes that the items are stored in a List<int>.

    void Insert(int item)
    {
        _items.Add(item);
    }

    int RemoveFirst()
    {
        int minIndex = 0;
        for (int i = 1; i < _items.Count; ++i)
        {
            if (_items[i] < _items[minIndex])
            {
                minIndex = i;
            }
        }
        int result = _items[minIndex];
        // now remove the item by moving the last
        // item to where the minimum item was.
        _items[minIndex] = _items[_items.Count-1];
        _items.RemoveAt(_items.Count-1);
        return result;
    }

Inserting is very fast; it takes constant time. Removing an item requires that you search the entire list. So the longer the list gets, the longer it takes to remove an item.

You can do slightly better with a sorted list. If you maintain the list in descending order, then removing an item takes constant time; you just remove the item from the end of the list. Inserting an item, though, requires that you:

  1. Perform a binary search to find where the item is supposed to go.
  2. Move everything down one place to make room for the new item.

The code looks something like this:

    void Insert(int item)
    {
        int insertionSpot =
            _items.BinarySearch(item, new ReverseIntComparer());
        if (insertionSpot < 0)
        {
            insertionSpot = ~insertionSpot;
        }
        // insert it at the proper position
        _items.Insert(insertionSpot, item);
    }

    class ReverseIntComparer: IComparer<int>
    {
        public int Compare(int x, int y)
        {
            return y.CompareTo(x);
        }
    }

    int RemoveFirst()
    {
        int result = _items[_items.Count-1];
        _items.RemoveAt(_items.Count-1);
        return result;
    }

Finding the insertion spot is pretty fast; it takes a maximum of 10 probes for BinarySearch to find an item in a sorted list of 1,000 items. The call to _items.Insert, though, will move everything down by one position to make room for the new item. Assuming that items are inserted in random order, Insert will have to move an average of Count/2 items. If items are inserted in ascending order, then every item is added to the head of the list and Insert will always move Count items.

So with an unsorted list Insert is fast and Remove is slow. With a sorted list, Remove is fast and Insert is slow. In general, a sorted list will perform faster than an unsorted list, but not by a whole lot.

One other common operation on priority queues makes the sorted list preferable. Peek lets you look at the smallest item without removing it. That would be fast with the sorted list, but slow with the unsorted list because you have to search for the smallest item.
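For the descending sorted list, Peek is trivial; for the unsorted list it’s the same scan that RemoveFirst does. A sketch of the sorted-list version:

    int Peek()
    {
        // The list is sorted in descending order,
        // so the smallest item is at the end.
        return _items[_items.Count - 1];
    }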

It would be nice if we could make both of those operations fast, and there are many ways to do that. A popular and well-performing priority queue implementation is the heap data structure, which will be the subject of my next several posts.

Reading data from streams, part 3

This is the third and last in my short series about using streams in C#. The other two entries are Part 1 and Part 2. In this final part, I want to talk a little about messaging protocols.

Many programmers get their introduction to non-file streams by trying to write a simple chat application: something that lets two computers connect and trade messages. It’s a good way to familiarize oneself with networking concepts and .NET objects like Socket, TcpClient, TcpListener, and NetworkStream.

Very often, the programmer gets stumped because he expects the receiver to receive things exactly as the sender sent them. For example, he might write this sending code:

    TcpClient client = InitializeConnection();
    NetworkStream strm = client.GetStream();
    string msg = "Hello, world";
    byte[] msgBytes = Encoding.UTF8.GetBytes(msg);
    strm.Write(msgBytes, 0, msgBytes.Length);

And the receiver:

    TcpClient client = InitializeConnection();
    NetworkStream strm = client.GetStream();
    byte[] buffer = new byte[1024]; // largest message is 1024 bytes
    int bytesRead = strm.Read(buffer, 0, buffer.Length);
    string msg = Encoding.UTF8.GetString(buffer, 0, bytesRead);
    Console.WriteLine(msg);

I’m not going to worry here about how the TcpClient instances are initialized. The point here is how to use the NetworkStream. I’m also going to limit my discussion to NetworkStream because it’s simpler, it matches what I’ve been talking about, and you’ll likely end up using it if you use TcpClient. You really don’t need to delve into the mysterious world of sockets for most applications.

If you read Part 1 of this series, you’ll understand the first problem with the code above: there’s no guarantee that the call to Read will return all of the bytes that the sender sent. It might only get the first five characters.

Part 1 also showed how to call Read in a loop until you’ve received all the data you expect, or until the end of the stream is reached. But that won’t work in this case. If I were to write the loop:

    int totalBytesRead = 0;
    while ((bytesRead = 
        strm.Read(buffer, totalBytesRead, buffer.Length-totalBytesRead)) != 0)
    {
        totalBytesRead += bytesRead;
    }
    string msg = Encoding.UTF8.GetString(buffer, 0, totalBytesRead);
    Console.WriteLine(msg);

That code will never terminate. Well, it will: when the connection is closed or when the sender has sent 1,024 bytes. The loop will read everything available and then block waiting for more. Remember, the documentation for Stream.Read says:

The implementation will block until at least one byte of data can be read, in the event that no data is available. Read returns 0 only when there is no more data in the stream and no more is expected (such as a closed socket or end of file).

The documentation for NetworkStream is, I think, somewhat ambiguous on the issue:

This method reads data into the buffer parameter and returns the number of bytes successfully read. If no data is available for reading, the Read method returns 0. The Read operation reads as much data as is available, up to the number of bytes specified by the size parameter. If the remote host shuts down the connection, and all available data has been received, the Read method completes immediately and returns zero bytes.

It should point out explicitly that the only time it will return 0 is when the underlying socket is closed and there are no more bytes available.

So we have what looks like a Catch-22: we can’t assume that a single read will give us what we need, and we can’t continue to read until there isn’t any more because we can’t tell where the end is. All we know is that we haven’t received any more yet.

How to resolve this situation?

What makes a text file different from other types of files? There’s no file system support for “text files.” It’s not like NTFS has a CreateTextFile function that you call to create a text file. As far as the file system is concerned, a text file is a database file is an image file is a music file, etc. They’re all just streams of bytes. A text file is just a stream of bytes to which we have added some order. Specifically, we’ve agreed that the file contains lines that are delimited by newline characters. The data describes its own format.

The simple solution to the problem above is to terminate each message with a newline. So the sender would send "Hello, world\n". The receiving end is a little more involved:

    int totalBytesRead = 0;
    while ((bytesRead = 
        strm.Read(buffer, totalBytesRead, buffer.Length-totalBytesRead)) != 0)
    {
        int start = totalBytesRead;
        totalBytesRead += bytesRead;
        int newlinePos = SearchBufferForNewline(buffer, start, totalBytesRead-start);
        if (newlinePos != -1)
        {
            // found a newline in the buffer.
            // get that message string and display it
            string msg = Encoding.UTF8.GetString(buffer, 0, newlinePos);
            Console.WriteLine(msg);
            // now copy remaining bytes to the front of the buffer,
            // and adjust totalBytesRead as appropriate
            totalBytesRead = totalBytesRead - (newlinePos + 1);
            Buffer.BlockCopy(
                buffer, newlinePos+1,
                buffer, 0,
                totalBytesRead);
        }
    }

Or you could go one better and wrap a StreamReader around the NetworkStream:

    var reader = new StreamReader(strm);
    string line;
    while ((line = reader.ReadLine()) != null)
    {
        Console.WriteLine(line);
    }

The point is that the stream itself isn’t going to tell you where the end of a message is.

Typically, there are three ways to know how much data you need to read:

  1. Messages are a known fixed length in bytes, padded with spaces or zeros or some other character. For example, you might decide that all message packets between machines will be 128 bytes long. It’s wasteful, but it makes programming pretty easy.
  2. Messages are delimited, as in the text example above. This works very well for text protocols. You can make it work with binary protocols, but it takes some special code because the delimiter might very well be part of the binary data. For example you might say that records are delimited by a byte with the value 0xFF. But what happens if 0xFF occurs as part of a record?
  3. The first bytes of a message include a fixed-length header that says how many bytes follow. This is very effective. For example, say the first two bytes of the message header are a short integer that says how long the rest of the message is. Your receiver then has to read the first two bytes, convert that to a short integer, and then read that many following bytes. Using the loop, of course, to ensure that it doesn’t get a short packet.

There are variations on the above. For example, the first byte of the message could be the message type, and you know that message type 1 is 27 bytes long, message type 2 is 18 bytes, etc. Again, the point is that the data, not the stream, knows how long it is. All the stream knows is when it’s reached its end, and it knows that only because the sender closed the connection.
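Here’s a minimal sketch of the length-prefix approach (number 3 above). ReadExact is a hypothetical helper that loops on Read until it has the requested number of bytes, returning false if the stream ends first; Part 2, below, shows what such a helper might look like.

    // Read the two-byte header that says how long the message body is.
    byte[] header = new byte[2];
    if (ReadExact(strm, header, 2))
    {
        // Convert the header to a length (ignoring byte-order issues here).
        int messageLength = BitConverter.ToInt16(header, 0);

        // Now read exactly that many bytes of message body.
        byte[] body = new byte[messageLength];
        if (ReadExact(strm, body, messageLength))
        {
            Console.WriteLine(Encoding.UTF8.GetString(body));
        }
    }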

So that’s the short course on things that trip people up when they’re working with streams. It’s unfortunate that most tutorials leave a lot of that stuff out. Sometimes the tutorial contains the information but it’s not explored in detail. And very often programmers see the first part of a tutorial and figure, “I’ve got it.” We’re an impatient lot, sometimes. Whatever the case, I see lots of Stack Overflow questions describing problems that are caused by one or more of the misunderstandings I talked about in this series.

Happy coding.

Preparing to fail

Before she went to bed last night, Debra took the chicken out of the crock pot where it had been cooking all day and set it in a container on the counter to cool. She asked me to put it in the refrigerator before I went to bed. Stupidly, I agreed.

After she went to bed, I got up off the couch and went to do some work on the computer for a few hours until I started nodding off. I got up, turned off the lights, and went to bed.

Yes, I left the chicken sitting out on the counter. Three pounds of chicken and a full day of cooking wasted. I’m pretty relaxed when it comes to what constitutes “bad” food, but even I wouldn’t chance eating chicken that’s been sitting out at room temperature for eight hours.

I should have known better than to agree to do something “before I go to bed.” I should have set an alarm on my phone for an hour or two later rather than say to myself, “remember to put the chicken away before going to bed.” I don’t consult a To Do list before I go to bed. I have a routine that I rarely deviate from. By agreeing to modify that routine, I set myself up for failure.

So now Debra is understandably disappointed with me. She also has to go buy more chicken and spend the day cooking it. Again.

Next time I’ll set an alarm.

Do not suspend or abort threads

I’ve said a number of times that calling Thread.Abort is like stopping a car by shooting the driver in the head. The car will stop, but there’s no telling what damage it’ll do in the process.

Calling Thread.Abort causes the runtime to throw a ThreadAbortException in the target thread, interrupting whatever it was doing. You can catch ThreadAbortException and even prevent it from killing your thread, but you can’t prevent the thread abort from breaking execution. For example, say you have a thread that’s executing this method:

    void MyThreadProc(object state)
    {
        while (true) // go forever
        {
            // process an item
        }
        Console.WriteLine("Thread exited normally.");
    }

If the main thread calls Abort on that thread, the thread will terminate. Whatever the thread was doing is aborted and the thread exits the method without writing anything to the console.

The first and most important problem here is that when you call Abort, whatever is happening in that loop is interrupted. Your program could be in the middle of a critical multi-part data update, could be holding a mutex, have allocated critical system resources, etc. When the thread is aborted, the update is left unfinished. Unless you’re religious about using try/finally, any mutex you held continues to be held, critical system resources aren’t released, etc. Even if you do clean up allocated resources, you still have the problem of an unfinished data update. Your program is left in an unknown and therefore potentially corrupt state from which you can’t reliably recover.

Catching ThreadAbortException helps in that you can probably structure your code so that the exception handler does some cleanup and even notifies you that the thread was terminated unexpectedly.

    void MyThreadProc(object state)
    {
        try
        {
            while (true) // go forever
            {
                // process an item
            }
            Console.WriteLine("Thread exited normally.");
        }
        catch (ThreadAbortException)
        {
            Console.WriteLine("Thread aborted.");
            // do cleanup here
        }
        Console.WriteLine("End of thread.");
    }

Now if you abort the thread, the catch block is executed. Still, whatever the thread was doing inside the loop is interrupted. And ThreadAbortException is special. Unlike most other exceptions, which are done once you catch them, ThreadAbortException is re-thrown by the runtime. You would never see the “End of thread” message here if the thread is aborted.

The discussion above should disabuse you of the notion that depending on Thread.Abort is a good idea. But let’s say that you know your thread isn’t doing any of those potentially dangerous things, so you think you’re safe.

Then comes the next problem: you can’t guarantee that Thread.Abort will actually abort the thread. The thread could catch ThreadAbortException and then call Thread.ResetAbort to prevent the exception from being re-thrown. That prevents Thread.Abort from terminating the thread. For example:

    void MyThreadProc(object state)
    {
        while (true)
        {
            try
            {
                while (true) // go forever
                {
                    // process an item
                }
                Console.WriteLine("Thread exited normally.");
            }
            catch (ThreadAbortException)
            {
                Console.WriteLine("No! I'll stop when I'm good and ready.");
                Thread.ResetAbort();
            }
        }
        Console.WriteLine("End of thread.");
    }

That doesn’t prevent Abort from interrupting the loop, though. Even with ResetAbort, whatever is happening in the inner loop is vulnerable. You can’t prevent Abort from interrupting the thread.

Don’t make the mistake of thinking that surrounding your critical code with calls to Thread.BeginCriticalRegion and Thread.EndCriticalRegion will solve the problem. Those might prevent the runtime host from throwing ThreadAbortException, true. It might instead just tear down the app domain, terminating the entire program. Oops.

It should be clear by now that calling Thread.Abort is either dangerous or useless. There is no good reason to use it. Use cancellation if you need the ability to terminate threads. That way, the thread cooperates in the shutdown and nothing critical gets interrupted; the thread can exit when it is safe to do so.
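A minimal sketch of what that cooperative shutdown might look like, using a CancellationToken (the loop body is a placeholder):

    void MyThreadProc(object state)
    {
        var token = (CancellationToken)state;
        while (!token.IsCancellationRequested)
        {
            // Process an item. The thread only checks for cancellation
            // between items, so nothing is interrupted mid-update.
        }
        Console.WriteLine("Thread exited normally.");
    }

    // The controlling code:
    //     var cts = new CancellationTokenSource();
    //     new Thread(MyThreadProc).Start(cts.Token);
    //     ...
    //     cts.Cancel();  // requests a stop; interrupts nothing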

As bad as Thread.Abort is, Thread.Suspend is even worse. You can potentially write code that gracefully handles a thread abort because all relevant finally blocks are executed when ThreadAbortException is thrown. The aborted thread could have finally blocks that free allocated resources, release locks, etc. But Thread.Suspend? That stops a thread in its tracks. The thread just halts. Instantly. Whatever it was doing is suspended until some code calls Thread.Resume. There is no unwinding of the stack or executing finally blocks. The. Thread. Just. Stops. So locks are held, resources remain allocated, updates remain “in progress,” etc. The thread is frozen in time.

That’s never a good idea.

There are much safer ways to make threads suspend processing. Perhaps the easiest is to implement something similar to cancellation, using a ManualResetEventSlim. For example:

    ManualResetEventSlim ContinueFlag = new ManualResetEventSlim(true);

    void MyThreadProc(object state)
    {
        while (true)
        {
            // Wait returns immediately while the event is set;
            // it blocks while the event is reset.
            ContinueFlag.Wait();
            // process
        }
    }

The call to Wait will block until something sets the event. The event remains set until some code resets it. If you want to pause the thread, simply call ContinueFlag.Reset. The thread will finish the current iteration of the loop, and then block on the call to Wait until some other thread calls ContinueFlag.Set.
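From the controlling code, pausing and resuming would then look something like this:

    ContinueFlag.Reset();   // worker finishes its current iteration, then blocks
    // ... do whatever requires the worker to be idle ...
    ContinueFlag.Set();     // worker wakes up and continues processing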

If you want to support cancellation as well as pausing, you can pass the cancellation token to Wait, like this:

    try
    {
        while (true)
        {
            // Wait blocks while the event is reset, and throws
            // OperationCanceledException if cancellation is requested.
            ContinueFlag.Wait(cancelToken);
            // process
        }
    }
    catch (OperationCanceledException)
    {
        Console.WriteLine("Cancellation requested.");
    }

Again, the thread participates in the cancellation, so whatever is happening in the loop is not affected by the cancellation.

If the continue flag is a ManualResetEvent, you’ll have to use a technique similar to that described in How to: Listen for Cancellation Requests That Have Wait Handles.

Your best bet is to forget that Thread.Abort, Thread.Suspend, and Thread.Resume even exist. I can think of very few good uses for Abort (think of it as the goto of multithreaded programming), and no good use for Suspend and Resume. Don’t use them. If you do, you’ll regret it.

Reading data from streams, part 2

This is the second in a short series about reading data from streams in C#. In the first post I showed that you can’t depend on a Stream to give you all the bytes you ask for when you call Read, even if you know that those bytes are or will soon be available.

A lot of programmers don’t understand that point when they first start working with streams other than FileStream, and when they figure it out or somebody explains it to them, they often ask something like:

Why don’t streams work like files?

Which is the wrong question to ask. FileStream is a special case of Stream, not the other way around. FileStream works the way it does because it can. But not all streams can work the way file streams can, because not all types of streams have all the information that a file system has. So let’s forget about file streams for a bit and talk about why the Stream API works the way it does, which is the real question those programmers are asking.

Stream is, well, a stream or river of bytes. Imagine sitting at the mouth of the Mississippi River, watching the water flow into the Gulf of Mexico. Does that stream of water have a beginning? Certainly the river has a beginning somewhere up in northern Minnesota, but that’s not where the stream of water begins. Nor is it where all the water comes from. The water comes from all over the country and has been flowing down to the Gulf in its current form since the last Ice Age. And although I’m certain there will be an end to that mighty river at some point in the future, I’m convinced that I won’t see it. The water just keeps on flowing. For all practical purposes, the Mississippi river had no beginning and has no end.

Stream is just like that. There’s a beginning, but once you’ve consumed data from a stream, you can’t get it back. Just like you can’t get the water back from the Mississippi once it’s flowed into the Gulf. You also can be assured that there will be an end to the stream. Eventually. But you don’t know where that end is. The stream might end after the next byte, or there might be several terabytes still to be read. You can’t know, in the general case.

Let’s say you write a program that reads data from a stream in 64 kilobyte chunks, and processes those chunks. Imagine that there’s a “give me everything I asked for” function that always returns exactly the number of bytes that you ask for, unless the stream ends in which case it gives you everything up to the end of the stream. In short, it works the way that FileStream.Read appears to work. I’ll call the function ReadExact, since “give me everything I asked for” is a handful to type and typing it annoys me because I rarely get everything I ask for anyway. Digressions aside, the program would look like this.

    byte[] buffer = new byte[65536];
    // ReadExact returns 0 when you try to read
    // after end of stream has been reached.
    int bytesRead = theStream.ReadExact(buffer);
    while (bytesRead != 0)
    {
        // process the buffer
        ProcessData(buffer, bytesRead);
        // and refill the buffer
        bytesRead = theStream.ReadExact(buffer);
    }

With that code, you would end up calling ProcessData with a long sequence of buffers containing 65,536 bytes, and one final buffer that contains fewer bytes than that. Unless the stream’s length was an exact multiple of 65,536 bytes.

And there’s my first point. Unless you can guarantee that the stream’s length will be an exact multiple of whatever buffer size you choose, your program has to handle short buffers. That is, you must be able to handle the case that ReadExact returned fewer bytes than you asked for. In this case, ReadExact doesn’t provide any benefit over the existing Read functionality.

Still, there might be use for ReadExact when you only need 100 bytes and you know that the stream has 100 bytes. Except you can’t know. Maybe somebody cut the network cable and only 50 bytes came across. Or maybe the sender got hung up or there was a bug and although it said 100 bytes were to follow, only 23 bytes came down the wire. Any number of things can happen. How should ReadExact react in any of those situations? Should it time out or wait forever? Is a timeout or end of stream an error or an expected condition? The desired behavior is different for every application. There just isn’t a generally accepted way for such a method to work. That’s another reason why it isn’t supplied.

Not only are streams essentially infinite, they’re also arbitrarily fast (or slow). Many programs (perhaps most?) can process data a whole heck of a lot faster than they can read the data. It makes sense to process data in small chunks as it comes in rather than to read everything and then process it. It might not even be possible to read everything because, again, the stream is essentially infinite.

For all those reasons, it just doesn’t make sense to have a general ReadExact method. You’re free to create one, but I think you’ll find that it’s not as useful as you thought it would be when you started. It’s just easier to embrace the way that streams work, and write your programs so that they can handle data as it trickles in.
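For what it’s worth, here’s a minimal sketch of such a ReadExact helper. It’s nothing more than the standard read loop wrapped in a method:

    // Reads exactly count bytes from the stream into buffer.
    // Returns true if all count bytes were read, or false if the
    // stream ended first. Blocks as long as the stream might still
    // deliver more data.
    static bool ReadExact(Stream stream, byte[] buffer, int count)
    {
        int totalBytesRead = 0;
        while (totalBytesRead < count)
        {
            int bytesRead = stream.Read(
                buffer, totalBytesRead, count - totalBytesRead);
            if (bytesRead == 0)
            {
                return false; // end of stream before we got everything
            }
            totalBytesRead += bytesRead;
        }
        return true;
    }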

Last time I said that FileStream.Read appears to always fill the buffer, except when it reaches end of file. I went digging in the .NET Framework Source, to see what FileStream.Read really does, and I found this comment:

    // We may have read less than the number of bytes the user asked
    // for, but that is part of the Stream contract.  Reading again for
    // more data may cause us to block if we're using a device with
    // no clear end of file, such as a serial port or pipe.  If we
    // blocked here & this code was used with redirected pipes for a
    // process's standard output, this can lead to deadlocks involving
    // two processes. But leave this here for files to avoid what would
    // probably be a breaking change.         --

Then follows some code that refreshes the stream buffer if the file isn’t a pipe handle, and tries to fulfill the read request.

In other words, if you’re using FileStream.Read on a normal disk or network file, then the current implementation ensures that you get what you ask for. But if you just have a FileStream from some unknown source, there is no such guarantee. And, as I pointed out last time and the comment verifies, that behavior is just an implementation detail. It’s unlikely that the .NET Framework team will change this behavior, so you’re probably safe writing code that assumes FileStream.Read will fill the buffer if it can, provided you know that the stream is wrapping a file handle. But if you don’t know where the FileStream came from (i.e. it’s a pipe), then counting on that behavior could get you into trouble.

Overall, I’d say you’re better off writing code that checks the return value of FileStream.Read, and reads more if the buffer wasn’t full and end of stream wasn’t reached.

The real reason the Stream API works the way it does is because it has to work that way. The API exactly reflects how data streams work.

FileStream works the way it does because it can. The file system contains a lot of information about a file (the size, in particular) that can’t be known about streams in general, and because the file is held there on disk you can seek to the start or end of the file, or anywhere in the middle to read or write whatever you want. And the file system is fast when compared to streams in general. A FileStream is really a bunch of file-related functions that, among the many things it does, lets you read the file in the same way you would read any other type of Stream. Expecting the Stream API to work like a FileStream is kind of like expecting a flintlock to work like an AK-47.

In this entry and the first one, I’ve been discussing streams of bytes without assuming any format to those bytes. Most streams aren’t just bytes. Rather, they’re bytes with some defined format. Maybe they’re lines of text, binary records that are 37 bytes long each, or commands separated by newlines, semicolons, or some other delimiters. Very often, those lines, records, or commands are part of a client-server setup that uses a request/response model: the client makes a request, the server sends a response, client makes another request, etc. Because of the way streams work, handling those types of requests can be a little more involved than you might expect. That’s what I’ll talk about next time.

Reading from streams, part 1

This is the first in a short series about reading data from streams in C#. In this initial post, I’m going to discuss reading from a stream that has a defined end. You might not know where the end is, but you can be assured that if you continue reading you will reach the end. This is common with files, for example, and when reading data from Web requests (downloading a Web page). I’ll discuss messaging protocols in a subsequent post.

C# programmers who are accustomed to reading files with FileStream often become confused when working with NetworkStream or other types that derive from Stream or that have an underlying Stream, like SerialPort. The confusion arises from assumptions about how the Read method works.

The code to read 100 bytes from a FileStream is straightforward; you request 100 bytes and that’s what you get. For example:

    byte[] buffer = new byte[100];
    int bytesRead;
    using (FileStream f = File.OpenRead(filename))
    {
        bytesRead = f.Read(buffer, 0, 100); // request 100 bytes
    }
    // do something with the data read

Actually, I lied. I said that when you ask for 100 bytes, that’s what you get. In truth, you get up to 100 bytes. As the documentation for FileStream.Read says:

The returned value is the actual number of bytes read, or zero if the end of the stream is reached.

The Read method returns zero only after reaching the end of the stream. Otherwise, Read always reads at least one byte from the stream before returning. If no data is available from the stream upon a call to Read, the method will block until at least one byte of data can be returned. An implementation is free to return fewer bytes than requested even if the end of the stream has not been reached.

(The emphasis is mine.)

In practice, at least in my experience when writing .NET applications on Windows, FileStream.Read always returns the number of bytes requested or the number of bytes remaining in the file. So if I ask for 100 bytes and the file is only 85 bytes long, I’ll get 85 bytes. The function only returns zero if I asked to read when the file was already positioned at the end, and it never returns less than I asked for unless it reaches the end of the file. So in the code above, the value of bytesRead would always be either 100 or, if the file contains fewer than 100 bytes, the total number of bytes in the file.

It turns out, though, that depending on this behavior is wrong, as you will see below.

The documentation for Stream.Read has wording that’s somewhat similar to that of FileStream.Read:

Implementations of this method read a maximum of count bytes from the current stream and store them in buffer beginning at offset. The current position within the stream is advanced by the number of bytes read; however, if an exception occurs, the current position within the stream remains unchanged. Implementations return the number of bytes read. The implementation will block until at least one byte of data can be read, in the event that no data is available. Read returns 0 only when there is no more data in the stream and no more is expected (such as a closed socket or end of file). An implementation is free to return fewer bytes than requested even if the end of the stream has not been reached.

Again, the emphasis is mine.

The last sentence is particularly important. NetworkStream.Read and SerialPort.Read frequently return fewer bytes than requested. As an illustration, consider this code that reads the HTML from my blog.

    HttpWebRequest request =
        (HttpWebRequest)WebRequest.Create("http://blog.mischel.com/");
    using (HttpWebResponse response =
        (HttpWebResponse) request.GetResponse())
    {
        // limit data to 1 megabyte
        const int maxLength = 1024*1024;
        byte[] buffer = new byte[maxLength];
        int totalBytesRead = 0;
        int totalRequests = 0;
        using (Stream s = response.GetResponseStream())
        {
            while (totalBytesRead < buffer.Length)
            {
                int bytesRequested = buffer.Length - totalBytesRead;
                int bytesRead = s.Read(buffer, totalBytesRead, bytesRequested);
                ++totalRequests;
                if (bytesRead == 0)
                    break;
                Console.WriteLine("{0:N0} bytes requested. {1:N0} bytes read.",
                    bytesRequested, bytesRead);
                totalBytesRead += bytesRead;
            }
        }
        Console.WriteLine("Done: {0:N0} total bytes read in {1:N0} requests.",
            totalBytesRead, totalRequests);
    }

Note that although there is a ContentLength property in the response object, some Web sites reply with an invalid Content-Length header and others supply 0 or -1. Rather than handle those exceptional cases, I just set an upper limit on the size of download I will accept, and read up to that many bytes, or until I reach the end of the stream.

When I run that program, here’s the output I get:

    1,048,576 bytes requested. 123 bytes read.
    1,048,453 bytes requested. 43 bytes read.
    1,048,410 bytes requested. 12 bytes read.
    ...
    ... Lots of other read requests
    ...
    903,066 bytes requested. 5 bytes read.
    903,061 bytes requested. 15 bytes read.
    903,046 bytes requested. 99 bytes read.
    902,947 bytes requested. 16 bytes read.
    Done: 145,645 total bytes read in 380 requests.

On the first call I asked for a megabyte. I got 123 bytes. The number of bytes I received when making requests varied from 1 to about 10,000.

As you can see, you can’t depend on a stream to give you all the bytes you requested. You have to keep asking it for pieces until it’s given you what you want, or until it returns 0 to indicate that there isn’t any more data forthcoming.

I pointed out above that FileStream.Read doesn’t appear to have this behavior. But the documentation indicates that it could. Apparently the .NET development team thinks it could, too, because the File.ReadAllBytes method allocates a buffer and does multiple reads. That is, it does essentially this:

    using (FileStream fs = new FileStream(...))
    {
        int count = (int)fs.Length;
        byte[] bytes = new byte[count];
        int index = 0;
        while (count > 0)
        {
            int n = fs.Read(bytes, index, count);
            if (n == 0)
            {
                // error, unexpected end of file
            }
            index += n;
            count -= n;
        }
        return bytes;
    }

I removed some error checking (in particular, the length check to ensure that the file isn’t larger than 2 gigabytes), but it’s plain that they don’t expect FileStream.Read to return them the exact number of bytes requested.

Next time I’ll explain a bit about why streams work this way.

Making a collection immutable

A friend recently asked for suggestions about creating a class that is mutable up to a point (while it’s being constructed), and then immutable once clients begin to use it. His solution was to add a flag that could be set to make the thing read-only. For example, the class below is mutable up to the point when the _inUse field is set.

    public static class LabelsArray
    {
        private static Dictionary<int, string> _items = new Dictionary<int,string>();
        private static bool _inUse = false;

        public static void SetInUse()
        {
            _inUse = true;
        }

        public static void LoadFromFile(string f)
        {
            if (_inUse)
                throw new ApplicationException("no modifications allowed.");
            // load labels from a file
        }

        public static void AddLabel(int key, string val)
        {
            if (_inUse)
                throw new ApplicationException("no modifications allowed.");
            // add a label
        }

        public static string this[int key]
        {
            get { return _items[key]; }
        }

        public static bool TryGetValue(int key, out string val)
        {
            return _items.TryGetValue(key, out val);
        }
    }

Code that uses this class would initialize it by calling LoadFromFile and AddLabel as appropriate and, before creating any objects that use the labels, would call SetInUse to prevent it from being modified. Something like this:

    LabelsArray.LoadFromFile("foo");
    LabelsArray.AddLabel(42, "programmer");
    LabelsArray.SetInUse();
    // The object can no longer be modified

His application was somewhat more complex than that, but that’s essentially what he described.

In a program that only needs one LabelsArray, this model works well. It’s quick to implement and does the job. It’s not a pattern you’d want to use if you have multiple labels arrays, though, unless you want to duplicate all that code for every different array. Also, that _inUse flag is kind of messy. Wouldn’t it be nice if you could implement this without having to resort to a flag?

How about writing code that constructs a dictionary and then wraps it in a read-only container? Something like this:

    public ReadOnlyDictionary<int, string> CreateLabelsArray()
    {
        var dict = new Dictionary<int, string>();
        // code here loads data from file,
        // adds other key/value pairs, etc.
        // And, finally:
        return new ReadOnlyDictionary<int, string>(dict);
    }

Clients make a single call to create the dictionary and get a read-only reference to it:

    var myLabels = CreateLabelsArray();

ReadOnlyDictionary creates a read-only wrapper around the dictionary that you pass to the constructor. Because the CreateLabelsArray method created the original dictionary and only the ReadOnlyDictionary instance maintains a reference to it, there is no way that the dictionary can be modified.

I like this solution because it eliminates the flag, and I can easily create multiple labels arrays. If I really need that labels array to be static, I can declare a static reference and avoid the static class. In my experience, you should be careful with static classes.

If the collection you’re working with isn’t a dictionary, you might need a ReadOnlyCollection (essentially an immutable List<T>) rather than a ReadOnlyDictionary.
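The pattern is the same. A sketch, with a hypothetical CreateLabels builder analogous to CreateLabelsArray above:

    public ReadOnlyCollection<string> CreateLabels()
    {
        var list = new List<string>();
        // code here populates the list
        // Wrap it; the only reference to the mutable list goes out
        // of scope here, so nothing can modify it later.
        return list.AsReadOnly();
    }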

Those two collections are very useful in ensuring that critical data is not inadvertently modified. You’ll be surprised by how many problems you can solve by using these two collection types with a technique similar to what I showed above. If ReadOnlyDictionary and ReadOnlyCollection don’t quite fit your requirement, it’s pretty easy to create your own class that works in much the same way.

In the example I showed above, the dictionary returned by CreateLabelsArray is guaranteed to be thread-safe because there is no possible way that the underlying dictionary can be modified. Clients can query it with impunity. If instead you were to write code that maintains a reference to the underlying mutable dictionary, then you have a potential problem. Clients that reference the ReadOnlyDictionary can’t modify it, but they might fail if some other part of the code that has a reference to the underlying dictionary modifies the dictionary concurrently with clients’ queries.

If you’re going to use ReadOnlyDictionary or ReadOnlyCollection, especially in multithreaded code, I strongly recommend that you use the model I showed above: create the underlying collection, instantiate the read-only version, and then discard your reference to the mutable collection. If you maintain a reference to the mutable collection, you could inadvertently modify it and create a problem that is often difficult to track down.