Unintended consequences

Changing any non-trivial program will almost invariably have unintended consequences. Nowhere is this more readily apparent than when you’re modifying a program that has multiple threads. A change in the behavior of one thread will change the way that it communicates with other threads, either directly in thread-to-thread messages or indirectly in the thread’s data access pattern. Assumptions about usage patterns creep into the code, usually inadvertently, and when those assumptions are no longer true, all manner of odd things happen.

These things are usually obvious after the fact. For example, I have a program that loads a list of items from disk into a queue, and then begins processing those items. The program has many different threads that do various things, and it can process hundreds of items concurrently. Recently I got the brilliant idea of loading the queue asynchronously so that the program can begin processing items immediately. No reason, I thought, to wait the few minutes to load 10 million items before I start processing.

It was a fairly easy change to spawn a loader thread, and I had everything working again in a few minutes. I started the program, the loader thread began loading, and immediately my worker threads started processing items. All was well with the world and I congratulated myself on a job well done.

But there’s a twist. You see, in processing items, the worker threads put even more items in the queue. There’s a process that prunes the queue from time to time so that it doesn’t grow without bound, but the key here is that when it comes time to shut down the program, I need to save the queue state so that I can pick up where I left off the next time the program starts. Still no problem, right?

Unless I try to shut down the program before it’s finished loading the queue. In that case, bad things happen. If I just assume that the loader is done, then the code that writes the queue to disk will fail when it tries to open the file for writing, because the loader still has it locked. If I code the loader thread to see the shutdown message and stop loading, then the current contents of the queue in memory will overwrite the queue on disk, and I lose every item that the loader hadn’t yet read from the file.

The only solution is to allow the loader to finish before trying to save the queue to disk.
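Here is a minimal sketch of that ordering. It's written in Python for illustration and is not the actual program. The names `load_queue`, `save_queue`, and `run_and_shut_down` are hypothetical, the queue file is assumed to hold one item per line, and the worker threads are left out. The only point is that shutdown joins the loader thread before it touches the file.

```python
import queue
import threading


def load_queue(path, work_queue):
    """Loader thread: read every saved item from disk into the queue."""
    with open(path) as f:
        for line in f:
            work_queue.put(line.rstrip("\n"))
    # The file is closed here, so the saver can safely reopen it.


def save_queue(path, work_queue):
    """Drain the in-memory queue and overwrite the file on disk."""
    items = []
    while True:
        try:
            items.append(work_queue.get_nowait())
        except queue.Empty:
            break
    with open(path, "w") as f:
        f.writelines(item + "\n" for item in items)
    return len(items)


def run_and_shut_down(path):
    """Start the asynchronous load, then shut down without losing items."""
    work_queue = queue.Queue()
    loader = threading.Thread(target=load_queue, args=(path, work_queue))
    loader.start()

    # ... worker threads would take and add items here (omitted) ...

    # Shutdown: wait for the loader to finish BEFORE saving. Saving first
    # would either collide with the open file or overwrite it with a
    # partially loaded queue.
    loader.join()
    return save_queue(path, work_queue)
```

The cost is that a shutdown requested early in the load has to wait for the rest of the load to complete. In this sketch, that's the price of not losing the unread part of the file.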

In hindsight, of course, this is obvious. But it’s something that a lot of programmers would miss in the initial coding.