Back in October I received an email from the Sam Bass Community Theatre, inviting people to audition for a part in their holiday production of The Best Christmas Pageant Ever. I had no particular desire to act, but I went to volunteer as a stage hand or some such. They gratefully accepted my offer, and a few weeks later called to tell me they were ready for me to start.
I had spoken to the stage manager briefly and was waiting for instructions when the director approached me and asked if I’d consider taking over the role of an actor who had to leave unexpectedly. Having read the script, I knew that the role she was offering had only one scene with five lines of dialogue. She didn’t even ask me to audition; just gave me a script and pushed me backstage.
Seeing as how I only had that one scene, I also played stage hand: doing some minor set construction, arranging the set before each performance and during intermission, and riding herd on 20 children aged from about eight to sixteen. I thought it would be a struggle not to strangle a couple of those kids before the play had finished its run, but once they got into costume, they were very well behaved. Even young children know when it’s time to quit being childish and get to work.
Debra volunteered, too, but her schedule was already pretty full. She did manage to paint a custom backdrop based on the director’s sketch. That turned out great.
Even though it’d been nearly 50 years since the last time I was in a play (I was Dr. Dolittle in a production for my kindergarten class back in … 1967), stage fright wasn’t a problem. I’ve done enough public speaking that getting up in front of an audience of 50 people isn’t particularly daunting. Memorizing five lines of dialogue and delivering them reasonably well isn’t terribly difficult. By the time the first performance rolled around, I’d done the scene often enough that I felt very comfortable with it. Even so, I did manage to fumble a line during one or two performances. Fortunately, the woman with whom I did the scene is much more experienced than I am and was able to make it look like nothing was amiss.
It was a big time commitment: about three hours a night for four or five nights per week, for about eight weeks. But I can’t remember the last time I had so much fun. I especially can’t remember the last time I had that much fun without having to spend any money. And I learned a bit about theater, more about myself, and met a lot of good people who I hope to work with again. I’ve already volunteered to help as a stage hand with their next production, and I’m planning to audition for a part in one of their upcoming productions in the spring.
It was altogether a thoroughly enjoyable experience. If you’ve ever wanted to give acting a try, find out when your local community theater is holding auditions, and go audition! You don’t need any experience. They like new faces and inexperienced actors. Many of the people who produce and direct at community theater are teachers at heart. If you go in with a sincere desire to learn, they will welcome you and help you learn to act and to appreciate the art of theatre.
I can’t say enough good things about the director, stage manager, the other actors, and everybody else who I met there. Every one of them was friendly, welcoming, and helpful. The theatre is the people, and the people I met at Sam Bass Community Theatre are among the best ever.
I’ve uploaded the source code for my Merge LINQ extensions. This also includes the source for my DHeap class and for the TopBy extension method.
The code is supplied in a .zip file and includes a Visual Studio 2013 project file. If you have an earlier version of Visual Studio, you probably won’t be able to read the project file, but it’s easy enough to recreate one from the source files provided.
The code is all under the MIT License, meaning you can do with it what you like. I’d appreciate it if you let me know whether it’s useful to you, but there’s no requirement.
Getting back to my discussion of merging multiple lists.
Testing relative performance of the different merge algorithms turned out to be more involved than I had originally thought it would be. I found the results somewhat surprising.
The short version is that in a normal procedural implementation, the pairwise merge is faster than the heap-based merge, except in rare cases of many lists with very large differences in the lists’ lengths. But when it comes to the LINQ implementation, the heap-based merge is faster, due to complications in the way that multiple sort criteria are handled.
I found the first result very surprising, because the heap-based merge is provably optimal from a theoretical standpoint. But it turns out that managing the heap is expensive compared to just iterating lists. Even more surprising is that the difference in speed increases as the number of lists grows: when merging five lists of the same size, the pairwise merge is perhaps 10 percent faster than the heap-based merge; when merging 10 lists, it’s 15 to 20 percent faster.
At first I thought the difference was because my heap-based merge implementation made use of my generic DHeap class, which adds some overhead. So I rewrote the merge to use a custom heap that’s optimized for this application. That made the heap merge faster, but not enough faster to overtake the pairwise merge.
The relative speed of the two implementations is somewhat sensitive to the keys being compared. As key comparisons become more expensive, the heap-based merge begins to catch up. When merging strings, for example, the heap-based merge runs at nearly the same speed as the pairwise merge. With very expensive comparisons, the heap-based merge is clearly the faster of the two.
The reason for this is simple: the running time of both algorithms is dominated by the number of comparisons. The heap-based merge will never do more than n log k comparisons (n total items across k lists), so when comparison speed becomes the limiting factor, the algorithm that does the fewest comparisons will be the fastest.
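To make the comparison-count argument concrete, here is a minimal Python sketch of a heap-based k-way merge. The article’s real code is C#; the function and names here are illustrative only. Each of the n output items causes one heap pop and at most one push, each costing O(log k) comparisons:

```python
import heapq

def heap_merge(lists, key=lambda x: x):
    # The heap holds one (key, tiebreaker, value, iterator) entry per
    # non-empty input list. The unique tiebreaker index keeps tuple
    # comparison from ever comparing values or iterators on equal keys.
    heap = []
    for i, lst in enumerate(lists):
        it = iter(lst)
        try:
            first = next(it)
        except StopIteration:
            continue  # skip empty lists
        heap.append((key(first), i, first, it))
    heapq.heapify(heap)
    while heap:
        _, i, value, it = heapq.heappop(heap)
        yield value
        try:
            nxt = next(it)
        except StopIteration:
            continue  # this list is exhausted
        heapq.heappush(heap, (key(nxt), i, nxt, it))
```

Merging k sorted lists totaling n items this way does at most roughly n log k key comparisons, which is why it wins once individual comparisons get expensive.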
After testing performance of the algorithms as standard method calls, I began creating LINQ-callable methods such as PairwiseMergeBy. The heap-based merge conversion was straightforward and doesn’t look fundamentally different from the MergeWithBy extension method that I presented a while back.
The pairwise merge works by creating a merging network that does multiple merges in parallel. That is, given eight lists to merge, labeled ListA through ListH, it creates the following merge operations:
Merge A and B to create AB
Merge C and D to create CD
Merge E and F to create EF
Merge G and H to create GH
Merge AB and CD to create ABCD
Merge EF and GH to create EFGH
Merge ABCD and EFGH to create ABCDEFGH
It creates an inverted tree of merges that looks like this:
A   B   C   D   E   F   G   H
 \ /     \ /     \ /     \ /
  AB      CD      EF      GH
    \    /          \    /
     ABCD            EFGH
         \          /
          ABCDEFGH
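For illustration, here is a Python sketch of the same merging network, run eagerly round by round rather than with the concurrent lazy iterators the LINQ version uses; merge_two and pairwise_merge are hypothetical names, not the article’s actual code:

```python
def merge_two(a, b, key=lambda x: x):
    # Standard two-way merge of sorted lists; stable (ties favor a).
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if key(b[j]) < key(a[i]):
            out.append(b[j]); j += 1
        else:
            out.append(a[i]); i += 1
    out.extend(a[i:])
    out.extend(b[j:])
    return out

def pairwise_merge(lists, key=lambda x: x):
    # Merge adjacent pairs, round by round, until one list remains:
    # the inverted tree of merges described above.
    lists = [list(lst) for lst in lists]
    if not lists:
        return []
    while len(lists) > 1:
        merged = [merge_two(lists[i], lists[i + 1], key)
                  for i in range(0, len(lists) - 1, 2)]
        if len(lists) % 2:          # odd list count: last one gets a bye
            merged.append(lists[-1])
        lists = merged
    return lists[0]
```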
So I have seven different merges running concurrently. The first problem is that this requires twice as much memory as the heap-based merge. The heap-based merge requires memory proportional to the number of lists (N) times the number of keys (K): that is, N*K memory.
The pairwise merge has N-1 merges running in parallel, each of which requires (2 * K). So the pairwise merge’s memory requirement is (2 * (N-1) * K) for the keys, plus (N - 1) for the iterators. That’s not a huge deal, because the number of lists and the number of keys are both typically small. If we were merging millions of lists this might be a concern. Still, the additional memory is a drawback.
The bigger problem turned out to be structuring the code so that it worked and was understandable. If you’ve been following along, you know that implementing the IOrderedEnumerable interface is complicated enough already. Adding this complexity on top of it made things really convoluted. And by the time I got it all working, the pairwise merge just wasn’t consistently enough faster to justify the complexity. So I junked the idea. The final code uses the heap-based merge.
The MergeBy method is not an extension method. Rather, it’s a static method:

    public static IOrderedEnumerable<TSource> MergeBy<TSource, TKey>(
        IEnumerable<IEnumerable<TSource>> lists,
        Func<TSource, TKey> keySelector,
        IComparer<TKey> comparer = null);

There is also a MergeByDescending method, and you can use the ThenBy and ThenByDescending methods to add multiple merge keys.
I’ll publish the final code in the next posting. That will contain all of the merging code I’ve developed in this series.
One of the machines our software controls builds oligonucleotides by pushing various chemicals through a fine glass mesh. To do that, the machine dispenses the chemical into a common line and then pumps it to the destination. Logically, the machine looks like this:
    Neutral ---+
    A ---------+--- PUMP ---+---[output valve]---+--- Column
    B ---------+            |                    |
                            |                    +--- Waste
    Argon ---+              |
    S -------+--------------+
    O -------+
Flow is from left to right. The chemicals at the top are delivered by turning the pump. If we want to push 100 microliters of A through the column, we first open the Neutral valve and run the pump until the entire line through the column is filled with the Neutral solution. The software then closes the Neutral valve and switches the output valve to target the waste rather than the column. Next, the A valve is opened and the pump turns just long enough to inject 100 µl into the line. Finally, the output valve is switched back to the column, valve A is closed, Neutral is opened, and the pump turns to push the chemical through the column.
It’s a little more complicated than what I describe, of course, but that’s the gist of it. Pump delivery is relatively easy because we know the lengths (more importantly, the volumes) of each tube segment, and we know very precisely how much liquid is moved with each revolution of the pump. If we want 100 µl and the pump moves 20 µl per revolution, then we run the pump for five revolutions. The pump is operated by a stepper motor, so we can vary the pump speed by controlling the rate at which we issue step commands to the motor. All we have to do is tell the pump controller, “do X revolutions in Y seconds.” The controller takes care of it and reports back when it’s finished. If we want to deliver 20 µl in three seconds, we tell the pump to do one revolution in 3,000 milliseconds. The motor controller figures out how many microsteps that is, and sends the motor pulses as required.
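The arithmetic is simple enough to sketch. Here is a hedged Python illustration using the figures from the text; UL_PER_REV and the shape of the returned command are assumptions for illustration, not the real controller API:

```python
UL_PER_REV = 20.0  # microliters moved per pump revolution (from the text)

def pump_command(volume_ul, duration_ms):
    # Translate "deliver volume_ul over duration_ms" into the controller's
    # "do X revolutions, each taking Y milliseconds" form.
    revolutions = volume_ul / UL_PER_REV
    ms_per_revolution = duration_ms / revolutions
    return revolutions, ms_per_revolution
```

So 20 µl over three seconds comes out as one revolution in 3,000 ms, as described above.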
For reasons I won’t go into, some chemicals are better delivered by pushing with an inert gas (Argon, in this case). Each of the valves on the bottom is sourced from a bottle that’s under positive pressure. If the S valve is opened, liquid starts flowing, and continues to flow until the valve is closed. We know that at our standard operating pressure, the liquid flows at a rate of about one microliter every seven milliseconds. So if we want to inject 20 µl of S, we open the S valve for 140 milliseconds. We can then close the S valve and open the Argon valve long enough to push the chemical through the column.
The problem comes when we want to vary the speed of this pressure-driven delivery. With the pump, all we have to do is slow the pump speed. But we can’t vary the bottle pressure, so we have to toggle the chemical’s valve to simulate pumping. And that’s where things get kind of tricky.
Imagine, for example, that you want to deliver 20 µl of S smoothly over three seconds, rather than in the standard 140 ms. If we could switch the valve on and off quickly enough, that wouldn’t be too difficult. We already know that we can push 1/7 µl in a millisecond, so if we could toggle the valve every millisecond we could use 140 pulses of 1 ms each, with a 20 ms wait after each pulse. That would deliver the 20 µl in 2,940 ms.
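That pulse arithmetic can be sketched as follows. This is an idealized illustration that ignores valve switching time, and the names are mine, not the machine’s software:

```python
UL_PER_MS = 1.0 / 7.0  # pressure-driven flow: one microliter every 7 ms

def pulsed_delivery(volume_ul, pulse_ms=1, wait_ms=20):
    # Number of fixed-width valve pulses needed to deliver volume_ul,
    # and the total delivery time with a wait after each pulse.
    pulses = round(volume_ul / (UL_PER_MS * pulse_ms))
    total_ms = pulses * (pulse_ms + wait_ms)
    return pulses, total_ms
```

For 20 µl this gives 140 one-millisecond pulses and 2,940 ms total, matching the figures above.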
Unfortunately, those valves don’t turn on and off instantly. It takes about 25 milliseconds for a valve to open after it’s energized. And it takes another 25 milliseconds for the valve to close completely after we cut power to it. So we can’t switch the valve faster than once every 50 milliseconds. And during at least part of that time the valve is partially open. So some liquid flows before the 25 ms is up. We can measure that, of course, and we could calibrate each valve if we wanted to. In practice, that calibration isn’t all that helpful. We can do almost as well by taking an average of all the valves in the system.
If you’re wondering how long 50 milliseconds is, a typical eye blink is between 300 and 400 milliseconds. So, whereas you can blink your eyes maybe three times per second, I can toggle these valves on and off 20 times per second.
The other problem is that the computer controlling the valve doesn’t have infinite timing resolution. Wait precision is about five milliseconds: if I ask for a seven millisecond wait, I’ll typically get something between 7 and 12 ms.
All those uncertainties mean that if I open the valve for what I think is 7 ms to dispense 1 µl, it actually takes 57 ms, and I could end up with 2 µl flowing from the valve. This is what we end up calibrating. The tube volume from the farthest Argon-propelled valve to the waste is, say, 4,000 µl. We’ll clear the line (blow it out with Argon), and then start toggling the S valve in one-µl pulses until liquid flows from the waste. If it takes 2,500 pulses, then we know that one pulse dispenses an average of 1.6 µl. If it takes 5,000 pulses, we know that one pulse is, on average, 0.8 µl. That’s the number we’ll use in our calculations.
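The calibration arithmetic, as a small Python sketch (the function names are illustrative, not from the machine’s software):

```python
def calibrate_pulse_volume(line_volume_ul, pulses_until_flow):
    # Average microliters per pulse: known line volume divided by the
    # number of pulses it took to push liquid through to the waste.
    return line_volume_ul / pulses_until_flow

def pulses_for(volume_ul, ul_per_pulse):
    # Pulses needed for a requested volume, using the calibrated average.
    return round(volume_ul / ul_per_pulse)
```

With the numbers above, 2,500 pulses to fill a 4,000 µl line calibrates to 1.6 µl per pulse, and a request for 80 µl would then be dispensed as 50 pulses.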
Interfacing with the real world presents a lot of challenges, but it’s incredibly rewarding. I worked on this machine’s software for several months before I got to see one in operation; all my work had been against a simulator. It was a thrill to finally watch one of these machines respond to user input, knowing that my software had converted the user’s commands into the low-level commands that switch valves and pump liquid through the system. Controlling hardware like that is the most rewarding type of programming I’ve ever done. I did it years ago, and I’m re-discovering just how much fun it can be.