Simplifying code with anonymous methods

One of the easiest ways to avoid writing and having to maintain duplicate code is to use an anonymous method or anonymous function. For example, I recently ran across some code that was validating dates in a UI application, and reporting errors if the dates were out of range. The code looked something like this:

    void DoStuff(MyModel model)
    {
        if (model.StartDate < DateTime.UtcNow)
        {
            var dt = ConvertUtcToLocal(model.StartDate, model.TimeZone);
            AddErrorMessage("Start Date " + dt.ToString("M/d/yy h:mm tt") + " is out of range.");
        }
        if (model.EndDate < DateTime.UtcNow)
        {
            var dt = ConvertUtcToLocal(model.EndDate, model.TimeZone);
            AddErrorMessage("End Date " + dt.ToString("M/d/yy h:mm tt") + " is out of range.");
        }
        // six more similar date/time validations
    }

Our client asked that the format be changed. Which meant going through and changing the format in eight different places. In the process I found two bugs in the code, both obvious copy-paste errors.

With a little bit of thought, the programmer who wrote this (or the last programmer to modify it) could have simplified things quite a bit and in the process made errors much less likely to occur. How? By creating an anonymous method that does the validation and error message formatting:

    void DoStuff(MyModel model)
    {
        var validateAndReport = new Action<string, DateTime>((fieldName, dt) =>
        {
            if (dt < DateTime.UtcNow)
            {
                var localDate = ConvertUtcToLocak(dt, model.TimeZone);
                AddErrorMessage(fieldName + " " + localDate.ToString("M/d/yy h:mm tt") + " is out of range.");
            }
        });
        
        validateAndReport("Start Date", model.StartDate);
        validateAndReport("End Date", model.EndDate);
        // etc.
    } 

Programmers are often hesitant to add simple methods like this outside of their functions because they think it pollutes the namespace, and they often require passing a bunch of parameters for context. Anonymous methods solve both problems: they’re invisible outside of the containing method, and they have access to all of the context that the containing method has access to. With this technique I ensure that all of the dates are formatted correctly, and I can modify the formatting by changing just one line of code. In addition, there are many fewer lines of code and they’re much easier to understand.

The only drawback I see is that programmers have to learn something new: “We can’t use this Action thing! Other programmers won’t understand it!” (Yes, somebody did say that to me.) Tough. Things change. To me and most other programmers I know, the modified code is clearly superior to the old code in every way that matters. I realize that there are some performance implications and hidden memory allocations involved. A few small allocations and a few microseconds aren’t going to make a difference in this code, or any other code in the project.

C# 7.0 introduced local functions: the ability to create a function within a function. This is more than just syntactical sugar around the anonymous function shown above. The local function still can capture local variables, but it avoids the allocation and slight performance hit of the anonymous method. The code above, if written to use a local function, would look something like this:

    void DoStuff(MyModel model)
    {
        void validateAndReport(string fieldName, DateTime dt)
        {
            if (dt < DateTime.UtcNow)
            {
                var localDate = ConvertUtcToLocak(dt, model.TimeZone);
                AddErrorMessage(fieldName + " " + localDate.ToString("M/d/yy h:mm tt") + " is out of range.");
            }
        };
        
        validateAndReport("Start Date", model.StartDate);
        validateAndReport("End Date", model.EndDate);
        // etc.
    }

Unfortunately, I’m not yet using C# 7.0, so I can’t take advantage of this new feature. I’m also unable to find the official language specification for version 7. However, the article Thoughts on C# 7 Local Functions gives some good information about how it works.

Pickle crack

When I was young–10 years old or even younger–Mom used to buy this dill pickle dip. It was creamy, tasty stuff with chunks of dill pickle in it. It was my favorite dip for potato chips. It was a real treat whenever Mom would get some or, when it became unavailable, she would make some from a recipe she obtained somewhere.

When Debra and I were planning a pool party here shortly after we bought the house, I got to thinking about that dip and called Mom to see if she still had the recipe. She did, and I wrote it down. I think I’ve since lost that piece of paper, but the recipe is real simple. Debra and I make it from time to time. Some friends say it’s addictive and they’ve named it “Pickle crack.”

Here’s what you need:

  • 16 ounces of cream cheese
  • 16 ounces of sour cream
  • A 24 ounce jar of kosher dill pickles
  • Garlic powder (not garlic salt!)
  • Dill weed (fresh if you like, but the dried stuff works great)
  • A mixing bowl that will hold a bit more than a quart
  • A fork (for mixing)
  • A knife (for cutting pickles)
  • A bag of potato chips (for taste testing)

It’s easier if you leave the cream cheese out of the refrigerator for an hour or two before you start. That’ll soften it up. But it’s not necessary.

Dump the cream cheese into the mixing bowl and start smashing it with the fork. I suspect there’s some technical term other than “smashing.” What I’m doing is making the stuff more spreadable. Otherwise it tends to clump. When the cream cheese is nice and creamy, add about half of the sour cream and mix it in. Once that’s thoroughly mixed, add the other half of the sour cream and mix it in until you have a nice uniform mixture.

Put that aside, open the pickle jar, and grab your knife. Start dicing pickles. 1/4 inch (6 mm) chunks are good. You can go smaller if you like. Or larger, if that’s your thing. I like to stay around 1/4 inch, but this is definitely a matter of taste. Debra likes to use the onion chopper to dice the pickles. That works, but I kind of like cutting them with a knife.

In any event, chop up a bunch of pickles and mix them in with the sour cream and cream cheese. I like my dip really chunky, so I typically chop all the pickles in the jar and throw them in. Others might like a little less pickle.

If the mixture is too thick (again, it’s a matter of taste), you can pour some pickle juice from the jar into the mixture to thin it a bit. Be careful, though. This is dip for potato chips, not for a veggie tray. Although I have used it for that once or twice. Carrots dipped in pickle dip is pretty good.

If you like garlic, add some garlic powder. Start with just a bit. Mix it in. Taste it. Maybe add some more. Same with the dill. If you want to accentuate the dill taste, throw in a bunch of dill weed. I suggest you exercise restraint, though. Mix the stuff up real good and then set it in the refrigerator overnight for the flavors to permeate. Pull it out in the morning and mix it up. Taste it. Add more seasoning as you see fit.

This stuff is quick to make and people really do seem to like it. Give it a try if you’re looking for a new taste treat for your potato or other chips.

 

Inconsistent error reporting

The systems I maintain call several different partner APIs to get player account information. We noticed recently that one of the APIs returns seemingly inconsistent error messages when presented with invalid player account numbers. For example, if we ask for information on account number 123456789, the system returns a 404, meaning that the account was not found in the system. But if ask for account number 3420859494, the server returns a 500: internal server error. Very strange.

At first I thought it was that the account number was too long. Typically, the account numbers are eight digits, and the ones giving us errors were 10 digits in length. But when I requested account number 1420859494 (10 digits), I got a 404.

On a hunch, I tried the magic number 2147483647. Not surprisingly, that returned a 404. But 2147483648 returned a 500 error. What the heck?

The magic of 2147483647 is that it’s the largest signed integer that can be expressed in 32 bits. It appears that the API tries to convert the input to a 32 bit integer. If that conversion fails, the server throws an exception and returns a 500 internal server error.

This is the kind of error that unit testing is designed to ferret out. Had they tested the edge cases, they would have seen this inconsistent behavior and, one would hope, modified the code to handle it more reasonably.

I don’t know what to do about this. Their documentation doesn’t say specifically that the input must be a positive 32-bit integer, only that the value must be numeric. Based on the empirical evidence, I could put some validation on my end to ensure that users don’t enter account numbers larger than 2147483647, but there’s nothing preventing the API from changing underneath me and allowing larger numbers in the future. It’s unlikely that those in charge of the partner API would notify me of such a change.

North Dakota Mexican food

Debra and I tried a new place for dinner the other night. When she asked me how my meal was I said, “North Dakota Mexican food.” She laughed and nodded. Considering that we were at a Vietnamese restaurant, that exchange probably deserves some explanation.

We flew to North Dakota back in 1992 to attend my grandparents’ 65th wedding anniversary celebration. When we arrived in Bismark, we learned that our bags didn’t make the transfer in Denver. The airline assured us that the bags would be on the next flight. Bismark wasn’t a thriving aviation hub at the time (not sure what it’s like now), so we had a few hours to kill.

In retrospect we should have known better, but we were hungry and frustrated. We saw a restaurant advertising Mexican food, and the parking lot was reasonably full. “How bad can it be?”

As it turns out, pretty bad. Worse than Taco Bell. It’s not that it tasted bad, but that it hardly tasted at all! It looked like Mexican food, but it tasted like … pretty much nothing. The ground beef was bland. The lettuce and tomatoes were standard tasteless restaurant fare. The cheese … well, there was lots of cheese. In fact, everything was covered in cheese. Like their version of Mexican food was a bunch of bland stuff covered in cheese. And the salsa? Pretty much just ground up stewed tomatoes with a little green onion thrown in. Corn chips were the cheapest, thinnest … I think you get the picture.

We had a similar experience about 10 years ago. We went to a little Mexican food place that several friends had recommended. The place was clean and bright and tastefully decorated, the service was good, and their food was indistinguishable from what we encountered in Bismark, North Dakota back in 1992. Except for the salsa. The salsa wasn’t good, but at least it had a little bite to it. With all the good Mexican food available in the Austin area, I honestly cannot understand how that place stays in business. Maybe there’s a bunch of North Dakota natives living nearby?

“North Dakota Mexican food” is our way of describing something that looks like what it’s trying to imitate, but has no taste or tastes completely wrong. I suppose the term could be used to describe things other than food, but I haven’t used it in that context.

Hacking the slot machine

Wired reports that a Russian group designed a brilliant slot machine cheat that they’ve used to bilk casinos out of millions of dollars. The article is sketchy on technical details, but there’s enough information there for me to speculate how it was done.

Understand, I don’t know anything about the internals of the software running on these machines, but I know enough about pseudorandom number generators, their use and misuse, to offer a plausible explanation of the vulnerability and how it’s exploited. I also know a few people in the industry. What I describe below is possible, and from my experience quite likely to have happened. Whether it’s exactly what happened, or if it’s even close to the mark, I have no way of knowing.

First, a little background.

In modern computer controlled slot machines (say, anything built in the last 30 years), the machine uses random numbers to determine the results of a spin. In concept, this is like rolling dice, but the number of possibilities is huge: on the order of about four billion. In theory, every one of those four billion outcomes is equally likely every time you roll the dice. That would be true in practice if the computer were using truely random numbers.

But computers are deterministic; they don’t do random. Instead, they use algorithms that simulate randomness. As a group, these algorithms are called pseudorandom number generators, or PRNGs. You can probably guess that PRNGs differ in how well they simulate true randomness. They also differ in ease of implementation, speed, and something called “period.” You see, a PRNG is just a mathematical way to define a deterministic, finite sequence. Given a starting state (called the seed), the PRNG will generate a finite set of values before it “wraps around” to beginning and starts generating the same sequence all over again. Period is the number of values generated before wrapping. If you know what PRNG is being used and you know the initial state (seed), then you know the sequence of numbers that will be generated.

The machines in question were purchased on the secondary market sometime before 2009. It’s probably safe to say that the machines were manufactured sometime between 1995 and 2005. During that era, the machines were almost certainly running 32 bit processors, and likely were generating 32 bit random numbers using PRNGs that maintained 32 bits of state. That means that there are 2^32 (four billion and change) possible starting states, each of which has a maximum period of 2^32. Overall, there are 2^64 possible states for the machine to be in. That’s a huge number, but it’s possible to compute and store every possible sequence so that, if somebody gave you a few dozen random numbers in sequence, you could find out where they came from and predict the next number. It’d take a few days and a few terabytes of disk storage to pre-compute all that, but you’d only have to do it once.

It’s likely that the PRNG used in these machines is a linear congruential generator which, although potentially good if implemented well, is easy to reverse-engineer. That is, given a relatively small sequence of generated numbers, it’s possible to compute the seed value and predict the next values in the sequence. All this can be done without knowing exactly which LCG algorithm is being used.

The hackers did have the source code of the program (or they disassembled the ROM), but they didn’t have access to the raw numbers as they were generated. Instead, they had to deduce the random number based on the outcome of the spin. But again, that just takes a little computation (okay, more than just a little, but not too much) time to create a table that maps reel positions to the random sequence.

My understanding is that slot machines continually generate random numbers on a schedule, even when the machine is idle. Every few milliseconds, a new number is generated. If it’s not used, then it’s discarded and the next number is generated. So if you know where you are in the sequence at a given time, then you can predict the number that will be generated at any time in the future. Assuming, of course, that your clock is synchronized with the slot machine’s clock.

If you refer back to the article, you’ll see that the agent who was working the machine would record the results of several spins, then walk away and consult his phone for a while before coming back to play. That phone consultation was almost certainly uploading the recorded information to the central site, which would crunch the numbers to determine where the machine was in the random sequence. The system knows which numbers in the sequence correspond to high payouts, so it can tell the phone app when to expect them. The agent then goes back to the machine and watches his phone while hovering his finger over the spin button. When the phone says spin, he hits the button.

The system isn’t perfect. With perhaps up to 200 random numbers being generated every second, and human reaction time being somewhat variable, no player will hit the big payout every time. But he’s increased his odds tremendously. Imagine somebody throwing one gold coin into a bucket of a million other coins, and another gold coin into a bucket of 200 other coins. You’re given the choice to blindly choose from one of the two buckets. Which would you choose from?

That might all sound complicated, but it’s really pretty simple in concept. All they did was create a map of the possibilities and devise a way to locate themselves on the map. Once you know where you are on the map, then the rest is a simple matter of counting your steps. Creating the map and the location algorithm likely took some doing, but it’s very simple in concept.

The above explanation is overly broad, I’ll admit, and I wave my hand over a number of technical details, but people at work with whom I’ve discussed this generally agree that this is, at least in broad strokes, how the hackers did it. Understand, I work at a company that develops slot machine games for mobile devices, and several of the programmers here used to work for companies that make real slot machines. They know how these machines work.

When I originally saw the article, I assumed that some programmer had made a mistake in coding or using the PRNG. But after thinking about it more critically, I believe that these machines are representative of the state of the art in that era (1995-2005). I don’t think there was a design or implementation failure here. The only failure would be that of not imagining that in a few years it would be possible for somebody who didn’t have a supercomputer to compute the current machine state in a few minutes and exploit that knowledge. This isn’t a story about incompetence on the part of the game programmers, but rather a story about the cleverness of the crooks who pulled it off. I can admire the technical prowess it took to achieve the hack while still condemning the act itself and the people who perpetrated it.

Welcome to the cesspool

Today President Trump signed an executive order titled REDUCING REGULATION AND CONTROLLING REGULATORY COSTS. This fulfills a campaign pledge to reduce burdensome regulation. On the face of it, I applaud the measure, especially the provision that says, “whenever an executive department or agency publicly proposes for notice and comment or otherwise promulgates a new regulation, it shall identify at least two existing regulations to be repealed.”

I suspect that the net effect of this order will be approximately zero, as far as business is concerned. In the first place, there are so many exceptions listed, it’s likely that any department head with a modicum of intelligence will get his proposals exempted from the new rules. And in the unlikely event that they do have to identify regulations to be repealed, they have a plethora of idiotic rules that are on the books and no longer enforced. It’ll take them years to clear that cruft from the books. So at best what we’ll see is large numbers of unenforced or unenforceable regulations being repealed. A Good Thing, no doubt, but not something that businesses will notice.

The Director of the Office of Management and Budget is tasked with specifying

“… processes for standardizing the measurement and estimation of regulatory costs; standards for determining what qualifies as new and offsetting regulations; standards for determining the costs of existing regulations that are considered for elimination; processes for accounting for costs in different fiscal years; methods to oversee the issuance of rules with costs offset by savings at different times or different agencies; and emergencies and other circumstances that might justify individual waivers of the requirements …”

It looks to me like the president’s order creates more regulations and more work, which probably will require increased expenses. I wonder if his order is subject to the new 1-for-2 rule.

I mentioned exceptions above. The order exempts:

  • regulations issued with respect to a military, national security, or foreign affairs function of the United States
  • regulations related to agency organization, management, or personnel
  • any other category of regulations exempted by the Director (of the OMB)

A savvy department head can probably make a good argument that any new regulation fits one of the first two. Failing that, being “in” with the Director of OMB will likely get you a pass.

Also, Section 5 states that the order will not “impair or otherwise affect”

  • the authority granted by law to an executive department or agency, or the head thereof
  • the functions of the Director relating to budgetary, administrative, or legislative proposals

Oh, and the last part, Section 5(c) says:

“This order is not intended to, and does not, create any right or benefit, substantive or procedural, enforceable at law or in equity by any party against the United States, its departments, agencies, or entities, its officers, employees, or agents, or any other person.”

In other words, this isn’t Law, but rather the president’s instructions to his subordinates.

The dog can’t bite; it can hardly growl. But the president can say that he “did something about the problem,” and thus get marks for keeping a campaign pledge.

So much for draining the swamp. This is the way things have been done in Washington for decades. Make a big deal signing a regulation with a feel-good title, but that does nothing (or, worse, does exactly the opposite of what you would expect), bask in the praise of your supporters, and then go about business as usual.

Welcome to the cesspool, Mr. President.

Tales from the trenches: It’s just a warning

Back in the bad ol’ days when we were writing Windows programs in C and directly calling the Windows API, the compiler would produce many warnings about one’s code. Conventional wisdom was to ignore the warnings. “If it was a real problem, the compiler would have issued an error.”

The idea behind compiler warnings was that they pointed out potential programming errors. For example, due to integer overflow, a conversion from unsigned integer to signed integer could be a problem if the unsigned int were large enough. A single compile might produce thousands of warnings, most of which weren’t actual problems. As a result, compiler warnings were treated like The Boy Who Cried Wolf.

But not all warnings are alike, and even some of those unsigned-to-signed warnings really were important. One day I realized that the compiler had produced warnings for some of the bugs that I’d been encountering, and I began writing my code to avoid the warnings. Surprisingly enough, I began to see fewer bugs in my programs.

Some time after I’d had my epiphany, I got a contract to rescue a failing project. The program was “feature complete,” but incredibly unstable. There was a huge number of bugs, and the project was falling further behind schedule. The team had experienced a lot of turnover, including the original lead and several of the senior developers, and they couldn’t give a reliable estimate of when they’d have the program working.

The first day I was there, I got my development system set up and compiled the program. As expected, there were over 1,000 compiler warnings. I spent a few hours looking through the code in question and found a half dozen warnings that were pointing out real problems with the code, and proved that modifying the code to eliminate the warnings resolved some reported bugs. The next day I met with the development team.

If you ever wrote Windows programs back then, you can imagine the response to my proclamation: “The first thing we’re going to do is eliminate all of the compiler warnings.”

“But there’s more than 1,000 warnings!”

“That will take us forever!”

And, the one I was waiting for: “They’re just warnings. If it was important the compiler would have issued an error.”

At that point, I trotted out my half dozen examples of warnings that pointed to actual program bugs. That got everybody’s attention.

It took us a week to eliminate those thousand-plus compiler warnings. In the process, we resolved dozens of reported bugs that were directly attributable to the warnings. By the end of the week the development team was in good spirits and the project manager thought I was a miracle worker.

The following Monday I met with the development team and said, “Now we’re going to Warning Level four.”

And I thought the previous week’s protests were bad! I had compiled the program on Level 4 and got over 2,000 warnings. Many of those truly are spurious: unreachable code, unused parameters and local variables, etc. But some of those warnings are important. For example, the compiler would issue a warning if the code used a variable before it was initialized. I had found a bug in the program that was caused by referencing an uninitialized pointer, but that warning only occurs on Level 4.

It took almost two weeks, and the development manager had to “counsel” one programmer who, rather than fixing the offending code, was instructing the compiler not to issue the warnings.

Not surprisingly, we resolved another few dozen bugs in the process and the code was a lot cleaner, too. A month after I started, we had a solid schedule to fix the remaining issues, and the product shipped six weeks later. The development team was happy and the manager thought I was a miracle worker. My contract had been for three months (13 weeks) but they let me go after 10 weeks, with an extra week’s pay as a bonus. To top things off, the manager set me up with a friend of his who also had a troubled project.

It was a good gig for a while, getting paid for making people pay attention to what their compilers were telling them. Good pay, easy work, and people thinking I was some kind of genius. It’s a weird world we live in.

The moral of the story, of course, is that you shouldn’t ignore compiler warnings. A more important lesson, which the C# team took to heart, is that compiler warnings are ambiguous. The C# language specification is much more rigorous than is the C language specification, and most things that were “just warnings” in C are definite errors in C#. For example, referencing an uninitialized variable is an error. There are fewer than 30 compiler warnings in the latest version (Visual Studio 2015) of the C# compiler. Contrast that to the hundreds of different warnings in a C or C++ compiler.

In my work, I compile with “Warnings as errors.” If the compiler thinks something is important enough to warn me about it, then I’m going to fix the code so that the compiler is happy with it. It’s usually an easy fix. I almost never have to disable the warning altogether.

Subtraction is not the same as comparison

I tracked down a bug the other day that really surprised me. Again, not because of the code itself, but rather that the person who wrote the code should have known better.

If you want to sort a list of a user-defined type in C# (and other languages), you need a comparison method. There are several different ways to do that, perhaps the simplest being to create a comparison delegate. For example, if you have this class definition:

    public class Foo
    {
        public int Id {get; private set;}
        public string Name {get; set;}
        // constructor and other stuff here . . .
    }

And a list of those:

    List<Foo> MyList = GetFooThings(...);

Now, imagine you want to sort the list in place by Id. You can write:
    MyList.Sort((x,y) => { return x.Id.CompareTo(y.Id); });

CompareTo will return a value that is:

  • Less than 0 if x.Id < y.Id
  • Equal to 0 if x.Id = y.Id
  • Greater than 0 if x.Id > y.Id

The code I ran into was similar, except the programmer thought he’d “optimize” the function. Rather than incurring the overhead of calling Int32.CompareTo, he took a shortcut:

    MyList.Sort((x,y)) => { return x.Id - y.Id; });

His thinking here is that the result of subtracting the second argument from the first will contain the correct value. And that’s true most of the time. It’s not at all surprising that his testing passed. After all, 12 - 3 returns a positive number, 300 - 123456 returns a negative number, and 99 - 99 returns 0.

But he forgot about negative numbers and integer overflow. Subtracting a negative number is like adding a positive number. That is, 4 - (-1) = 5. And in the artificial world of computers, there are limits to what we can express. The largest 32-bit signed integer we can express is Int32.MaxValue, or 2,147,483,647. If you add 1 to that number, the result is Int.MinValue, or -2,147,483,648. So the result returned by the comparison delegate above, when x is a very large number and y is a sufficiently small negative number is incorrect. It will report, for example, that 2,147,483,647 is smaller than -1!

There’s a reason that Int32.CompareTo is written the way it is. Here it is, directly from Microsoft’s Reference Source:

    public int CompareTo(int value) {
        // Need to use compare because subtraction will wrap
        // to positive for very large neg numbers, etc.
        if (m_value < value) return -1;
        if (m_value > value) return 1;
        return 0;
    }

It seems like I run into these fundamental errors more often now than I did in the past. I’m yet sure of the reason. I’ve been active in the online programming community for 30 years, and have seen a lot of code written by programmers of all skill levels. Early on, the average skill level of the programmers I met online was much higher than it is today. These days, every Java programming student has access to Stack Overflow, and I see a lot of these rookie mistakes. But I also see many such mistakes made by more experienced programmers.

I’m wondering if the difference is one of education. When I started programming, we learned assembly language early on and actually used it to write real programs. Two’s complement integer representation was a lesson we learned early, and integer overflow was something we dealt with on a daily basis. We would not have made this mistake. No, we made much more interesting mistakes.

This error can be caught, by the way. If you enable runtime overflow checking in C#, the runtime will throw an exception if the result of an arithmetic operation results in overflow or underflow. But that check is almost always turned off because it can have a large negative impact on performance, and in many places we depend on integer overflow for our algorithms to work correctly. It’s possible to disable overflow checking for a specific expression or block of code, but in my experience that functionality is rarely used. You can also disable the check globally but enable it for specific expressions or blocks of code, but doing so requires that you understand the issue. And if a programmer understood the issue, he wouldn’t have written the erroneous code in the first place.

Errors like this can exist in a working program for years, and then crop up at the most inopportune time when somebody uses the code in a new way or presents data that the original programmer didn’t envision. As much as our languages and tools have evolved over the past 30 years, in some ways we’re still distressingly close to the metal.

Pairing heap representation

Up to this point, I’ve been showing paring heap nodes as having a child list. Conceptually, the node structure looks like this:

    class HeapNode
        public int Data;
        public HeapNode[] Children;
    }

That’s a good conceptual model, but implementing that model can be unwieldy, and can consume a huge amount of memory. When working with small objects in managed languages like .NET, memory allocation overhead can easily exceed the memory used for data. That’s especially true of arrays, which have a 56-byte allocation overhead.

Granted, not all nodes in a pairing heap have children. In fact, at most half of them will. So we can save memory by not allocating an array for the children if there aren’t any. But that adds some complexity to the code, and at best saves only half of the allocation overhead.

Using the .NET List<T> collection doesn’t help, because List<T> uses an array as the backing store. LinkedList<T> will work, but involves its own overhead what with having to manage LinkedListNode instances.

In short, managing a per-node list of children can be difficult.

In my introduction to the Pairing heap, I showed this figure:

    2
    |
    8, 7, 3
    |     |
    9     4, 5
          |
          6

2 is the root of the tree. It has child nodes 8, 7, and 3. 9 is a child of 8. 4 and 5 are children of node 3, and 6 is a child of 4.

More traditionally, that tree would be drawn as:

                   2
                 / | \
                8  7  3
               /     / \
              9     4   5
                   /
                  6

That’s the traditional view of the tree. But we can also say that 8 is the first child of 2, 7 is the sibling of 8, and 3 is the sibling of 7. That is, every node has a reference to its first child, and to its next sibling. The node structure, then, is:

    class HeapNode
        public int Data;
        public HeapNode FirstChild;
        public HeapNode Sibling;
    }

As it turns out, there’s a well known binary tree structure called Left-child-right-sibling. Any traditional tree structure can be represented as such a binary tree. Our tree above, when represented as a left-child-right-sibling binary tree, becomes:

                 2
                /
               8
              / \
             9   7
                  \
                   3
                  /
                 4
                / \
               6   5

You might notice that this structure bears striking similarity to the original figure from my introduction. It’s the same thing, only rotated 45 degrees clockwise.

As you can see, this builds a very unbalanced binary tree. That’s okay, since we’re not searching it. With pairing heap, all of the action is at the first few tree levels of the tree. A deep tree is good, because it means that we rarely have many siblings to examine when deleting the smallest node.

Implementing a Pairing heap in a left-child-right-sibling binary tree is incredibly easy. Next time I’ll show a simple implementation in C#.

But that’s the way we’ve always done it!

It’s surprising how many Computer Science articles and textbooks, when showing how to implement a binary heap in an array, present code that has the root node at index 1 in C-like languages where the first element of an array is at index 0. In those implementations, element 0 in the array is unused.

This bothers me, not because the code wastes an insignificant amount of memory, but because it leads to a lot of confusion among students and junior programmers who are trying to implement a binary heap. Off by one errors abound, the first being in allocating the heap array. Here’s a common error that occurs when allocating a heap to hold 100 integers.

    int[] heap = new int[100];

We’re conditioned from the very start of our programming education to begin counting at 0. Every time we have stuff in an array or list, the first thing is at index 0. Every array-based algorithm we work with reinforces this lesson. We’re taught that some languages used to start at 1, but those heretical languages have nearly been eliminated from the world. And then they foist this folly on us: a heap’s root is at index 1.

We’re taught that 100 items means indexes 0 through 99. It’s beat into our heads on a daily basis when we’re learning programming. Then they trot out this one exception, where we have to allocate an extra item and count from 1 to 100 rather than 0 to 99 like normal programmers do.

Some people say, “but if you start at 0, then the calculations to find the children won’t work.” Well, they’re partially right. If the root is at index 1, then the left child of the node at index x is at index (x * 2), and the right child is at index (x * 2) + 1. The parent of the node at index x is at index x/2. They’re right in that those calculations don’t work if you move the root to index 0. But the changes required to make the calculations work are trivial.

If the root is at index 0, then the left child is at (2 * x) + 1, right child at (2 * x) + 2, and parent at (x-1)/2.

The hard core optimizer will tell you that the extra addition in computing the left child, and the extra subtraction when computing the parent will incur a big performance hit. In truth, in the context of a real, working, computer program, the performance difference is down in the noise. It’s highly unlikely that your program’s overall performance will suffer from the addition of a couple of ADD or SUB instructions in a binary heap implementation. If it does, then you’re doing something horribly wrong. Doing something stupid in order to save a few nanoseconds here and there is … well … stupid.

No, I think the real reason we continue this madness is historical. The first known reference to a binary heap is in J.W.J. Williams’ implementation of Heapsort (Williams, J.W.J. (1964), ‘Algorithm 232: Heapsort’, Communications of the ACM 7 (6), 347-348). Yes, that’s right: 52 years ago. His code sample, like all ACM code samples at the time, was written in Algol, a language in which array indexes start at 1.

Textbooks with code samples in Algol and, later, Pascal, perpetuated the idea that the root of a binary heap must be at index 1. It made sense, what with 1 being the first index in the array. When those books were revised in the 1990s and the examples presented in C, Java, C++, etc., a literal conversion of the code maintained the 1-based root even though arrays in those languages are 0-based. Nobody stopped to consider how confusing that empty first element can be to the uninitiated.

What I find disturbing is that whoever did the code conversions almost certainly ran into an off by one problem when testing the examples. But rather than spend time to rewrite the code, they “fixed” the problem by allocating an extra item, and maybe explained it away in the text as something that just has to be done to keep the root at index 1. Those few who actually understood the issue seem to have been too lazy to make the correction, opting instead to explain it away as a performance issue.

In so doing, they’ve done their audiences a grave disservice. I find that inexcusable.

An interesting interview question

I ran across a problem today that I think would make an excellent interview question.

You work in a warehouse that has N storage compartments, labeled D-1 through D-N. Each storage compartment can hold one shipping crate. There’s also a temporary holding area, called D-T, that can hold a single crate. There also are N crates, labeled C-1 through C-N, each of which is in a storage compartment.

There is a forklift that can pick up a single crate, move it to an empty position, and put it down.

The owner, after touring the facility, decreed that the crates must be rearranged so that crate C-x is in space D-x (C-1 in D-1, C-2 in D-2, … C-N in D-N).

What algorithm would you employ to rearrange the crates in the minimum number of moves?

This is a fairly simple problem. If you pose it as a “balls and bins” problem, with a physical model to play with, most people will develop the optimum algorithm in a matter of minutes. But pose it as a programming problem and a lot of programmers have trouble coming up with the same algorithm.

If you’re interested in trying the problem yourself, cut five squares (crates) from a piece of paper, and label them C1 through C5. Draw six boxes on another sheet of paper, and label them D1 through D6. Now, arrange the crates on the spaces in this order: C4, C1, C5, C3, C2. Leave D6 empty. Now, rearrange them. Remember, if you pick up a card you have to place it in an empty spot before you can pick up another card.

So if this problem is so simple, why is it such an interesting interview question and why do programmers have trouble with it?

One reason is that it’s not really a sorting problem. At least not in the traditional sense. You see, I’m not asking the candidate to write a sort. In fact, I’m not asking him to write a program at all. I’m asking him to develop an algorithm that will rearrange the crates in a specific order, given the constraints. The candidate’s first reaction tells me a lot about how he approaches a problem. Granted, I have to take into account that the candidate will expect a whiteboard programming problem, but if his immediate reaction is to stand up and start writing some kind of sort method, I can pretty much guarantee that he’s going down the wrong track.

The algorithm that uses the fewest moves is not one that you would typically write. It’s either computationally expensive, or it uses additional extra memory. And there’s a well known efficient solution to sorting sequential items. That algorithm works well in the computer, but is less than optimum when translated to the physical world where it takes time for a forklift to pick up an item and move it.

The well known algorithm starts at the first storage compartment, D-1. If the item in that position is not crate C-1, it swaps the crate with the crate in the position that it needs to be in. It continues making swaps until C-1 is in D-1, and then moves on to compartment D-2 and does the same thing. So, for example, if you have five crates that are arranged in the order [4,1,5,3,2,x] (the x is for the empty spot, initially D-T) the sequence of swaps is:

    Swap 4 with 3 (the item that's in position D-4), giving [3,1,5,4,2,x]
    Swap 3 with 5, giving [5,1,3,4,2,x]
    Swap 5 with 2, giving [2,1,3,4,5,x]
    Swap 2 with 1, giving [1,2,3,4,5,x]

At which point the programmer says, “Done!”

The programmer who develops this solution and calls it done might think again if he re-reads the problem statement. Note that nowhere did the algorithm use the temporary location, D-T. Either the programmer will smell a rat, or he’ll dismiss that temporary location as a red herring.

As it turns out, it’s not a red herring. A swap is actually a sequence of three separate operations. Swapping the contents of D-1 and D-4, for example, results in:

  1. Move the contents of D-1 to a temporary location.
  2. Move the contents of D-4 to D-1.
  3. Move the contents of the temporary location to D-1.

The well known solution to the problem–the solution that is in many ways optimum on the computer–results in 12 individual moves:

  1. Move crate C-4 from D-1 to D-T.
  2. Move crate C-3 from D-4 to D-1.
  3. Move crate C-4 from D-T to D-4. (Crate C-4 is in position.)
  4. Move crate C-3 from D-1 to D-T.
  5. Move crate C-5 from D-3 to D-1.
  6. Move crate C-3 from D-T to D-3. (Crate C-3 is in position.)
  7. Move crate C-5 from D-1 to D-T.
  8. Move crate C-2 from D-5 to D-1.
  9. Move crate C-5 from D-T to D-5. (Crate C-5 is in position.)
  10. Move crate C-2 from D-1 to D-T.
  11. Move crate C-1 from D-2 to D-1. (Crate C-1 is in position.)
  12. Move crate C-2 from D-T to D-2. (Crate C-2 is in position.)

In the worst case, of which this is an example, the algorithm makes N-1 swaps, or 3N-3 moves.

At this point, it might be useful to have a discussion about what problem we’re really trying to solve. The problem asked for the minimum number of moves. A reminder that we’re talking about a physical process might be in order. To programmers who are used to things happening almost instantly, a few extra moves here and there don’t really make a difference. The solution presented above asymptotically optimum in that the number of moves required is linearly proportional to the number of crates. That’s typically a good enough answer for a programming question. But, again, this isn’t really a programming question. One can assume that the client wants to make the minimum number of moves because he has to pay a forklift operator $15 per hour and it takes an average of five minutes for every move. So if he can reduce the number of moves from 12 to 6, he saves $30.

The solution that most people presented with the balls and bins problem develop does things a bit differently. Rather than starting by moving crate C-4 to its rightful place, the algorithm starts by emptying the first bin (i.e. moving C-4 to D-T), and then filling the empty location with the item that belongs there. Then, it fills the new empty slot with the element that belongs there, etc. Let’s see how it works, again starting with [4,1,5,3,2,x].

  1. Move crate C-4 from D-1 to D-T.
  2. Move crate C-1 from D-2 to D-1. (Crate C-1 is in position.)
  3. Move crate C-2 from D-5 to D-2. (Crate C-2 is in position.)
  4. Move crate C-5 from D-3 to D-5. (Crate C-5 is in position.)
  5. Move crate C-3 from D-4 to D-3. (Crate C-3 is in position.)
  6. Move crate C-4 from D-T to D-4. (crate C-4 is in position.)

That took six moves, or half of what the “optimum” solution took. That’s not quite a fair comparison, though, because that’s not the worst case for this algorithm. The worst case occurs when moves result in a cycle that leaves D-T empty before all of the items are in order. Consider the sequence [2,1,5,3,4,x]. Using our new algorithm, we start with:

  1. Move crate C-2 from D-1 to D-T.
  2. Move crate C-1 from D-2 to D-1. (Crate C-1 is in position.)
  3. Move crate C-2 from D-T to D-2. (Crate C-2 is in position.)
    At this point, our temporary spot is empty, so we need to make a new empty spot by moving the first out of place crate to D-T.
  4. Move crate C-5 from D-3 to D-T.
  5. Move crate C-3 from D-4 to D-3. (Crate C-3 is in position.)
  6. Move crate C-4 from D-5 to D-4. (Crate C-4 is in position.)
  7. Move crate C-5 from D-T to D-5. (Crate C-5 is in position.)

It’s interesting to note that the swapping algorithm requires three swaps (nine moves) to rearrange items given this initial arrangement.

As far as I’ve been able to determine, the worst case for this algorithm requires 3N/2 moves. It will never make more moves than the swapping algorithm, and often makes fewer.

The point of posing the problem to the candidate is to get him to develop an approach to the problem before he starts coding.

Another interesting thing about this question is that it allows for a follow-on question. Assuming that the candidate develops the optimum algorithm, ask him to write code that, given the starting position, outputs the moves required to rearrange the crates. In other words, implement the algorithm in code.

At this point, he’ll have to make a decision: use sequential search to locate specific crates, or create an index of some kind to hold the original layout so he can quickly look up an individual crate’s location. Again, he should be asking questions because the requirements are deliberately ambiguous. In particular, he might think that he can’t use the index because we only allowed him one extra swapping space for the crates. But again, the original problem statement doesn’t mention a computer program, and the follow-on question doesn’t place any restrictions on the memory consumption.

I’ll leave implementation of the algorithm as an exercise for those who are interested. It’s not difficult, just a bit odd for those who are used to writing traditional sorting algorithms.

Ultimately, the reason I find this problem a potentially useful interview question is because it forces the candidate to differentiate between the customer’s requirements (the optimum sequence of steps required to arrange his crates), and then internal implementation details. The extent to which the candidate can make that differentiation tells me a lot about how she will approach the real-world problems that we’ll present to her on a daily basis. In addition, the simple problem is rife with opportunities to have other discussions about approaches to problems and common coding tradeoffs.

It occurred to me as I was writing this that I should construct a balls and bins prop that I can trot out if the candidate doesn’t trip to the algorithm relatively quickly. It would be interesting to see how quickly the candidate, if given the prop, would discover the optimum algorithm.

I’m kind of excited about getting an opportunity to try this out.

Null is evil, Chapter 9346

If we’ve learned anything from 45 years of C coding, it’s that NULL is evil. At least relying on NULL return values is usually evil, especially when it forces you to write contorted special case handling code. I ran across a case in point today that surprised me because the programmer who wrote it should have known better. And for sure the senior programmer who reviewed the code should have flagged it for rewriting.

There is a method that takes an existing object and updates it based on fields in another object. The code writes a change log entry for each field updated. The code is similar to this:


    private bool UpdatePerson(Person person, Models.Person personUpdate)
    {
        var changes = new List<ChangeLogEntry>();
        
        changes.Add(CompareValues(person.FirstName, personUpdate.FirstName, "First Name"));
        changes.Add(CompareValues(person.LastName, personUpdate.LastName, "Last Name"));
        // ... other fields here
        
        changes = changes.Where(x => x != null).ToList();  // 

This is Just Wrong. It works, but it’s wonky. The code is adding null values to a list, knowing that later they’ll have to be removed. Wouldn’t it make more sense not to insert the null values in the first place?

One way to fix the code would be to write a bunch of conditional statements:


    ChangeLogEntry temp;
    temp = CompareValues(person.FirstName, personUpdate.FirstName, "First Name");
    if (temp != null) changes.Add(temp);
    
    temp = CompareValues(person.LastName, personUpdate.LastName, "Last Name");
    if (temp != null) changes.Add(temp);
    
    // etc., etc.

That’s kind of ugly, but at least it doesn’t write null values to the list.

The real problem is the CompareValues function:


    private ChangeLogEntry CompareValues(string oldValue, string newValue, string columnName)
    {
        if (oldValue != newValue)
        {
            var change = new ChangeLogEntry(columnName, oldValue, newValue);

            return change;
        }

        return null;
    }

Yes, it’s poorly named. The method compares the values and, if the values are different, returns a ChangeLogEntry instance. If the values are identical, it returns null. Perhaps that’s the worst thing of all: returning null to indicate that the items are the same. This model of returning null to indicate status is a really bad idea. It forces clients to do wonky things. Of all the horrible practices enshrined in 40+ years of C-style coding, this one is probably the worst. I’ve seen and had to fix (and, yes, written) too much terrible old-style C code to ever advocate this design pattern. null is evil. Avoid it when you can.

And you usually can. In this case it’s easy enough. First, let’s eliminate that CompareValues function:


    if (person.FirstName != personUpdate.FirstName) changes.Add(new ChangeLogEntry(person.FirstName, personUpdate.FirstName, "First Name"));
    if (person.LastName != personUpdate.LastName) changes.Add(new ChangeLogEntry(person.LastName, personUpdate.LastName, "Last Name"));

That’s still kind of ugly, but it eliminates the null values altogether. And we can create an anonymous method that does the comparison and update, thereby removing duplicated code. The result is:


    private void UpdatePerson(Person person, Models.Person personUpdate)
    {
        var changes = new List<ChangeLogEntry>();
        
        // Anonymous method logs changes.
        var logIfChanged = new Action<string, string, string>((oldValue, newValue, columnName) =>
        {
            if (oldValue != newValue)
            {
                changes.Add(new ChangeLogEntry(columnName, oldValue, newValue));
            }
        }
        
        logIfChanged(person.FirstName, personUpdate.FirstName, "First Name");
        logIfChanged(person.LastName, personUpdate.LastName, "Last Name");
        // ... other fields here

        if (changes.Any())
        {
            // update record and write change log
        }
    }

That looks a lot cleaner to me because there are no null values anywhere, and no need to clean trash out of the list.

Here’s the kicker: I didn’t rewrite the code. If the code were as simple as what I presented here, I probably would have changed it. But the code is a bit more complex and it’s currently running in a production environment. I hesitate to change working code, even when its ugliness causes me great mental pain. I’m almost certain that I can make the required changes without negatively affecting the system, but I’ve been burned by that “almost” too many times with this code base. If I had a unit test for this code, I’d change it and depend on the test to tell me if I screwed up. But there’s no unit test for this code because … well, it doesn’t matter why. Let’s just say that this project I inherited has a number of “issues.” The one I’ll relate next time will boggle your mind.

Efficiency of Pairing heap

Last time, I introduced the Pairing heap, and showed an example of its structure and how it changes as items are added and removed. At first glance it seems unlikely that this can be faster than the binary heap I spent so much time exploring three years ago. But it can be. Let me show you how.

As I mentioned in the discussion of the binary heap, inserting an item takes time proportional to the base-2 logarithm of the heap size. If there are 100 items in the heap, inserting a new item will take, worst case, O(log2(100)) operations. Or, about 7 swaps. It could be fewer, but it won’t be more. In addition, removing the minimum item in a min-heap will require O(log2(n)) operations. In a binary heap, insertion and removal are O(log n).

As you saw in my introduction to the Pairing heap, inserting an item is an O(1) operation. It never involves more than a comparison and adding an item to a list. Regardless of how many items are already in the heap, adding a new item will take the same amount of time.

Removal, though, is a different story altogether. Removing the smallest item from a Pairing heap can take a very long time, or almost no time at all.

Consider the Pairing heap from the previous example, after we’ve inserted all 10 items:

    0
    |
    6,4,3,5,8,9,7,2,1

It should be obvious that removing the smallest item (0) and re-adjusting the heap will take O(n) operations. After all, we have to look at every item during the pair-and-combine pass, before ending up with:

    1
    |
    2, 8, 3, 4
    |  |  |  |
    7  9  5  6

But we only look at n/2 items the next time we remove the smallest item. That is, we only have to look at 2, 8, 3, and 4 during the pair-and-combine pass. The number of items we look at during successive removals is cut in half with each removal, until we get to two or three items per removal. Things fluctuate a bit, and it’s interesting to write code that displays the heap’s structure after every removal.

By analyzing the mix of operations required in a very large number of different scenarios, researchers have determined that the amortized complexity of removal in a Pairing heap is O(log n). So, asymptotically, removal from a Pairing heap is the same complexity as removal from a binary heap.

It’s rather difficult to get a thorough analysis of the Pairing heap’s performance characteristics, and such a thing is way beyond the scope of this article. If you’re interested, I suggest you start with Towards a Final Analysis of Pairing Heaps.

All other things being equal, Pairing heap should be faster than binary heap, simply because Pairing heap is O(1) on insert, and binary heap is O(log n) on insert. Both are O(log n) on removal. Keep in mind, though, that an asymptotically faster algorithm isn’t always faster in the real world. I made that point some time back in my post, When theory meets practice. On average, Pairing heap does fewer operations than does binary heap when inserting an item, but Pairing heap’s individual operations are somewhat more complex than those performed by binary heap.

You also have to take into account that Pairing heap can do some things that binary heap can’t do, or can’t do well. For example, the merge (or meld) function in Pairing heap is an O(1) operation. It’s simply a matter of inserting the root node of one heap into another. For example, consider these two pairing heaps:

    1                  0
    |                  |
    2, 8               5
    |
    4

To merge the two heaps, we treat the root nodes just as we would two siblings after removing the smallest item. We pair them to create:

    0
    |
    1, 5
    |
    2, 8
    |
    4

With binary heap, we’d have to allocate a new array that’s the size of both heaps combined, copy all items from both heaps to it, and then call the Heapify method to rebuild the heap. That takes time proportional to the combined size of both heaps.

And, as I mentioned previously, although changing the priority of an item in a binary heap isn’t expensive (it’s an O(log n) operation), finding the node to change is an O(n) operation. With Pairing heap, it’s possible to maintain a node reference, and it’s been shown (see the article linked above) that changing the priority of a node in a Pairing heap is O(log log n).

Another difference between Pairing heap and binary heap is the way they consume memory. A binary heap is typically implemented in a single contiguous array. So if you want a heap with two billion integers in it, you have to allocate a single array that’s 8 gigabytes in size. In a Pairing heap, each node is an individual allocation, and each node contains two pointers (object references). A Pairing heap of n nodes will occupy more total memory than a binary heap of the same size, but the memory doesn’t have to be contiguous. As a result, you might be able to create a larger pairing heap than you can a binary heap. In .NET, for example, you can’t create an array that has more than 2^31 (two billion and change) entries. A Pairing heap’s size, however, is limited only by the available memory.

The Pairing Heap, Part 1

A key takeaway from my early discussion of heaps is that if all we need is the ability to get the smallest item whenever we reach into the bag, then there’s no need to keep the entire bag in sorted order if some less expensive alternate ordering will work. Because of that relaxed requirement, we can implement a priority queue with a heap a whole lot more efficiently than with a sorted list.

I like to call it creative laziness.

Binary heaps are useful and quite efficient for many applications. For a pure priority queue, it’s hard to beat the combination of simplicity and efficiency of a binary heap. It’s easy to understand, easy to implement, and performs quite well in most situations. But the simplicity of the binary heap makes it somewhat inflexible, and also can prove to be inefficient when heaps become very large.

Even though the binary heap isn’t sorted, its ordering rules are pretty strict. If you recall, a valid binary heap must satisfy two properties:

  • The heap property: the key stored in a node is greater than or equal to the key stored in the parent node.
  • The shape property: it’s a complete binary tree.

Maintaining the shape property is required if you want to efficiently implement the binary heap in a list or array, because it lets you implicitly determine the locations of the child or parent nodes. But nothing says that you have to implement a binary heap in an array. You could use a more traditional tree representation, with explicit nodes that have right, left, and parent node references. That has some real benefits.

For example, a common extension to the priority queue is the ability to change the priority of an item in the queue. It’s possible to do that with a binary heap, and in fact isn’t difficult. It’s inefficient, though, because in a binary heap it takes a sequential search to find the item before you can change its priority.

But if you have a traditional tree representation, then you can maintain direct references to the nodes in the heap. At that point, there’s no longer a need for the sequential search to find the node, and changing an item’s priority becomes a pure O(log n) operation. Algorithms such as the A* search algorithm typically use something other than a binary heap for their priority queues, specifically because changing an item’s priority is a common operation.

Another common heap operation is merging (also known as melding): combining two heaps into one. If you have two binary heaps, merging them takes time proportional to the combined size of both heaps. That is, it takes O(n + m) time. The basic idea is to allocate an array that’s the size of both heaps, put all the items into it, and then call a BuildHeap function to construct the heap. With a different heap representation, you can do the merge in constant time.

It turns out that maintaining the shape property when you go to a more traditional tree representation is needlessly expensive. If you eliminate the need for the shape property, it becomes possible to indulge in even more creative laziness.

And, my oh my, have people been creative. Here’s a partial list of heap implementations.

There’s a lot of discussion about which of those is the “fastest” or “best” heap implementation. A lot of it involves theoretically optimum asymptotic behavior, and can get pretty confusing for those of us who just want a working heap data structure that performs well in our applications. Without going into too much detail, I’ll just say that some of those heap implementations that have theoretically optimum performance in some areas (the Brodal queue, for example) are difficult to implement and not at all efficient in real-world situations.

Of those that I’ve worked with, I particularly like Pairing heap because of its simplicity.

The pairing heap’s shape is much different than that of a binary heap. Rather than having right and left child references, a pairing heap node maintains child lists. It maintains the heap property in that any child in a min-heap is greater than or equal to its parent, but the concept of shape is thrown out the window. For example, consider this binary heap of the numbers from 0 to 9.


                    0
              1           3
          5       2   7       4
        9   6   8

Again, in the binary heap, each node has up to two children, and the children are larger than the parents. Although there are multiple valid arrangements of values possible for a binary heap, they all maintain that shape property.

A Pairing heap’s structure is much more flexible. The smallest node is always the root, and children are always larger than their parents, but that’s where the similarity ends. The best way to describe the structure is to go through an exercise of adding and removing items.

If you start with an empty heap and add an item, then that item becomes the root. So, let’s add 6 as the first item in the heap:

    6

Now, we add the value 3. The rules for adding are:

  1. If there is no root, then the new item becomes the root.
  2. If the new item is greater than or equal to the root, then the new item is added to the root item’s child list.
  3. If the new item is smaller than the root, then the new item becomes the root, the old root’s child list is appended to the new root’s child list, and the old root is appended to that child list.

In this case, 3 is smaller than 6, so 3 becomes the root and 6 is added as a child of 3.

    3
    |
    6

Here, the vertical bar character (|) is used to denote a parent-child relationship. So node 6 is a child of node 3.

Now we add 4 to the heap. Since 4 is larger than 3, 4 is added to the child list.

    3
    |
    6,4

When we add 2 to the heap, 2 becomes the new root, and 3 is appended to the child list:

    2
    |
    6,4,3

I’ll add 5, 8, 9, and 7 in that order, giving:

    2
    |
    6,4,3,5,8,9,7

Adding 0 makes 0 the root, and then adding 1 appends to the child list:

    0
    |
    6,4,3,5,8,9,7,2,1

Note that the heap property is maintained. The smallest item is at the root, and child items and their siblings are larger than their parents. Because the smallest item is at the root, we can still execute the get-minimum function in constant time.

And in fact, insert is also a constant time operation. Inserting an item consists of a single comparison, and appending an item to a list.

Things get interesting when we want to remove the smallest item. Removing the smallest item is no trouble at all, of course. It’s figuring out the new root that gets interesting. This proceeds in two steps:

First, process the children in pairs from left to right, creating 2-node heaps with the lowest item at the root, and the other item as a child. So using the child list above, we would create:

    4 3 8 2 1
    | | | |
    6 5 9 7

So 6 is a child of 4, 5 is a child of 3, etc. In this case, 1 doesn’t have a child.

Then, we process from right to left, taking the lower item as the root and adding the other item as a child of the root. Starting with 1 and 2, we combine them to create:

    4 3 8 1
    | | | |
    6 5 9 2
          |
          7

Proceeding to compare 1 and 8:

    4 3 1
    | | |
    6 5 2, 8
        |  |
        7  9

Note here that 2 and 8 are siblings, but 7 is a child of 2, and 9 is a child of 8. 9 and 7 are not siblings.

When we’re done processing, we have:

    1
    |
    2, 8, 3, 4
    |  |  |  |
    7  9  5  6

Again, the heap property is maintained.

It’s very important to understand that the heap property is maintained. All child nodes are larger than their parent nodes. Siblings (children of the same node) are represented in an unordered list, which is just fine as long as each sibling is larger than the parent node. Also, a node’s children only have to be larger than their parent node. Being smaller than a parent’s sibling does not violate the heap property. In the figure above, for example, 5 is larger than its parent, 3, but smaller than its parent’s sibling, 8.

Removing 1 from the heap leaves us with

    2, 8, 3, 4
    |  |  |  |
    7  9  5  6

Again, we pair from left to right. 2 is smaller than 8, so 2 becomes the root of the subtree. But what happens to the children? They’re appended to the child list of the other node. So pairing 2 and 8 results in:

    2
    |
    8, 7
    |
    9

And pairing the others gives:

    2      3
    |      |
    8, 7   4, 5
    |      |
    9      6

And combining from right to left makes 2 the root of the new tree:

    2
    |
    8, 7, 3
    |     |
    9     4, 5
          |
          6

Again, you can see that the heap property is maintained.

Now, when we remove 2 from the heap, we first pair 8 and 7 to create:

    7     3
    |     |
    8     4, 5
    |     |
    9     6

And then combine to form:

    3
    |
    4, 5, 7
    |     |
    6     8
          |
          9

When 3 was selected as the smallest item, node 7 was appended to the child list.

That’s the basics of the Pairing heap. You might be asking yourself how that can possibly be faster than a binary heap. That will be the topic for next time.

Bad Boolean interfaces

Today I got a bug report saying that an error occurred while trying to save a record. That there was an error in this code I inherited isn’t unusual. It was odd, though, to encounter an error in saving a record. That particular bit of code executes hundreds of times per day and I’d never seen a bug report. A few minutes looking at the logs led me to this snippet:

    var outlets = outletRepository.GetOutlets();
    var outletName = outlets.First(outlet => outlet.Id == myRecord.OutletId).Name;

For those of you who don’t speak C#, all that’s doing is getting the list of Outlet records from the database and locating the one that’s referenced by myRecord. We do it this way rather than just looking up the single outlet because we use the returned list of outlets later in the method.

At first I couldn’t figure out why that failed. After all, our database has referential integrity, so myRecord can’t possibly (as far as I know) reference an outlet that doesn’t exist.

It turns out that GetOutlets is defined as:

    public List<Outlet> GetOutlets(bool activeOnly = true)

So a call to GetOutlets() by default returns just the active outlets. Because the outlet referenced by this particular record was marked as inactive, that outlet was not returned by GetOutlets, and as a result the call to Enumerable.First threw an exception.

The most glaring error here is the use of First, which I contend should be used only if you know that what you’re looking for is somewhere in the list. Otherwise you probably should call FirstOrDefault and then check for a null return value.

A more insidious error is the design of the GetOutlets method. In the first place, using a Boolean value to determine whether the method returns all outlets or just the active outlets is a bad idea. true or false are rather ambiguous here. You can’t tell by looking at the value passed what attribute is being tested, or if the flag is used for something else entirely. Maybe true means “sort the records.” Programmers are somehow supposed to “know” what it means. In any case, if the parameter is required the programmer knows that he has to pass true or false in order for it to work. And one would assume that an unfamiliar programmer would check the parameter name and pass the appropriate value. But when you give the parameter a default value, you’ve effectively removed any roadblock to using the method incorrectly. Certainly the statement outlets = outletRepository.GetOutlets(); doesn’t convey any information about the parameter–or that it even exists.

Don’t use Boolean values like this. If you want two forms of the GetOutlets method, then write them.

    public List<outlet> GetAllOutlets();
    public List<outlet> GetOnlyActiveOutlets();

Another option, less preferred but possibly acceptable, would be to define an enumeration that explicitly states what you’re looking for:

    enum OutletFilterType
    {
        All,
        OnlyActive
    }

    public List <outlet> GetOutlets(OutletFilterType filter);

And to call it:

    outlets = GetOutlets(OutletFilterType.All);
    outlets = GetOutlets(OutletFilterType.OnlyActive);

But do not give that parameter a default value.

If you really must take shortcuts in your private implementation–a practice I strongly discourage, by the way, because private code has a distressing tendency to become public–then fine. Make a mess in your own house if you wish, but keep your public interfaces clean and unambiguous.

Every time I think I’m getting a handle on this code, I uncover another little code bomb. The code is littered with this GetXXX(true/false) pattern, and the use of First to get things from untrusted lists. I suppose you could call it job security, but I’m getting tired of it.

Delete the UPSERT

Many programmers who write code to work with databases like the concept of an “upsert” command: passing a record to the database and letting it figure out whether the record should be added or updated. If the record already exists in the database then it’s updated. Otherwise a new record is created. It’s very convenient.

It’s also a terrible idea. At least, that’s my experience. Many years ago I argued strongly for an “upsert” command, and on one project I created my data access layer with such a thing. We weren’t using stored procedures on that project, so the “check and update if there” logic was written in code. It worked. It also caused no end of problems, and I swore that I’d never use the upchuck (err … upsert) model again. And I didn’t, until I inherited this project.

The project I’m working on uses the upsert model throughout. The undocumented but well-understood convention is that none of our database tables have a record id of 0. So if I pass a record with id 0 to an upsert stored procedure, the record won’t be found in the table and a new record is inserted. This is convenient for the guy who writes the database access code because he only has to write one method to handle both insert and update. When you’re writing CRUD code for tables that have dozens of fields, that seems like a real savings: you write a whole lot less code.

But it comes at a cost that many don’t recognize: clients who call the upsert method have to know that record id 0 has a special meaning. That’s fine for the original development team, but it’s not at all obvious to a programmer who comes on to the project some time after the program is in production.

It gets worse. In our project we have a Categories table that contains category names and other information. Each Reward that we create belongs to exactly one category, the identifier of which is stored in the Rewards record. There are UI screens that allow system administrators to add new categories. The project is pretty mature now, so we don’t often add new categories.

Unfortunately, whoever wrote those UI screens wasn’t too hip on things. To create the drop-down list of categories, one would typically get all of the categories from the database and create a combo box (drop down) that lets the user select a category from the list. Sometimes the dropdown will have an initial value something like “Select a category,” with a default value. Usually, that label is added in code after getting the list of categories from the database. But not in this case. No, the genius who designed these screens decided that he’d create a category with an id of 0, and text “Select a category.” And he decided that his “no record” id would be -1.

So for Categories and a few other configuration tables, you pass -1 for the record id if you want to insert a new record. Passing 0 just overwrites the existing “Select a category” record.

So now the rule is “Passing a record id of 0 inserts a new record. Except for tables Foo, Bar, and Idiot, for which you must pass a record id of -1.” (Actually, any negative number will do, but I digress.)

Yeah. That’s like totally obvious. Not!

So when we did some refactoring in that area, one of my guys, relatively new to the project, made the reasonable assumption that 0 meant “new record” for this table just like it does for every other table. Imagine our surprise when QA told us that adding a new category just overwrote the record with ID 0.

I’ll grant that the programmer who introduced this bug is at fault for not noticing the difference. I put more blame the programmer who thought it’d be a good idea to special case these few tables for making the mistake not only possible but almost guaranteed to happen. But ultimately the blame goes to whoever decided that the Upsert model was a good idea. If we had separate Insert and Update methods and stored procedures, then there would be no question what was happening. With a separate Insert method, you’d never pass a record id, which would make it impossible to inadvertently do an Update when you meant to do an Insert.

One argument programmers make for the use of Upsert is that, with it, the create new record and edit existing record functions can be identical: initialize the record (either by reading an existing one from the database, or creating a new one with default values), and call the Edit method. When editing is done, just call Upsert and things “just work.” I’ve written code like that. I’ve also been bitten by it because that common Edit method always ends up needing a flag to indicate whether you’re creating new or editing existing, and the method is peppered with if (IsEdit) conditionals. In the end, the “common” Edit method is hugely complex and very fragile. It’s been my experience that the temptation to combine create and edit functionality is very strong, but giving in to it is the first step on a long and painful road.

Upsert is evil. It might be useful on the database end, but it has no use in a data access layer or anywhere else a mere mortal programmer  might be. The programmer should know whether his code is editing an existing or creating a new record, and his code should call the Update or Insert method, as appropriate.

 

Colorado flag specifications

Yes, I know. It’s only been … six months or so since I posted here. I had some trouble with my ISP, and I got busy with other things. Perhaps I’ll write about some of that.

I found a buyer for my Texas flag coffee table. At least, he said he was coming over with a check. I won’t say it’s sold until I get the cash, but I’m pretty confident that I’ve made the sale.

txTable

The buyer wanted me to add some bracing to the legs, which is why the lower supports. I think it’d be pretty hard to break the joint at the top of the legs, but this table will likely be subjected to a lot of abuse aboard a party boat. I was happy to make the modification if it meant a sale.

With that one gone, I thought I’d make another table.  This one, though, will have a Colorado state flag.

Colorado state flag

Why Colorado? Two reasons:

  1. I grew up there, and love the place.
  2. It’s another easy flag, like the Texas flag and the American flag.

I need a scalable vector graphic (.svg) file for the flag, and whereas the ones I’ve downloaded look okay to the eye, every one I’ve examined seemed slightly imperfect. I need that “C” to be centered perfectly, or my flag is going to look kind of wonky. At least, that’s what I thought.

So I went looking for the flag’s specifications. Here’s the language from the Colorado Revised Statutes, Title 24, Article 80, Part 9.

A state flag is hereby adopted to be used on all occasions when the state is officially and publicly represented, with the privilege of use by all citizens upon such occasions as they may deem fitting and appropriate. The flag shall consist of three alternate stripes to be of equal width and at right angles to the staff, the two outer stripes to be blue of the same color as in the blue field of the national flag and the middle stripe to be white, the proportion of the flag being a width of two-thirds of its length. At a distance from the staff end of the flag of one-fifth of the total length of the flag there shall be a circular red C, of the same color as the red in the national flag of the United States. The diameter of the letter shall be two-thirds of the width of the flag. The inner line of the opening of the letter C shall be three-fourths of the width of its body or bar, and the outer line of the opening shall be double the length of the inner line thereof. Completely filling the open space inside the letter C shall be a golden disk; attached to the flag shall be a cord of gold and silver intertwined, with tassels one of gold and one of silver. All penalties provided by the laws of this state for the misuse of the national flag shall be applicable to the said state flag.

Yeah, that’s a mouthful. Let’s take a closer look at the dimensions:

  1. “The flag shall consist of three alternate stripes to be of equal width and at right angles to the staff,”
    Easy enough.
  2. “…the proportion of the flag being a width of two-thirds of its length.”
    So if the flag is 1 unit long, then the flag is 2/3 unit wide. That makes each stripe 1/3 of 2/3 units, or 2/9 units, which is … my head explodes. Let’s say that each stripe is one unit wide. That makes the flag three units wide and 4.5 units long. That should be easier to deal with.
  3. “At a distance from the staff end of the flag of one-fifth of the total length of the flag there shall be a circular red C…”
    So the left edge of the ‘C’ is 4.5/5, or 0.9 units from the left edge.
  4. “The diameter of the letter shall be two-thirds of the width of the flag.”
    Well that’s easy enough. The flag is three units wide, so the circle is two units in diameter.
  5. ” The inner line of the opening of the letter C shall be three-fourths of the width of its body or bar,”
    Now this is a problem. It says that the line is 3/4 the width of the letter’s body, but it doesn’t say how wide the body should be. And is that “inner line” the arc of the inner line? Or is it the length of the chord connecting the endpoints of that arc?
  6. “…and the outer line of the opening shall be double the length of the inner line thereof.”
    Not a problem, once I know the answer to #5.
  7. “Completely filling the open space inside the letter C shall be a golden disk;”
    And just how large should that disk be? We can infer it from #5, but that requirement is insufficiently specified.

If could be that #5 and #6 are sufficient to define the width of the letter and as a result the diameter of the inner golden disk. The Wikipedia article says, “On March 31, 1964, the legislature further dictated the diameter of the gold disc to be equal to the center stripe.” I don’t see that in the statute, so perhaps there is only one solution that satisfies the conditions. I’ll have to study the geometry.

Note also that nothing in the language says anything about the vertical placement of the ‘C’. As far as I can tell, that letter could be placed anywhere along that vertical line, 1/5 of the length from the left edge of the flag.

The Colorado flag with three stripes and the red ‘C’ with gold interior was first described by legislation in 1911. It was revised in 1929 to say that the red and blue must be the same colors as the U.S. flag, but the “golden” color doesn’t appear to be specified anywhere. The size of the ‘C’ wasn’t specified until 1964. I find it curious and somewhat amusing that the statute doesn’t define the vertical position, and that the rest of the description is so wonky. You’d think they would have hired a consultant to clarify the language so that anybody with a ruler and compass could easily draw the flag.

Good thing I paid attention in high school geometry, huh?

Drawing attention to or hiding errors

I started my programming career writing COBOL programs for banks. One of my early tasks had me writing a program that would send a letter to all of our loan customers, informing them of a change in the payment notices. Included in that letter was to be a sample of the new payment coupon, which was to be clearly marked as a sample so that nobody would think it was bill and send a payment.

My design for the sample coupon included the words NOT AN INVOICE and NO PAYMENT REQUIRED and DO NOT PAY FROM THIS SAMPLE printed in several places. In addition, the account number printed on the coupon was something innocuous like “123-456-789”, and the customer’s name and address on all the coupons were the same:

Stanley Steamroller
123 Main St.
Thule, Greenland

And the amount due was “$123.45”.

My boss had me take that to the branch manager for approval. The manager praised my good thinking for including the NOT AN INVOICE and other wording, and the obviously fake name and address. His comment: “I was worried that customers might complain about an extra payment notice, but what you have here is clearly a sample. Nobody will be confused by this.”

To my knowledge, nobody called to complain that they had already made their payment and that they didn’t appreciate this erroneous invoice. We did, however, receive several checks for $123.45, with the account number 123-456-789 written in the Memo field, nicely packaged up with the sample payment coupon. It’s fortunate that the checks had the senders’ addresses on them. Otherwise we would not have known who to contact.

The first lesson I learned from this experience is that some people see only what they expect to see ( “Oh, look, a loan payment notice from the bank. Guess I’ll pay it.”). Later (with a similar mailing some months later) I learned that if you want people to stop and think about what they’re looking at, make a glaring error. If I had made that amount $7,743,926.59, I suspect nobody would have sent a check. We might have had a few calls from irate customers saying that they couldn’t possibly owe seven million dollars on their $15,000 loan, but it’s likely that they’d examine the notice a little more carefully before picking up the phone.

If you want people to notice something in a document, make an error that’s impossible to miss. That’ll force them to look more carefully at the rest of the page.

Oddly enough, the converse of that is also true in some situations. When preparing for room inspections at military school, I’d purposely leave something out of order. I wouldn’t make it too obvious, but it’d be something that the upperclassmen always looked for, and that was commonly in error. I found that if I tried to make my room perfect, those guys would spend entirely too much time looking for something wrong. But if I made one or two easy-to-find errors, they’d find the discrepancy, mark it down, and then leave the room happy that they did their jobs.

I think the difference is expectation. When somebody sends you information that you assume to be correct (like a statement from your bank), a glaring error makes you examine the rest of the information more carefully. But an upperclassman who’s looking for an opportunity to gig a subordinate will stop as soon as he’s found an error or two. He has proven his superiority and he has other rooms to inspect.

I’ve heard that the tactic works for tax auditors, too: give him an obvious reason to make you pay a little extra tax, and he’ll give the rest of your records a cursory glance before declaring everything in order. He’s proven his worth, so he can pack up his calculator and head off to torture his next victim.

Leadership and development teams

When he can, a leader will explain to his subordinates the reasons for his orders. Not because he has to, but because he knows that people usually do a better job when they know why they’re doing it. In addition, the leader’s subordinates are then willing to follow orders that the leader can’t explain (usually due to time pressure or enforced secrecy) because they trust that he has good reasons.

The leader earns obedience through mutual trust and respect.

Leadership is about building and directing a team: a group of people who trust and respect each other, and work together towards a common goal. It’s also about giving team members the information and authority they need to get the job done, and then letting them do it. The person who insists on being involved in the minutiae of every team members’ job is no leader at all, but rather a meddler who kills morale and destroys the team’s performance.

The best leaders are those who appear to do nothing at all, but behind the scenes they’re getting the team members the information and access they need, and are clearing obstacles from the team’s path before the rest of the team members even know those obstacles exist.

A well-lead team can function without the leader for a surprisingly long time. If the leader has done his job, then his team’s way has been cleared of any looming obstacles, they have everything they need to complete their current tasks, and a good idea of what they’ll be doing next.

In short: the leader doesn’t do the work; he makes it possible for his team to do the work.

The above is as true for a programming team lead as it is for the leader of a military unit or a senior manager of a Fortune 500 company. The team’s jobs might be completely different, but the team leader’s job is essentially the same: create an environment that makes it possible for the team to get the job done, and then step back. Observe. Give encouragement and direction when necessary. But let the team members do the things they’ve been hired to do.

It’s been my experience that most software development teams do not have good leaders. It shows in missed release dates, high bug counts, frequent “hotfix” releases, team member dissatisfaction and flouting of imposed standards, and high levels of past-due technical debt. The result is bloated, crufty, unstable code characterized by tight coupling, low cohesion, quick fixes, special cases, and convoluted dependencies. All to the detriment of the product.

All too often, the “leader” of the team has upper management convinced that the type of software his team is building is somehow different, and those problems are unavoidable in the company’s particular case. He probably believes it himself. As long as management accepts that, they will continue to experience missed delivery dates and their users will continue to be unhappy.

It’s all about context

The C# using directive and implicitly typed local variables (i.e. using var) are Good Things whose use should be encouraged in C# programs, not prohibited or severely limited. Used correctly (and it’s nearly impossible to use them incorrectly), they reduce noise and improve understanding, leading to better, more maintainable code. Limiting or prohibiting their use causes clutter, wastes programmers’ time, and leads to programmer dissatisfaction.

I’m actually surprised that I find it necessary to write the above as though it’s some new revelation, when in fact the vast majority of the C# development community agrees with it. Alas, there is at least one shop–the only one I’ve ever encountered, and I’ve worked with a lot of C# development houses–that severely limits the use of those two essential language features.

Consider this small C# program that generates a randomized list of numbers from 0 through 99, and then does something with those numbers:

    namespace MyProgram
    {
        public class Program
        {
            static public void Main()
            {
                System.Collections.Generic.List<int> myList = new System.Collections.Generic.List<int>();
                System.Random rnd = new System.Random();
                for (int i = 0; i < 100; ++i)
                {
                    myList.Add(rnd.Next(100));
                }
                System.Collections.Generic.List<int> newList = RandomizeList(myList, rnd);

                // now do something with the randomized list
            }
            
            static private System.Collections.Generic.List<int> RandomizeList(
                System.Collections.Generic.List<int> theList,
                System.Random rnd)
            {
                System.Collections.Generic.List<int> newList = new System.Collections.Generic.List<int>(theList);
                for (int i = theList.Count-1; i > 0; --i)
                {
                    int r = rnd.Next(i+1);
                    int temp = newList[r];
                    newList[r] = newList[i];
                    newList[i] = temp;
                }
                return newList;
            }
        }
    }

I know that’s not the most efficient code, but runtime efficiency is not really the point here. Bear with me.

Now imagine if I were telling you a story about an experience I shared with my friends Joe Green and Bob Smith:

Yesterday, I went to Jose’s Mexican Restaurant with my friends Joe Green and Bob Smith. After the hostess seated us, Joe Green ordered a Mexican Martini, Bob Smith ordered a Margarita, and I ordered a Negra Modelo. For dinner, Joe Green had the enchilada special, Bob Smith had Jose’s Taco Platter . . .

Wouldn’t you get annoyed if, throughout the story, I kept referring to my friends by their full names? How about if I referred to the restaurant multiple times as “Jose’s Mexican Restaurant” rather than shortening it to “Jose’s” after the first mention?

The first sentence establishes context: who I was with and where we were. If I then referred to my friends as “Joe” and “Bob,” there would be no ambiguity. If I were to write 50 pages about our experience at the restaurant, nobody would get confused if I never mentioned my friends’ last names after the first sentence. There could be ambiguity if my friends were Joe Smith and Joe Green, but even then I could finesse it so that I didn’t always have to supply their full names.

Establishing context is a requirement for effective communication. But once established, there is no need to re-establish it if nothing changes. If there’s only one Joe, then I don’t have to keep telling you which Joe I’m referring to. Doing so interferes with communication because it introduces noise into the system, reducing the signal-to-noise ratio.

If you’re familiar with C#, most likely the first thing that jumps out at you in the code sample above is the excessive use of full namespace qualification: all those repetitions of System.Collections.Generic.List that clutter the code and make it more difficult to read and understand.

Fortunately, the C# using directive allows us to establish context, thereby reducing the amount of noise in the signal:

    using System;
    using System.Collections.Generic;
    namespace MyProgram
    {
        public class Program
        {
            static public void Main()
            {
                List<int> myList = new List<int>();
                Random rnd = new Random();
                for (int i = 0; i < 100; ++i)
                {
                    myList.Add(rnd.Next(100));
                }
                List<int> newList = RandomizeList(myList, rnd);

                // now do something with the randomized list
            }
            
            static private List<int> RandomizeList(
                List<int> theList,
                Random rnd)
            {
                List<int> newList = new List<int>(theList);
                for (int i = theList.Count-1; i > 0; --i)
                {
                    int r = rnd.Next(i+1);
                    int temp = newList[r];
                    newList[r] = newList[i];
                    newList[i] = temp;
                }
                return newList;
            }
        }
    }

That’s a whole easier to read because you don’t have to parse the full type name to find the part that’s important. Although it might not make much of a difference in this small program, it makes a huge difference when you’re looking at code that uses a lot of objects, all of whose type names begin with MyCompany.MyApplication.MySubsystem.MyArea.

The use of using is ubiquitous in C# code. Every significant Microsoft example uses it. Every bit of open source code I’ve ever seen uses it. Every C# project I’ve ever worked on uses it. I wouldn’t think of not using it. Nearly every C# programmer I’ve ever met, traded email with, or seen post on StackOverflow and other programming forums uses it. Even the very few who don’t generally use it, do bend the rules, usually when LINQ is involved, and for extension methods in general.

I find it curious that extension methods are excepted from this rule. I’ve seen extension methods cause a whole lot more programmer confusion than the using directive ever has. Eliminating most in-house created extension methods would actually improve the code I’ve seen in most C# development shops.

The arguments against employing the using directive mostly boil down to, “but I can’t look at a code snippet and know immediately what the fully qualified name is.” And I agree: somebody who is unfamiliar with the C# base class library or with the libraries being used in the current project will not have the context. But it’s pretty unlikely that a company will hire an inexperienced C# programmer for a mid-level job, and anybody the company hires will be unfamiliar with the project’s class layout. In either case, a new guy might find the full qualification useful for a week or two. After that, he’s going to be annoyed by all the extra typing, and having to wade through noise in order to find what he’s looking for.

As with having friends named Joe Green and Joe Smith, there is potential for ambiguity if you have identically named classes in separate namespaces. For example, you might have an Employee class in your business layer and an Employee class in your persistence layer. But if your code is written rationally, there will be very few source files in which you refer to both. And in those, you can either revert to fully-qualified type names, or you can use namespace aliases:

    using BusinessEmployee = MyCompany.MyProject.BusinessLogic.Employee;
    using PersistenceEmployee = MyCompany.MyProject.Persistence.Employee;

Neither is perfect, but either one is preferable to mandating full type name specification in the entire project.

The code example still has unnecessary redundancy in it that impedes understanding. Consider this declaration:

    List<int> myList = new List<int>();

That’s like saying, “This variable called ‘myList’ is a list of integers, and I’m going to initialize it as an empty list of integers.” You can say the same thing more succinctly:

    var myList = new List<int>();

That eliminates the redundancy without reducing understanding.

There is some minor debate in the community about using var (i.e. implicit typing) in other situations, such as:

    var newList = RandomizeList(myList, rnd);

Here, it’s not plainly obvious what newList is, because we don’t know what RandomizeList returns. The compiler knows, of course, so the code isn’t ambiguous, but we mere humans can’t see it immediately. But if you’re using Visual Studio or any other modern IDE, you can hover your mouse over the RandomizeList, and a tooltip will appear, showing you the return type. And if you’re not using a modern IDE to write your C# code, you have a whole host of other problems that are way more pressing than whether or not a quick glance at the code will reveal a function’s return type.

I’ve heard people say, “I use var whenever the type is obvious, or when it’s necessary.” The “necessary” part is a smokescreen. What they really mean is, “whenever I call a LINQ method.” That is:

    var oddNumbers = myList.Select(i => (i%2) != 0);

The truth is, they could easily have written:

    IEnumerable<int> oddNumbers = . . .

The only time var is absolutely necessary is when you’re working with anonymous types. For example, this code that creates a type that contains the index and the value:

    var oddNumbers = myList.Select((i, val) => new {Index = i, Value = val});

I used to be in the “only with LINQ” camp, by the way. But after a while I realized that most of the time the return type was perfectly obvious from how the result was used, and in the few times it wasn’t, a quick mouse hover revealed it. I’m now firmly in the “use implicit typing wherever it’s allowed” camp, and my coding has improved as a result. With the occasional exception of code snippets taken out of context, I’ve never encountered a case in which using type inference made code harder to understand.

If you find yourself working in one of the hopefully very few shops that restricts the use of these features, you should actively encourage debate and attempt to change the policy.

If you’re one of the hopefully very few people attempting to enforce such a policy (and I say “attempting” because I’ve yet to see such a policy successfully enforced), you should re-examine your reasoning. I think you’ll find that the improvements in code quality and programmer effectiveness that result from the use of these features far outweigh the rare minor inconveniences you encounter.

Categories

A sample text widget

Etiam pulvinar consectetur dolor sed malesuada. Ut convallis euismod dolor nec pretium. Nunc ut tristique massa.

Nam sodales mi vitae dolor ullamcorper et vulputate enim accumsan. Morbi orci magna, tincidunt vitae molestie nec, molestie at mi. Nulla nulla lorem, suscipit in posuere in, interdum non magna.