An Assumption of Competence

My second programming job was with a small commercial bank in Fresno, CA, where I helped maintain the COBOL account processing software.  I was still pretty inexperienced, having only been working in the industry for about 18 months.  My previous job involved maintaining software, also in COBOL, for small banks in western Colorado.

One of the first things my new boss asked me to look at was a program that computed loan totals by category:  a two-digit code that was assigned to each loan.  Federal regulations required that we report the total number and dollar amount of all loans, by category, as well as the number and dollar amount that were 30, 60, and 90 days or more past due.  The problem was that the report program was taking way too long to run.  The bank had recently acquired a bunch of new loans, and the program’s run time had increased sharply—several times more than what one would expect from the increase in the number of loans.

Understand, this was a very simple program.  All it had to do was go through the loans sequentially and compute totals in four columns (total, 30 days past, 60 days past, 90+ days past) for each of the 100 categories.  The data structures were just as simple.  I don’t remember enough COBOL to write it intelligently, so I’m showing them translated to C#:

struct CountAndTotal
{
    public int Category;
    public int Count;
    public double Total;
}

// Arrays for Total, 30, 60, and 90+ days past due
CountAndTotal[] TotalAllLoans = new CountAndTotal[100];
CountAndTotal[] Total30Past = new CountAndTotal[100];
CountAndTotal[] Total60Past = new CountAndTotal[100];
CountAndTotal[] Total90Past = new CountAndTotal[100];

I’ll admit that I was a little mystified by the Category field in the CountAndTotal structure, but figured it was an artifact from early debugging.

Those definitions and the description of the problem above lead to a simple loop:  for each loan, determine its category and payment status, and add the totals to the proper items in the arrays. The program almost writes itself:

while (!LoanFile.Eof)
{
    LoanRec loan = LoanFile.ReadNext();
    AddToTotals(loan, TotalAllLoans);
    if (loan.PastDue(90))
        AddToTotals(loan, Total90Past);
    else if (loan.PastDue(60))
        AddToTotals(loan, Total60Past);
    else if (loan.PastDue(30))
        AddToTotals(loan, Total30Past);
}

What surprised me when I looked at the code was the implementation of the AddToTotals function. You would expect it to be a simple index into the array from the loan’s category code. After all, the category was guaranteed to be in the range 0-99. That just begs for this implementation:

void AddToTotals(LoanRec loan, CountAndTotal[] Totals)
{
    // The category code (0-99) is the array index.
    ++Totals[loan.Category].Count;
    Totals[loan.Category].Total += loan.Balance;
}

That is not what I found. Rather than indexing directly into the array of categories, the program did a sequential search of the array to see if that category was already there. If it was, the loan was added to that entry's totals. Otherwise the program made a new entry at the next empty spot in the array. That explained the mysterious Category field and the absurdly long run time. The code is much more complicated:

void AddToTotals(LoanRec loan, CountAndTotal[] Totals)
{
    int i = 0;
    while (i < 100)
    {
        if (Totals[i].Category == loan.Category)
        {
            ++Totals[i].Count;
            Totals[i].Total += loan.Balance;
            break;
        }
        else if (Totals[i].Category == -1)
        {
            // Unused position (Category starts out as -1).
            // The category wasn't found in the array.
            Totals[i].Category = loan.Category;
            Totals[i].Count = 1;
            Totals[i].Total = loan.Balance;
            break;
        }
        ++i;
    }
}

The difference in run times is enormous! The first implementation accesses the array directly from the loan.Category field. The second has to search the array sequentially, an operation that looks at about 50 items per update on average when all 100 categories are in use and the loans are spread evenly among them.  For the totaling step, that makes the second version roughly 50 times slower than the first.  In addition, the second required a subsequent sort to put things in the proper order before printing the results.
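To put a number on that, here is a small sketch that counts how many array slots each approach examines. The names (TotalsDemo, AddDirect, AddBySearch, NewSearchTable, and the Probes counter) are made up for illustration, and a loan is reduced to a category and a balance. It is a model of the two techniques, not the bank's code.

```csharp
using System;

public struct CountAndTotal
{
    public int Category;
    public int Count;
    public double Total;
}

// Illustrative only: Probes counts how many array slots get examined.
public static class TotalsDemo
{
    public static long Probes;

    // Direct indexing: the category code (0-99) is the array index.
    public static void AddDirect(int category, double balance, CountAndTotal[] totals)
    {
        ++Probes;
        ++totals[category].Count;
        totals[category].Total += balance;
    }

    // Sequential search: walk the array until the category or the
    // first unused slot (Category == -1) turns up.
    public static void AddBySearch(int category, double balance, CountAndTotal[] totals)
    {
        for (int i = 0; i < totals.Length; ++i)
        {
            ++Probes;
            if (totals[i].Category == -1)
            {
                // Claim the unused slot for this category.
                totals[i].Category = category;
                totals[i].Count = 0;
                totals[i].Total = 0;
            }
            if (totals[i].Category == category)
            {
                ++totals[i].Count;
                totals[i].Total += balance;
                return;
            }
        }
    }

    // Returns a 100-entry table with every slot marked unused.
    public static CountAndTotal[] NewSearchTable()
    {
        var totals = new CountAndTotal[100];
        for (int i = 0; i < totals.Length; ++i)
            totals[i].Category = -1;
        return totals;
    }
}
```

Feed both versions 10,000 loans whose categories cycle through 0 to 99 and the direct version makes 10,000 probes while the search makes 505,000, a ratio of about 50 to 1. A real loan file would not be that evenly spread, so treat the ratio as a ballpark.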

Being new at the job, I went to my boss, explained what I’d found, and said, “What am I missing?”  His response:  “Why do you think you’re missing something?”

He went on to explain that my analysis was correct, and that the industry (at least back then) was full of programmers who had no business sitting at a terminal.  It was something of a revelation to me, because I had assumed that the people who wrote this stuff really knew what they were doing.  It also taught me to question everything when faced with a problem.  It’s always a good idea to assume competence when you start debugging somebody else’s (or your own) code, but when things stop making sense, it’s time to re-evaluate that assumption.

More Whittling

After I whittled that knife a couple of weeks ago, I tried to make a small decorative spoon. I made two mistakes on that project: 1) I selected the wrong kind of wood; and 2) I used the wrong knife. I took a branch that I’d cut from the pear tree a few months ago and started whittling on it. I noticed immediately that the pear wood is much harder than the juniper I’d whittled the knife from. I found out later that it’s about four times as hard. No wonder I had trouble with it.

The knife I selected is a cheap pocket knife (I was going to say “utility knife,” but that describes a particular kind of knife) that I’d been carrying around for a few months. Like the Buck 112 “hunting” knife I used previously, this one is just too large for detail work. It works fine for day-to-day box opening and such. As a carving tool it leaves a lot to be desired, in large part because the handle is so thin.

Anyway, here’s a picture of the spoon and the knife. By the time I got the basic shape of the thing roughed out, I was so frustrated that I just wanted to call it done.

I’m not particularly proud of the way the spoon turned out, but I certainly learned a lot making it. The pear is a beautiful wood, but I don’t yet have the skill to work with it. I’ve put a few branches up in the rafters of the garage while I work on my technique.

I picked up a small-ish Buck pocket knife and visited the local WoodCraft store to pick up a carving glove, a thumb protector (the spoon cost me two cuts), and a small box of basswood blocks. Then I searched online for a simple project and came across instructions to carve a pinecone tree ornament. It’s a great beginner’s project. I’m sure it took me an absurdly long time to complete the project (I spread it out over about 10 days). I obviously have a lot to learn, but I’m pretty happy with the result:

I had planned to paint them and add some “snow” at the top, but Debra says she’d like to keep them raw. Who am I to argue?

Useful Notepad Feature

Somebody pointed out a useful feature of Windows Notepad:  date/time stamp logging.  Imagine you want to keep a diary of sorts in a Notepad file, and have a date/time stamp at each “entry”.  Normally, you’d open the file, press Ctrl+End to get to the end of the file, enter the date and time manually, and then start typing.

You can get Notepad to do that for you, automatically.  Here’s how.

  1. Open a blank document in Notepad and type “.LOG” (without the quotes, in capital letters) as the first line.
  2. Save the file as diary.txt.
  3. Close Notepad.
  4. Now start Notepad again and open the file that you just saved.

Notepad adds a blank line at the end of the file, enters the time and date, and positions the cursor at the end of the file so that you can start typing your notes.  I guess it really is a “notepad” program.

Some people laugh, but I find Notepad to be incredibly useful for writing quick design notes and thoughts.  Sure, it’s a primitive tool.  But I don’t need anything fancy at that stage.  I need something that will open quickly and let me enter text with a minimum of fuss.  I’ve not found anything better for that than Notepad.

Interface Annoyances

We ran into a rather difficult class design problem recently that reveals a shortcoming in C# and, apparently, the .NET runtime (specifically, the Common Language Infrastructure, or CLI). It’s a pretty common problem, and I’m a little bit surprised it hasn’t been addressed.

As you know, C# doesn’t allow multiple inheritance. It does, however, allow classes to implement interfaces, which is kind of like inheriting behaviors. The big difference is that with interfaces you have to supply the implementation in your class. With inheritance, you get a default implementation along with the interface.

And before you flame me, please understand that I’m well aware of the many differences between multiple inheritance and implementing interfaces, including the many assumptions that come along with multiple inheritance. The fact remains, though, that the general behavior of a class that implements an interface will be very similar to the behavior of a class that inherits the behaviors defined by that interface. The implementations are quite different, of course, and inheritance carries with it some often unacceptable baggage, but from the outside looking in, things look very much the same.

Interfaces are very handy things, but they can get unwieldy. Consider, for example, an interface called ITextOutput that contains four methods:

interface ITextOutput
{
    void Write(string s);
    void Write(string fmt, params object[] parms);
    void WriteLine(string s);
    void WriteLine(string fmt, params object[] parms);
}

The idea here is that you want to give classes the ability to output text strings. If a class implements ITextOutput, then clients can call the Write or WriteLine methods, and the supplied string will go to the object’s output device. So, a class called Foo might implement the interface:

class Foo : ITextOutput
{
    public void Write(string s)
    {
        // Write must not add a newline; WriteLine appends one itself.
        Console.Write(s);
    }

    public void Write(string fmt, params object[] parms)
    {
        Write(string.Format(fmt, parms));
    }

    public void WriteLine(string s)
    {
        Write(s + Environment.NewLine);
    }

    public void WriteLine(string fmt, params object[] parms)
    {
        WriteLine(string.Format(fmt, parms));
    }
}

Easy enough, right? Except that every class that implements ITextOutput has to implement all four methods. If you look closely, you’ll see that the last three methods all end up calling the first after formatting their output. In most cases, the only method that will change across different classes that implement this interface will be the first Write method. You might want to change the output device, for example, or include a date and time stamp on the output line.

C# does not provide a good solution to this problem. As I see it, you have the following options:

  1. Implement the methods as shown in each class that implements ITextOutput. This is going to be tedious and fraught with error. Somebody is going to make a mistake in all that boilerplate code and the resulting bug will appear at the worst possible time—quite possibly after the product has shipped.
  2. Structure your class hierarchy so that every object inherits from a common TextOutputter class that implements the interface. This is a very good solution if you can do it. For those classes that inherit from some other base class, you can implement the interface as shown above. I have difficulty with this solution because it’s saying, in effect, “Foo IS-A TextOutputter that also does other stuff.” What we really want to say is, “Foo IS-A object (or some other base class) that implements the ITextOutput functionality.” It might sound like a fine distinction, but it matters. A lot. Especially when refactoring code.
  3. Forget about inheritance and interfaces and make Foo contain a member that implements the interface. Something like:
    class Foo
    {
        public ITextOutput Outputter = new TextOutputter();
    }

    This will work, but it’s annoying to clients. Rather than calling Foo.Write, for example, they have to call Foo.Outputter.Write. And class designers are free to change the name of the Outputter member to anything. The result is that clients can’t tell by looking at the class declaration if it implements the ITextOutput interface. Instead, they have to go looking for a member variable (or property) that implements it.
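Options 2 and 3 both lean on a TextOutputter class that I haven’t shown. Here is a minimal sketch of what it might look like, assuming that only Write(string) needs to vary from one class to the next. StampedOutputter is a made-up subclass, included only to show the override.

```csharp
using System;

public interface ITextOutput
{
    void Write(string s);
    void Write(string fmt, params object[] parms);
    void WriteLine(string s);
    void WriteLine(string fmt, params object[] parms);
}

// Hypothetical shared implementation: only Write(string) touches the
// output device. The other three methods format and forward to it.
public class TextOutputter : ITextOutput
{
    public virtual void Write(string s)
    {
        Console.Write(s);
    }

    public void Write(string fmt, params object[] parms)
    {
        Write(string.Format(fmt, parms));
    }

    public void WriteLine(string s)
    {
        Write(s + Environment.NewLine);
    }

    public void WriteLine(string fmt, params object[] parms)
    {
        WriteLine(string.Format(fmt, parms));
    }
}

// A subclass changes the output by overriding the one virtual method.
public class StampedOutputter : TextOutputter
{
    public override void Write(string s)
    {
        // Prefix each piece of output with a sortable date/time stamp.
        base.Write(DateTime.Now.ToString("s") + " " + s);
    }
}
```

With this in place, the boilerplate lives in exactly one class. The cost is the one described in option 2: anything that wants the behavior for free has to spend its single base class on TextOutputter.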

Any way you look at it, it’s messy. As a client, I’d expect class designers to bite the bullet and go with the first option, and test thoroughly. In truth, I think that’s the only reasonable option, painful as it is. As a designer, I’d grumble about the need for all that extra typing, but I’d do it. I’d be embarrassed to release a class that implemented either of the other solutions because as a user I’d be annoyed by either of the other implementations. But I sure wish there were another way to do it.

Delphi solved this problem by using what’s called implementation by delegation. The technique involves creating a member that implements the interface (similar to the third option shown above), and delegating calls to interface methods to that object. In C#, if such a feature existed, the syntax might look something like this:

class Foo: ITextOutput
{
    public ITextOutput Outputter =
        new TextOutputter() implements ITextOutput;
}

Clients could then call the Write method on an object of type Foo, just as they would with the first option. The runtime (or the compiler, maybe) would then delegate such interface calls to the Outputter member. We have the best of both worlds: real interfaces, and we don’t have to repeatedly type all that boilerplate code.
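Without that feature, the closest thing in today’s C# is to write the forwarding methods by hand. The sketch below shows roughly what the compiler or runtime would have to do on Foo’s behalf. The TextOutputter here takes a TextWriter in its constructor, which is my assumption to keep the example self-contained, and the private field name is arbitrary.

```csharp
using System;
using System.IO;

public interface ITextOutput
{
    void Write(string s);
    void Write(string fmt, params object[] parms);
    void WriteLine(string s);
    void WriteLine(string fmt, params object[] parms);
}

// Hypothetical helper that sends text to any TextWriter.
public class TextOutputter : ITextOutput
{
    private readonly TextWriter _writer;

    public TextOutputter(TextWriter writer) { _writer = writer; }

    public void Write(string s) { _writer.Write(s); }
    public void Write(string fmt, params object[] parms) { Write(string.Format(fmt, parms)); }
    public void WriteLine(string s) { Write(s + Environment.NewLine); }
    public void WriteLine(string fmt, params object[] parms) { WriteLine(string.Format(fmt, parms)); }
}

// Foo implements ITextOutput by forwarding every call to a private
// member. These four one-liners are the part that delegation would
// let us skip.
public class Foo : ITextOutput
{
    private readonly ITextOutput _outputter;

    public Foo(TextWriter writer) { _outputter = new TextOutputter(writer); }

    public void Write(string s) { _outputter.Write(s); }
    public void Write(string fmt, params object[] parms) { _outputter.Write(fmt, parms); }
    public void WriteLine(string s) { _outputter.WriteLine(s); }
    public void WriteLine(string fmt, params object[] parms) { _outputter.WriteLine(fmt, parms); }
}
```

Clients see a Foo that implements ITextOutput directly, and the formatting logic still lives in one place. The forwarding methods are still boilerplate, but they are short enough that there is little room to get them wrong.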

I’m not the first one to run into this problem or to suggest the solution. Steve Teixeira mentioned it in his blog over three years ago, and linked to this blog entry from 2003. Steve, by the way, is the one who came up with the idea for Delphi. He says that it’s not currently possible to do such a thing in .NET because it can’t be made verifiably type-safe. I don’t understand why not, but I’ll defer to his judgement here.

This type of thing is trivial to implement in languages that support multiple inheritance. But I don’t think I’m willing to accept the problems with multiple inheritance in order to get this one benefit. It’s a moot point anyway, as it’s quite unlikely that .NET will support multiple inheritance any time soon.

I’d sure be interested to find out how others handle this situation in C# or other .NET languages. Drop me a line and let me know.
