We had an interesting discussion at the office about how much validation a collection type should do in its constructor. The key question, I think, came down to this:
If the constructor can determine that using the instantiated object will throw an exception, should the constructor fail rather than returning the instantiated object?
In other words, if I know that the instantiated object won’t work, shouldn’t I just throw the exception now, rather than let you be surprised later?
There are two extremes here: 1) the constructor should go to heroic efforts, and; 2) let the buyer beware. I tend to lean towards putting the onus on the caller, figuring that whoever is instantiating the object knows what he’s doing. Let me provide an example.
Consider the .NET SortedList
generic collection type. To do its job (that is, keep a collection of items sorted), it requires a comparison function. If you don’t specify a comparison function when you call the constructor, the collection uses the default comparison function for whatever type you specify as the key. This sounds simple enough, right? A list of employees that’s sorted by employee number, for example, would be defined like this:
SortedList<int, Employee> Employees =
new SortedList<int, Employee>();
Because the System.Int32
type (which the C# int type resolves to) implements IComparable
, everything works.
But imagine you have an EmployeeNumber
type:
class EmployeeNumber
{
public string Division { get; private set; }
public int EmpNo { get; private set; }
public EmployeeNumber(string d, int no)
{
Division = d;
EmpNo = no;
}
}
Now, if you create a SortedList
that’s keyed on that type, you’ll have:
SortedList<EmployeeNumber, Employee> Employees =
new SortedList<EmployeeNumber, Employee>();
Allow me to show the entire program here, so we don’t get confused.
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;
using System.Text;
namespace genericsTest
{
class EmployeeNumber
{
public string Division { get; private set; }
public int EmpNo { get; private set; }
public EmployeeNumber(string d, int no)
{
Division = d;
EmpNo = no;
}
}
class Employee
{
public string Name { get; set; }
public Employee(string nm)
{
Name = nm;
}
}
class Program
{
static SortedList<EmployeeNumber, Employee> Employees =
new SortedList<EmployeeNumber, Employee>();
static void Main(string[] args)
{
Employees.Add(new EmployeeNumber("Accounting", 1),
new Employee("Sue"));
Employees.Add(new EmployeeNumber("Dev", 2),
new Employee("Jim"));
}
}
}
If you compile and run that program, you’ll see that it throws an exception when it tries to add the second employee to the list. The program fails because it can’t compare the items. Neither implements IComparable
.
Those who lean towards the first extreme above will argue that the SortedList
constructor should determine that the key type doesn’t implement IComparable
, and should prevent you from instantiating the collection. It should throw an exception because it knows that trying to add items to the collection will fail.
The constructor could do this. It’s possible for the constructor to get the default comparer and call it. If the comparison function returns a value, then all is well. If it fails, then the constructor throws an exception saying, “Sorry, but you didn’t supply a comparison function.”
The only problem with that scenario is that it’s wrong. Not wrong philosophically, but wrong in a very concrete sense. Extending my example will illustrate why.
Suppose you have two different types of employee numbers. Maybe an OldEmployeeNumber
that looks like the one I defined above, and a NewEmployeeNumber
that has different fields. Because you want to keep both employee number types in the same list, you define a base class, EmployeeNumberBase
from which they both can inherit. The definitions would look like this:
abstract class EmployeeNumber : IComparable
{
// Some common employee number functionality goes here.
public int CompareTo(object obj)
{
throw new NotImplementedException();
}
}
class OldEmployeeNumber : EmployeeNumber, IComparable
{
public string Division { get; private set; }
public int EmpNo { get; private set; }
public OldEmployeeNumber(string d, int no)
{
Division = d;
EmpNo = no;
}
int IComparable.CompareTo(object obj)
{
int rslt = 0;
if (obj is OldEmployeeNumber)
{
var o2 = obj as OldEmployeeNumber;
rslt = Division.CompareTo(o2.Division);
if (rslt == 0)
rslt = EmpNo.CompareTo(o2.EmpNo);
}
else if (obj is NewEmployeeNumber)
{
// OldEmployeeNumber sorts before NewEmployeeNumber
rslt = -1;
}
return rslt;
}
}
class NewEmployeeNumber : EmployeeNumber, IComparable
{
public string Country { get; private set; }
public decimal EmpNo { get; private set; }
public NewEmployeeNumber(string c, decimal no)
{
Country = c;
EmpNo = no;
}
int IComparable.CompareTo(object obj)
{
int rslt = 0;
if (obj is NewEmployeeNumber)
{
var o2 = obj as NewEmployeeNumber;
rslt = Country.CompareTo(o2.Country);
if (rslt == 0)
rslt = EmpNo.CompareTo(o2.EmpNo);
}
else if (obj is OldEmployeeNumber)
{
// NewEmployeeNumber sorts after OldEmployeeNumber
rslt = 1;
}
return rslt;
}
}
Yeah, I know. That’s quite a mouthful.
The EmployeeNumberBase
class implements the IComparable
interface, but its implementation just throws NotImplementedException
. Furthermore, the class is marked as abstract
to prevent it from being instantiated. Only derived classes can be instantiated.
The derived classes each explicitly implement the IComparable
interface. The company-defined sorting rules are that old employee numbers always sort in the list before new employee numbers. Within the same type, the numbers are sorted using their own rules. [Note here that my CompareTo
implementations aren’t terribly robust. They’ll return zero (equal) if the object passed is not of a known type, and they’ll fail if the passed object is null. But those details aren’t terribly relevant to the example.]
Now, the Employees
list is created in exactly the same way:
SortedList<EmployeeNumber, Employee> Employees =
new SortedList<EmployeeNumber, Employee>();
We can then add items to the list:
Employees.Add(new NewEmployeeNumber("USA", 2.002m),
new Employee("Jim"));
Employees.Add(new OldEmployeeNumber("Accounting", 1),
new Employee("Sue"));
Employees.Add(new OldEmployeeNumber("HR", 3),
new Employee("Dana"));
If you make those changes and run the program, you’ll see that it does indeed run, and work as expected, and I didn’t change the comparison function that the constructor sees.
If SortedList
had attempted to protect me from myself–that is, call the default comparison function and throw an exception because the comparison function had failed–then this final code would not work. By trying to protect me from myself, it would have prevented me from doing what I wanted to do.
Understand, the above is something of a contrived example. I certainly can’t imagine implementing the employee list that way, even if I did have different employee number types. But somebody else might think it’s a perfectly reasonable thing to do. The point is that there could be very good reasons for instantiating a keyed collection with a key type that does not have a valid comparison function. The constructor cannot know if comparisons will fail.
Which brings us back to the original question: how hard should a collection class (or any library object) try to prevent you from instantiating an object that will fail? In my opinion, the constructor should instantiate the object if the immediate parameters look reasonable. My reasoning is that it’s extremely difficult, if not impossible, to know how the caller will be using the class. As you saw above, making broad assumptions about types in a polymorphic environment can be fatal.
This reasoning extends far beyond the question of how a collection class’s constructor should behave. As programmers, we have to strike a balance between paranoia and productivity. We have to decide daily how much trust to put in the code that calls our methods, and how much we can depend on the code we call. Do we write classes that hold the programmer’s hand to help him across the street, or do we provide a “walk” signal and a warning that says, in effect, “If you cross on red, all bets are off”?