Making equality comparisons work

One of the benefits of inheritance is that you can treat a derived object as though it is an object of the inherited class: a Fooby is-a Foo. But as you saw in my previous post, treating a Fooby as a Foo when you’re comparing two objects for equality can result in inconsistencies.

This is a problem because there are times when you really do want a collection of objects that are derived from class Foo, but limit that collection so that no two objects, regardless of their type, have identical Bar values. For example, you might want the code below to keep only the first and third items.

var foos = new HashSet<Foo>();
foos.Add(new Foo(0));
foos.Add(new Fooby(0, "Hello"));
foos.Add(new Fooby(1, "Jim"));
foos.Add(new Foo(1));
Console.WriteLine(foos.Count);

If you run that code, you’ll see that it outputs ‘4’, meaning that all four items were added to the collection.

Many programmers will attempt to solve this problem by implementing IEquatable<T> on the classes in question. My previous article showed why that approach can’t work.

There’s simply no good way to supply default behavior that works well in all circumstances. The solution is to stop trying to make the classes understand that you’re changing the meaning of equality. If you want to compare all Foo-derived instances as though they’re actually Foo instances, then supply a new equality comparer that does exactly what you want.

The easiest way to create a new equality comparer is to derive from the generic EqualityComparer<T> class, like this:

public class FooEqualityComparer : EqualityComparer<Foo>
{
    public override bool Equals(Foo x, Foo y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (ReferenceEquals(x, null) || ReferenceEquals(y, null)) return false;
        return x.Bar == y.Bar;
    }

    public override int GetHashCode(Foo obj)
    {
        if (ReferenceEquals(obj, null)) return -1;
        return obj.Bar.GetHashCode();
    }
}

EqualityComparer<T> is an abstract base class that implements the IEqualityComparer<T> generic interface. You could create your own class that implements IEqualityComparer<T> directly, but the documentation recommends deriving a class from EqualityComparer<T>.

By now you should be familiar with the two methods that IEqualityComparer<T> requires: Equals and GetHashCode. Implementations of these two methods are subject to all of the rules I mentioned in my previous article, including those rules about handling null values.
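To make those rules concrete, here is a small sketch that exercises the comparer with null arguments and checks that two items it considers equal also produce the same hash code. The Foo and Fooby classes below are hypothetical stand-ins that I reconstructed from the way the examples construct them (an int Bar, plus a string that I have called Name). The real classes are the ones from the previous post and may differ.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins for the classes from the previous post.
public class Foo
{
    public int Bar { get; }
    public Foo(int bar) { Bar = bar; }
}

public class Fooby : Foo
{
    public string Name { get; }   // assumed property name
    public Fooby(int bar, string name) : base(bar) { Name = name; }
}

// Same comparer as shown in the post.
public class FooEqualityComparer : EqualityComparer<Foo>
{
    public override bool Equals(Foo x, Foo y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (ReferenceEquals(x, null) || ReferenceEquals(y, null)) return false;
        return x.Bar == y.Bar;
    }

    public override int GetHashCode(Foo obj)
    {
        if (ReferenceEquals(obj, null)) return -1;
        return obj.Bar.GetHashCode();
    }
}

public static class Program
{
    public static void Main()
    {
        var comparer = new FooEqualityComparer();
        var foo = new Foo(0);
        var fooby = new Fooby(0, "Hello");

        // Two nulls are equal; a null never equals a non-null item.
        Console.WriteLine(comparer.Equals(null, null));   // True
        Console.WriteLine(comparer.Equals(foo, null));    // False
        Console.WriteLine(comparer.Equals(null, foo));    // False

        // Items that compare equal must report the same hash code.
        Console.WriteLine(comparer.Equals(foo, fooby));   // True
        Console.WriteLine(comparer.GetHashCode(foo) == comparer.GetHashCode(fooby)); // True

        // The comparer returns its sentinel value instead of throwing on null.
        Console.WriteLine(comparer.GetHashCode(null));    // -1
    }
}
```

If your project has nullable reference types enabled, the compiler will probably warn about the null arguments and the override signatures, but the behavior is the same.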

We can then change the HashSet example from the start of this post so that the collection uses an instance of this class to do its equality comparisons, rather than depending on whatever possibly incorrect default behavior exists in the classes themselves.

var foos = new HashSet<Foo>(new FooEqualityComparer());
foos.Add(new Foo(0));
foos.Add(new Fooby(0, "Hello"));
foos.Add(new Fooby(1, "Jim"));
foos.Add(new Foo(1));
Console.WriteLine(foos.Count);

Run this version and it outputs ‘2’: only the first and third items were added. Now, when the collection wants to compute a hash code, it always calls the GetHashCode method in the supplied equality comparer. And when it needs to compare two items, it always calls the comparer’s Equals method. The collection is no longer at the mercy of objects’ individual GetHashCode and Equals methods.
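If you want to see which items the collection turns away, here is a self-contained sketch that prints the return value of each Add call. As before, Foo and Fooby are hypothetical stand-ins reconstructed from how the examples use them, not the actual classes from the previous post.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins for the classes from the previous post.
public class Foo
{
    public int Bar { get; }
    public Foo(int bar) { Bar = bar; }
}

public class Fooby : Foo
{
    public string Name { get; }   // assumed property name
    public Fooby(int bar, string name) : base(bar) { Name = name; }
}

// Same comparer as shown in the post.
public class FooEqualityComparer : EqualityComparer<Foo>
{
    public override bool Equals(Foo x, Foo y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (ReferenceEquals(x, null) || ReferenceEquals(y, null)) return false;
        return x.Bar == y.Bar;
    }

    public override int GetHashCode(Foo obj)
    {
        if (ReferenceEquals(obj, null)) return -1;
        return obj.Bar.GetHashCode();
    }
}

public static class Program
{
    public static void Main()
    {
        var foos = new HashSet<Foo>(new FooEqualityComparer());

        // HashSet<T>.Add returns false when an equal item is already present.
        Console.WriteLine(foos.Add(new Foo(0)));            // True
        Console.WriteLine(foos.Add(new Fooby(0, "Hello"))); // False: Bar 0 is already there
        Console.WriteLine(foos.Add(new Fooby(1, "Jim")));   // True
        Console.WriteLine(foos.Add(new Foo(1)));            // False: Bar 1 is already there
        Console.WriteLine(foos.Count);                      // 2
    }
}
```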

You don’t need to limit this to collections, by the way. You can create an equality comparer and call its Equals method directly:

var myEquals = new FooEqualityComparer();
var f1 = new Foo(0);
var fb1 = new Fooby(0, "Hello");
var areEqual = myEquals.Equals(f1, fb1);

By supplying your own equality comparer, you control exactly how hash codes are computed and how items are compared. You are no longer hampered by the default meaning of equality as implemented by the individual types.
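The same comparer works anywhere the framework accepts an IEqualityComparer<T>. As a rough sketch, here it is passed to LINQ's Distinct and to a Dictionary<TKey, TValue> constructor. Foo and Fooby are again hypothetical stand-ins for the classes from the previous post, and the dictionary value "zero" is just illustrative data.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the classes from the previous post.
public class Foo
{
    public int Bar { get; }
    public Foo(int bar) { Bar = bar; }
}

public class Fooby : Foo
{
    public string Name { get; }   // assumed property name
    public Fooby(int bar, string name) : base(bar) { Name = name; }
}

// Same comparer as shown in the post.
public class FooEqualityComparer : EqualityComparer<Foo>
{
    public override bool Equals(Foo x, Foo y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (ReferenceEquals(x, null) || ReferenceEquals(y, null)) return false;
        return x.Bar == y.Bar;
    }

    public override int GetHashCode(Foo obj)
    {
        if (ReferenceEquals(obj, null)) return -1;
        return obj.Bar.GetHashCode();
    }
}

public static class Program
{
    public static void Main()
    {
        var comparer = new FooEqualityComparer();
        var items = new Foo[]
        {
            new Foo(0), new Fooby(0, "Hello"), new Fooby(1, "Jim"), new Foo(1)
        };

        // Distinct uses the comparer to decide which items are duplicates.
        var distinct = items.Distinct(comparer).ToList();
        Console.WriteLine(distinct.Count);   // 2

        // A dictionary keyed on Foo treats any item with the same Bar as the same key.
        var names = new Dictionary<Foo, string>(comparer);
        names[new Foo(0)] = "zero";
        Console.WriteLine(names.ContainsKey(new Fooby(0, "Hello")));   // True
        Console.WriteLine(names.ContainsKey(new Foo(1)));              // False
    }
}
```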

It’s interesting to note that much of what I’ve said about equality comparisons in this and the previous post applies equally to comparisons in general (is object a less than, equal to, or greater than object b). Perhaps I’ll talk about that soon.

Jim's Random Notes
