Inheritance and IEquatable do not mix

Do not, under any circumstances, implement the IEquatable<T> interface on a non-sealed class.

I’m serious. If you do it, you will be sorry. You cannot make it work.

Think I’m crazy? People who know a lot more about this stuff than I do, agree.

Let me explain what they apparently knew some time back, and I just recently discovered. Here’s the short version:

A class that implements IEquatable<T> is saying, “I know how to compare two instances of type T or any type derived from T for equality.” After all, if type B derives from A, then B is-a A. That’s what inheritance means! The problem with implementing IEquatable<T> on a non-sealed type is that a base class cannot know how to compare instances of derived types. Trying to implement IEquatable<T> on a non-sealed class ends up producing inconsistent results when comparing instances of derived types.

Now the long version.

I assume here that you understand the difference between reference equality and value equality, know how to properly override the Object.Equals(Object) method on a class, and you’re at least passingly familiar with IEquatable<T>.

To review, you implement the IEquatable<T> generic interface on your class to provide a type-safe method for determining equality of instances. The type-safe Equals(T)method is called by certain methods of the generic collection classes, and is in general slightly faster (according to the documentation) than an overridden Object.Equals method. The documentation recommends that you implement the interface on any class that you expect will be stored in an array or a generic collection, “so that the object can be easily identified and manipulated.” It also recommends that if you implement IEquatable<T>, you should override the base class implementations of Equals(Object) and GetHashCode.

To implement IEquatable<T>, a class merely has to define a public Equals method that accepts a single parameter of type T, and returns a Boolean value indicating whether the object passed is considered equal. IEquatable<T>.Equals is subject to all of the same rules as Object.Equals(Object). It’s really just a type-safe version of that method, and in fact the documentation recommends that if you implement IEquatable<T>, then you should implement the overridden Object.Equals(object) method in terms of IEquatable<T>.Equals.

In the code below, I’ve created a base class, Foo, and a derived class, Fooby. Both override the Equals method as recommended, and both conditionally implement IEquatable<T>. The conditional implementation is there so that I can illustrate why implementing IEquatable<T> on non-sealed classes is a bad idea.

//#define UseIEquatableFoo
//#define UseIEquatableFooby

using System;

namespace EqualsFoo
{
    public class Foo
#if UseIEquatableFoo
        : IEquatable<Foo>
#endif
    {
        public readonly int Bar;

        public Foo(int bar)
        {
            Bar = bar;
        }
#if UseIEquatableFoo
        // IEquatable<Foo>.Equals
        public
#else
        private 
#endif
        bool Equals(Foo other)
        {
            // If the other is null, it can't be equal
            if (ReferenceEquals(other, null)) return false;

            // If they're the same object, then they're equal
            if (ReferenceEquals(this, other)) return true;

            // If they're not the same type, they can't be equal
            if (this.GetType() != other.GetType()) return false;

            // and do the comparison
            return Bar == other.Bar;
        }

        public override bool Equals(object obj)
        {
            return Equals(obj as Foo);
        }

        public override int GetHashCode()
        {
            return Bar;
        }
    }

    public class Fooby: Foo
#if UseIEquatableFooby
        , IEquatable<Fooby>
#endif
    {
        public readonly string Barby;

        public Fooby(int bar, string by)
            : base(bar)
        {
            Barby = by;
        }

#if UseIEquatableFooby
        // IEquatable<Fooby>.Equals
        public
#else
        private
#endif
        bool Equals(Fooby other)
        {
            // If the other is null, it can't be equal
            if (ReferenceEquals(other, null)) return false;

            // If they're the same object, then they're equal
            if (ReferenceEquals(this, other)) return true;

            // If they're not the same type, they can't be equal
            if (this.GetType() != other.GetType()) return false;

            // and do the comparison
            return base.Equals(other) && (Barby == other.Barby);
        }

        public override bool Equals(object obj)
        {
            return Equals(obj as Fooby);
        }

        public override int GetHashCode()
        {
            return base.GetHashCode() ^ Barby.GetHashCode();
        }
    }
}

The conditional code for the Equals(Foo) method looks a little weird, I’ll admit. I wrote it this way to eliminate some code duplication. If UseIEquatableFoo is defined, then Equals(Foo) is public, as required for interface implementations. Otherwise it’s private so that all equality checks go through the overridden Equals(Object) method. The Fooby.Equals(Fooby) method is similarly decorated.

Leave those two symbols undefined for now.

The rules for Object.Equals(Object) say that x.Equals(y) == y.Equals(x). Similarly, the rules for Object.Equals(Object, Object) say that Object.Equals(x, y) == Object.Equals(y, x). I can’t point you to anything that specifically says Object.Equals(x, y) == x.Equals(y), but it seems like a reasonable thing to expect.

Given the above, the following code should output false four times.

var fb1 = new Fooby(0, "hello");
var fb2 = new Fooby(0, "goodbye");

Console.WriteLine(fb1.Equals(fb2));
Console.WriteLine(fb2.Equals(fb1));
Console.WriteLine(Object.Equals(fb1, fb2));
Console.WriteLine(Object.Equals(fb2, fb1));

If you run that code, you’ll see that it does indeed work as expected.

Now implement IEquatable<Foo>. Uncomment the conditional symbol UseIEquatableFoo, and run the program again. The output might surprise you:

True
True
False
False

To borrow a line from one of my favorite movies: “What the schnitzel?”

The problem is overload resolution. When the compiler sees the expression, fb1.Equals(fb2), it has to decide what method call to generate code for. It has the choice of four methods:

  • Object.Equals(Object)
  • Foo.Equals(Object)
  • Fooby.Equals(Object)
  • Foo.Equals(Foo) (Implementation of IEnumerable<Foo>.Equals)

The overload resolution rules pick the “best” function to call based on the types of the arguments. The compiler decided that the best overload is the last one, Foo.Equals(Foo). After all, fb1 is-a Foo, and so is fb2. (If you’re interested in the exact rules used to make this determination, see the specification linked above.)

The two instances of Fooby are compared as though they are of type Foo. The Barby fields aren’t ever compared, so the result of the comparison is true. This doesn’t happen when we call Object.Equals(fb1, fb2), because Object.Equals(Object, Object) doesn’t know the static types. All it can do is call the virtual Object.Equals(Object) method on fb1. That call is resolved at runtime.

If fb1.Equals(fb2) and Object.Equals(fb1, fb2)don’t agree, you have a problem. Clients using your class will rightfully suspect your competence and the wisdom of continuing to trust anything else you’ve written.

If you’re not yet convinced that implementing IEquatable<T> on a non-sealed type is a bad idea, then your only solution to this problem is to make class Fooby implement IEquatable<Fooby>. That will give the compiler a better overload to bind. Just uncomment the UseIEquatableFooby conditional, and re-run the test. Once again, all is right with the world.

Or so you think. Let’s move the equality test to a separate method:

static void TestEquality(Foo f1, Foo f2)
{
    Console.WriteLine(f1.Equals(f2));
    Console.WriteLine(f2.Equals(f1));
    Console.WriteLine(Object.Equals(f1, f2));
    Console.WriteLine(Object.Equals(f2, f1));
}

And add a call to that method after the original tests:

var fb1 = new Fooby(0, "hello");
var fb2 = new Fooby(0, "goodbye");

Console.WriteLine(fb1.Equals(fb2));
Console.WriteLine(fb2.Equals(fb1));
Console.WriteLine(Object.Equals(fb1, fb2));
Console.WriteLine(Object.Equals(fb2, fb1));

TestEquality(fb1, fb2);

Now what happens when you run the test?

False
False
False
False
True
True
False
False

The problem, once again, is overload resolution. But there’s nothing you can do about it this time.

Implementing IEquatable<T> makes the definition of equality dependent on the compile time static type. In this particular case, the compiler sees that the f1 and f2 parameters in the TestEquality method have a static type Foo, so it generates code to call Foo.Equals(Foo), even though their runtime types are Fooby.

The problem occurs in collections, too, although it’s more difficult to show. To illustrate the problem, I have to generate two Fooby instances that produce the same hash code. The code below generates 8-character hexadecimal strings until it finds two that produce the same hash code. It then creates two Fooby instances and tries to add them to a HashSet<Foo>.

// Find two strings that have identical hash codes
string clash1 = null;
string clash2 = null;

var d = new Dictionary<int, string>();
for (var i = 0; i < int.MaxValue; ++i)
{
    string s = i.ToString("X8");
    int hash = s.GetHashCode();
    string val;
    if (d.TryGetValue(hash, out val))
    {
        Console.WriteLine("string '{0}' clashes with '{1}'", s, val);
        clash1 = s;
        clash2 = val;
        break;
    }
    d.Add(hash, s);
}

// Add the two items to a HashSet<Foo>
var foos = new HashSet<Foo>();
foos.Add(new Fooby(0, clash1));
foos.Add(new Fooby(0, clash2));
Console.WriteLine("foos.Count={0}", foos.Count);

On my system, it tells me, "string '000B020A' clashes with '00070101'".

If I run the program without implementing IEquatable<T> on the classes, foos.Count returns 2, which is correct. If I enable implementation of IEquatable<T>, foos.Count returns 1. It thinks that Fooby(0, "000B020A") is equal to Fooby(0, "00070101"), which obviously is not true. You can take issue with my GetHashCode implementation if you like, but the nature of hash codes is that you’re mapping an essentially infinite number of objects (how many possible strings are there?) onto a code that can hold a finite (2^32) number of unique values. No matter how good your hash code generation is, there will be duplicates!

There is no way to make this work correctly. If you implement IEquatable<T> on a non-sealed class, equality comparisons of derived types will produce inconsistent results. Don’t do it!

Most likely, the behavior you’re looking for is properly implemented by deriving a class from EqualityComparer<T>. I’ll give you a closer look at that next time.

1 comment to Inheritance and IEquatable do not mix

Categories

A sample text widget

Etiam pulvinar consectetur dolor sed malesuada. Ut convallis euismod dolor nec pretium. Nunc ut tristique massa.

Nam sodales mi vitae dolor ullamcorper et vulputate enim accumsan. Morbi orci magna, tincidunt vitae molestie nec, molestie at mi. Nulla nulla lorem, suscipit in posuere in, interdum non magna.