Why you shouldn’t use the .NET sort

My friend and business partner David Stafford recently posted a blog entry, .Net’s Sort Is Not Secure. Don’t Use It. Here’s a Better One, in which he shows that the .NET sort implementation (used by Array.Sort and List.Sort, and possibly others) can easily be made to exhibit pathological behavior.

How bad is it? You can construct an array of one million items that will take the .NET sort implementation more than 80 minutes to sort. The average case is something like half a second.

David’s contention is that this is a security vulnerability. Others might disagree with him, but that’s their choice. David is right. It’s not a great stretch to imagine a Web site that, as part of its work, accepts a data file from a user and sorts that data. A malicious user, knowing that the server is using .NET, could construct a data file that causes the sort to exhibit this pathological behavior, causing the site to become unresponsive. This is nothing short of a denial of service attack, made possible by the poor sorting implementation. As David shows in his post, it’s not terribly difficult to construct a worst-case array.

That fits very comfortably within the definition of a security vulnerability.

David makes two other assertions: that the sort is inflexible, and that the sort is slower than it should be, even in the absence of a malicious adversary.

The sort is somewhat flexible in that it lets you supply a comparison delegate. It does not, however, let you supply a swap delegate. That’s okay in many cases. However, if you’re sorting large structures (value types), or if you want to do an indirect sort (often referred to as a tag sort), a swap delegate is a very useful thing to have. The LINQ to Objects sorting algorithm, for example uses a tag sort internally. You can verify that by examining the source, which is available in the .NET Reference Source. Letting you pass a swap delegate would make the thing much more flexible.

David’s tests show that the .NET sort implementation is slower than it could be. In my opinion, it’s slower than it should be. David’s implementation is faster than the .NET sort in the general case, and doesn’t exhibit pathological behavior in the worst case. The worst case is so terrible, in fact, and so easy to provoke, that the .NET sort should be rewritten.

And yet, the .NET team has refused to address this issue. At best, that’s irresponsible. One can only hope that enough users log in and vote to have the issue addressed, forcing the .NET team to reconsider their decision.

David also noted that LINQ sorting is also vulnerable. What he didn’t point out is that LINQ to Objects uses a completely different algorithm than does Array.Sort. The LINQ to Objects sort is a standard naive Quicksort implementation. As you can see from his timings, the LINQ sort is 50% slower than the already tortoise-like Array.Sort in the face of an adversary.

Understand, the .NET sort will be faster than David’s Introsort in the general case if you’re sorting a primitive type (intdouble, etc.) and you’re not supplying a comparison delegate. The .NET sort is faster in that case because it has special-case code to sort primitive types. If David took the time to make special-case additions to his sorting algorithm, it would outperform the .NET sort in those cases, as well.

Of course, even the special-case sorts in the .NET runtime are vulnerable in the face of an array constructed to provoke the worst case.

So take David’s advice: don’t rely on the .NET sort. Download his code and use it.

I’m considering putting together something similar to replace the LINQ to Objects sort. The general idea is to create a class called SafeOrderedEnumerable that implements IOrderedEnumerable, and uses David’s Introsort in the CreateOrderedEnumerable method. To invoke it, I’ll create extension methods SafeOrderBy and SafeOrderByDescending so that you can write, for example:

var sorted = myList.SafeOrderBy(x => x);

That should put LINQ to Objects sorting in the same ballpark as sorting an array. Not the same, of course, but close, and it will avoid the potential pathological cases.