Good riddance to rubbish writing

In “Gen Z never learned cursive. The effects of this are more widespread than you think” (which references an October 2022 Atlantic article, “Gen Z Never Learned To Read Cursive”), the author describes the potential effects of discontinuing cursive writing instruction. In truth, the first article just expands a bit on one or two of the points made in the Atlantic piece.

A note before I continue. Cursive is actually any form of connected writing. What most Americans refer to as “cursive” is the Palmer Method script that was introduced in the early 20th century, that most of us learned to fear and loathe, and that many of us (myself included) can’t write legibly today. In the text below, I use “cursive” to mean the Palmer Method and its variants.

The Palmer Method, by the way, is a refinement of the Spencerian Method of writing that was introduced in the 1840s. Both teach a method of writing that’s optimized for 17th century pens. A primary consideration is keeping the pen on the paper, because every time the pen was lifted and then placed back onto the paper, the ink tended to blot. These methods go to sometimes absurd lengths to keep the pen on the paper, even when doing so is less than optimal. The introduction of the inexpensive mass-produced ball point pen eliminated the ink blotting problem, and with it the necessity to always keep the pen on the paper. But the script lives on.

The big complaint about no longer teaching cursive is that “the past is presented to us indirectly.” That is, because historical documents are written in script that students aren’t taught to read and write, they have to depend on transcription if they want to read the documents. There’s also some complaint that kids won’t be able to read letters from grandma.

And it’s true: without instruction and practice, cursive writing is difficult or impossible to read. Fortunately, it takes about a week to learn how to read it. Maybe a bit of practice after that, but if a child knows how to read block printing, learning to read the cursive script that’s been the mainstay of elementary writing instruction for a century is very easy.

In addition, the cursive script that’s been taught since the early 1900s isn’t even the same script that was used in our founding documents. The Declaration of Independence, perhaps the historical document most often cited in this argument, was written in Round hand, a script that originated in England in the 17th century. Learning to read modern cursive certainly helps in reading that document, but it still requires a bit more puzzling out.

Point is, the important historical documents have all been transcribed into modern type. There’s little or nothing to be gained for most people in reading the originals. Then there’s the incredibly weak argument that scholars will have to be taught cursive, just as they’re taught Elizabethan, Medieval, or ancient cuneiform scripts. My response is, “duh.” Digging deeply into the past requires specialized skills. The ability to read cursive today is just a little bit more important than knowing how to hitch a horse to a buggy.

The “past is presented indirectly” argument for learning cursive writing just doesn’t fly. It’s about as relevant to day-to-day life as the idiotic argument that those who can’t drive a car with a manual transmission are somehow not “real” drivers. It ranks right up there with the “when I was a boy, we had to trudge through three feet of snow, in the dark, uphill both ways to use the bathroom” stories we laugh about.

To be fair, there are benefits to learning cursive. First, it’s much faster than block printing, and there is that ability to read letters from grandma. It helps in developing fine motor skills, and studies show that students with dyslexia find that learning cursive helps with the decoding process. However, people with dysgraphia are hindered by being forced to write in cursive.

Although I haven’t had to depend on my ability to write in cursive for at least 40 years, I do agree that many people must be able to write legibly and more quickly than they can with just block printing. But we should be teaching kids a different method. There are other connected writing systems that are easier to learn, faster, and easier to write legibly than the antiquated loopy Palmer Method, which was designed 100 years ago for writing with a quill pen, a technology that was obsolete before the Palmer Method was introduced. (The ball point pen was invented in the 1880s.) For example, Getty-Dubay Italic, introduced in 1976, is much easier to read and write, and doesn’t require the loops and other flourishes that older scripts needed to keep the ink from blotting. There are other, similar writing styles optimized for today’s writing instruments.

Whether those newer methods carry with them the benefits of developing fine motor skills and assistance to dyslexic students is an open question. I think it likely. I also suspect that it would have less of a negative impact on those with dysgraphia because the letter forms are so similar to the printed letter forms. As for letters from grandma, that’s becoming irrelevant, too. At 62, I’m older than a lot of grandmas and I’ll tell you from experience that a lot of them write cursive as well and as often as I do: illegibly and almost never. Letters from grandma that are written in cursive will pretty much cease to exist in my lifetime.

It’s true that trying to write those modern scripts using a 17th century quill pen would be as disastrous as taking a horse and buggy on an Interstate highway. Sometimes you have to discard old things and embrace the unquestionable benefits of new technology. Unless you like having to check for black widow spiders after trudging through the snow to the outhouse.

Shingles is no fun

I am suffering through my first experience with shingles. No, not the roofing kind. Shingles is a disease characterized by a painful skin rash with blisters that occurs in a localized area. It’s caused by the varicella-zoster virus, the same virus that causes chickenpox. If you’ve had chickenpox, the virus is likely lying dormant in your nerve cells, where it can come back to haunt you when you’re older.

It’s … unpleasant. And from what information I’ve been able to gather, what I’m currently experiencing is a rather mild case.

According to the Wikipedia article, about one-third of all people will experience at least one attack of shingles in their lifetime. Shingles is more common among older people. If you live to 85 years old, your chance of having at least one attack is about 50%. The article says that fewer than 5% will have more than one attack.

Risk of death from shingles is very low: between 0.28 and 0.69 deaths per million. Combined with the apparently low likelihood of a second attack, some would say that I’m being overly cautious in contacting my doctor to get the vaccine. My response to them is, “tell me that again after your first experience.”

If you’re over 50, talk to your doctor about getting the shot. The vaccine doesn’t guarantee that you won’t have an attack, but studies show that the likelihood is reduced by something like 90% over three or four years, and if an attack does occur it will be less severe. Worth the repeated trips to the doctor, in my opinion. My “mild case” was quite uncomfortable for a couple of days, and it’s going to itch like crazy (like poison ivy) for the next week or so. I’ve known people who’ve had it much worse. Believe me when I tell you that you do not want to experience it.

Starting over

It’s been several years since I actively maintained this blog. I went through a bad time, mentally, prior to and just after the COVID pandemic during which I inadvertently lost much of what I had posted over the years. The text of the blog entries is in a MySQL database file that I can probably read if I put some effort into it. The pictures are gone.

That said, I’m finally coming out of my mental haze, and trying to get back on a more even keel. Time to re-start the blog.

As in the past, I’ll write about whatever is on my mind. Computer programming and woodworking (carving, mostly) are probably my most frequent areas of interest, but there’s no telling what I might delve into.

Also note that the format here is likely to change in the near future. I just picked a basic theme to get started. Fiddling with formatting is not high on my list of favorite activities, so it might be a while.

Recovering

One casualty of the 2020 debacle has been my blog. Through a series of unfortunate events, including my deteriorating mental state since the beginning of the lockdown in March 2020, I lost access to my blog and I just didn’t want to mess with it. I still have text of the entries in a database, but it’s in an old format and I haven’t yet looked at how I’m going to import the data. I probably have the images on a backup disk somewhere. It’ll just take time to find and upload them.

For now, I’m just trying to get online again. This is definitely an ugly theme, but it’s a start. My hope is to get this going again in January.

A different way to merge multiple lists

In my previous post, I said that this time I would show full-featured MergeBy and MergeByDescending methods. Before I do that, I want to show an alternate method for merging multiple sorted lists.

If you recall, I said that the generalized k-way merge algorithm is:

    Initialize list indexes
    while any list has items
        Determine which list has the smallest current item
        Output the item from that list
        Increment that list's current index

As I illustrated last time, that does indeed work. But there is another way to do it that is, asymptotically at least, just as fast. That is, it operates in O(n log2 k) time.

Imagine you have four sorted lists, named A, B, C, and D. Another way to merge them is this way:

    Merge A and B to create a list called AB.
    Merge C and D to create a list called CD.
    Merge AB and CD to create the final list.

To estimate the time complexity, let’s assume that each of the four lists contains the same number of items. So if the total number of items is n, then each list’s length is n/4.

Merging A and B, then, takes time proportional to n/4 + n/4, or n/2. Merging C and D also requires n/2 time. The final merge of AB and CD takes n/2 + n/2. So the total amount of time it takes to merge the four lists is 2n.

Time complexity of the k-way merge is, as I said, O(n log2 k). With four lists, that’s n * log2(4), and log2(4) = 2, so the time to merge four lists with a total of n items is n * 2, or 2n.

If there is an odd number of lists, then one of the original lists gets merged with one of the larger, already merged, lists. For example, with five lists the task becomes:

    Merge A and B to create a list called AB.
    Merge C and D to create a list called CD.
    Merge AB and E to create a list called ABE.
    Merge ABE and CD to create the final list.

To do that with an arbitrary number of lists, we create a queue that initially contains the original lists. We then remove the first two lists from the queue, merge them, and add the result back to the queue. Then take the next two, merge them, and add the result to the queue. We continue doing that until there is only one list left on the queue. That is:

    Initialize queue with all of the lists.
    while queue has more than one item
        l1 = queue.Dequeue()
        l2 = queue.Dequeue()
        rslt = Merge(l1, l2)
        queue.Enqueue(rslt)
    merged = queue.Dequeue()

At first, it seems like that method will require a lot of extra space to hold the temporary lists. Remember, the merge algorithms I’ve shown so far don’t require much in the way of extra storage: just a little memory to hold the smallest value in each of the lists. That is, O(k) extra space. Perhaps not surprisingly, you can do the same thing with LINQ: because of deferred execution, the intermediate lists are never actually materialized. Here’s the code.

    public static IEnumerable<T> MergeVersionTwo<T>(IEnumerable<IEnumerable<T>> lists)
    {
        // Populate a queue with all the lists.
        var theQueue = new Queue<IEnumerable<T>>(lists);
        if (theQueue.Count == 0) yield break;

        // Do pair-wise merge repeatedly until there is only one list left. 
        while (theQueue.Count > 1)
        {
            var l1 = theQueue.Dequeue();
            var l2 = theQueue.Dequeue();
            var rslt = Merge(l1, l2); // uses the two-way merge
            theQueue.Enqueue(rslt);
        }
        var merged = theQueue.Dequeue();
        foreach (var x in merged)
        {
            yield return x;
        }
    }

That code uses the two-way merge that I presented a while back.

When building the queue, we’re just setting things up. Everything is an IEnumerable<T>, meaning that no actual work takes place. LINQ’s deferred execution postpones the merging until we start to enumerate the result. And when we do enumerate the result, only one item is produced at a time. All of the intermediate merges proceed in lockstep, each advancing just far enough to produce the next item.

Here’s some code that uses the new MergeVersionTwo method:

    var l1 = new int[] {1, 3, 9, 12, 15};
    var l2 = new int[] {2, 7, 14, 16, 19};
    var l3 = new int[] {6, 8, 13, 17, 20};
    var rslt = MergeVersionTwo(new List<int[]> {l1, l2, l3});

    foreach (var i in rslt)
        Console.WriteLine(i);

If you wrap that in a method and single-step it, you’ll see that no real work is done until you get to the foreach. Even then, all that mucking about with the queue doesn’t enumerate any of the lists until you get to the foreach at the end of the MergeVersionTwo method. If you start single-stepping (press F11 in Visual Studio, or use your debugger’s “step into” command), you’ll see it start working on the two-way merges. Watching how all that works will help you understand just how powerful deferred execution can be.

It’s important to note that asymptotic analysis just tells us how the run time varies with the size of the input. It doesn’t say that both of the k-way merge algorithms will take the same amount of time. n log2 k just says that for a particular algorithm the amount of time varies with the size of the input and the logarithm of the number of lists. We’ll have to do some performance tests to determine which is actually faster.

That testing should prove interesting. Before we go there, though, we need to look more closely at the pairwise merge.

Timers don’t keep track of time

In computer programming, the word Timer is used to describe a hardware or software component that raises an event (“ticks”) at a specified interval. So, for example, if you want a notification once per second, you would instantiate a timer and set its period to one second. The timer API often has you specify the period in milliseconds, so you’d say 1,000 milliseconds.

In a C# program, for example, you can create a timer that ticks every second, like this:


    private System.Threading.Timer _timer;
    private int _tickCount;
    public void Test()
    {
        _timer = new System.Threading.Timer(MyTimerHandler, null, 1000, 1000);
        Console.ReadLine();
    }

    private void MyTimerHandler(object state)
    {
        ++_tickCount;
        Console.WriteLine("{0:N0} tick", _tickCount);
    }

It’s important to note that ticks don’t come at exact 1-second intervals. The timer is reasonably accurate, but it’s not perfect. How imperfect is it? Let’s find out. Below I’ve modified the little program to start a Stopwatch and output the actual elapsed time with each tick.

    private System.Threading.Timer _timer;
    private Stopwatch _stopwatch;
    private int _tickCount;
    public void Test()
    {
        _stopwatch = Stopwatch.StartNew();
        _timer = new System.Threading.Timer(MyTimerHandler, null, 1000, 1000);
        Console.ReadLine();
    }

    private void MyTimerHandler(object state)
    {
        ++_tickCount;
        Console.WriteLine("{0:N0} - {1:N0}", _tickCount, _stopwatch.ElapsedMilliseconds);
    }

On my system, that program shows the timer to be off by about 200 milliseconds every three minutes. That’s on a system that isn’t doing anything else of note: no video playing, no downloads, no video streaming, no heavy-duty background jobs. 200 milliseconds every three minutes doesn’t sound like much, but that’s one second every fifteen minutes, or 96 seconds every day. My granddad’s wind-up Timex kept better time than that.

Granted, the problem could be with the Stopwatch class. But it isn’t. I created a simple program that starts a Stopwatch and then outputs the elapsed time whenever I press the Enter key. I let that program run for days, and every time I hit Enter, it agreed very closely with what my wristwatch said. There was some variation, of course, because no two clocks ever agree perfectly, but the differences between Stopwatch and my wristwatch were a lot smaller than the differences between Stopwatch and the Timer. I have since run that experiment on multiple computers and with multiple clocks, and every time the Stopwatch has been quite reliable.

Things get worse. On a heavily loaded system with lots of processing going on, timer ticks can be delayed by an indeterminate time. I’ve seen ticks delayed for five seconds or more, and then I get a flood of ticks. I’ll see, for example, a tick at 5,000 milliseconds, then one at 9,000 milliseconds followed very quickly by ticks at 9,010, 9,015, and 9,030 milliseconds. What happened is that the system was busy and couldn’t schedule the timer thread at 6,000 milliseconds, so the system buffered the ticks. When it finally got around to dispatching, three other ticks had come in and it fired all four of them as quickly as it could.

This can be a huge problem on heavily loaded systems because it’s possible that multiple timer ticks can be processing concurrently.

The only thing a timer is good for is to give you notification on a periodic basis, approximately at the frequency you requested. A timer pointedly is not for keeping time.

Given that, you can probably explain what’s wrong with this code, which is a very simplified example of something I saw recently. The original code did some processing and periodically checked the _elapsedSeconds variable to see how much time had elapsed. I simplified it here to illustrate the folly of that approach.

    private Timer _timer;
    private int _elapsedSeconds;
    private ManualResetEvent _event;
    private void Test()
    {
        _timer = new Timer(MyTimerHandler, null, 1000, 1000);
        _event = new ManualResetEvent(false);
        _event.WaitOne();
        Console.WriteLine("One minute!");
        _timer.Dispose();
    }
    
    private void MyTimerHandler(object state)
    {
        ++_elapsedSeconds;
        if (_elapsedSeconds == 60)
        {
            _event.Set();
        }
    }

Here, the programmer is using the timer like the second hand of a clock: each tick represents one second. There are at least two problems in that code. First, we’ve already seen that the timer won’t tick at exactly one-second intervals. Second and more importantly, it’s possible on a busy system for multiple timer events to occur concurrently. Imagine what happens in this case:

  1. _elapsedSeconds is equal to 59.
  2. A tick occurs and Thread A is dispatched to handle it.
  3. Thread A increments the _elapsedSeconds value to 60.
  4. Another tick occurs and Thread B is dispatched to handle it.
  5. At the same time, Thread A is swapped out in favor of some higher priority thread.
  6. Thread B increments the _elapsedSeconds value to 61.

At this point, the _elapsedSeconds value has been incremented beyond 60, so the conditional will never be true and the event will never be set. (Well, it might: 4 billion seconds from now, when _elapsedSeconds rolls over. That’s only about 125 years.)

You could change the conditional to _elapsedSeconds >= 60, but that’s missing the point. We’ve already shown that the timer isn’t going to be particularly accurate. If you were trying to time a whole day, you could be off by a minute and a half.

The problem of concurrent execution of timer handlers is another topic entirely that I probably should cover in a separate post. It’s important you understand that it can happen, though, and you have to keep that in mind.

In programming, timers don’t keep time. Don’t try to make them. They can’t count up reliably, and they can’t count down reliably. If you want to keep track of elapsed time, start a Stopwatch and then check its Elapsed or ElapsedMilliseconds properties when necessary.
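
To make that concrete, here’s a minimal sketch of the Stopwatch approach (the work loop is just a stand-in for whatever your program actually does):

```csharp
using System;
using System.Diagnostics;
using System.Threading;

class StopwatchDemo
{
    static void Main()
    {
        // Start a Stopwatch once, then query it whenever you need the
        // elapsed time. There's no tick counting, so no accumulated drift.
        var stopwatch = Stopwatch.StartNew();

        while (stopwatch.Elapsed < TimeSpan.FromSeconds(60))
        {
            DoSomeWork(); // stand-in for the program's real work
        }
        Console.WriteLine("One minute!");
    }

    static void DoSomeWork()
    {
        Thread.Sleep(250);
    }
}
```

Even if DoSomeWork runs long or the thread gets swapped out for a while, the next check of Elapsed still reflects true wall-clock time.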

On a related note, do not use the clock to measure time.

Programs are not cats

I see questions like this fairly regularly:

I need to write a program that checks a database every five minutes, and processes any new records. What I have right now is this:

    void Main()
    {
        while (true)
        {
            CheckDatabaseForUpdates();
            Thread.Sleep(300000); // 5 minutes, in milliseconds
        }
    }

Is this the best way, or should I be using threads?

I’ve said before that Thread.Sleep is a red flag indicating that something is not quite right. And when a call to Sleep is in a loop, it’s a near certainty that there is a real problem. Most Sleep loops can be re-coded to use a timer, which then frees the main thread to do other things.

In this example, the “other thing” the main thread could be doing is waiting for a prompt from the user to cancel the program. As it stands, the only way to close the program would be to abnormally terminate the process (Control+C, for example, or through Task Manager). That could be a very bad thing, because the program might be in the middle of updating the database when you terminate it. That’s a virtually guaranteed path to data corruption.

I’m not going to show how to refactor this little program to use a timer because doing so is fixing a solution that is fundamentally broken at a higher level. How can this program be broken, you ask?

The program is broken because it’s doing a poor job of implementing a feature that already exists in the operating system. The program is occupying memory doing nothing most of the time. It naps, wakes up periodically, looks around, maybe does some work, and then goes back to sleep. It’s like a cat that gets up to stretch every five minutes and maybe bats a string around for a bit before going back to sleep.

Operating systems already have a way to implement catnap programs. In the Linux world, the cron daemon does it. Windows has Task Scheduler and the schtasks command line utility. These tools let you schedule tasks to execute periodically. Using them has several advantages over rolling your own delay loop.

  • You can change or terminate the schedule without having to recompile the program. Yes, you could make the program read the delay time from a configuration file, but that just makes the fundamentally broken program a little more complicated.
  • Task schedules are remembered when the computer is restarted, freeing you from having to start it manually or figure out how to put it in the startup menu.
  • If a catnap program crashes, it must be restarted manually. A scheduled task, on the other hand, will log the error and run again at its next scheduled time.
  • The program doesn’t consume system resources when it’s not doing useful work. It’s like a cat that’s not even there when it’s sleeping. (Which would make for a cat that you don’t see very often, and only for very brief periods.)
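
As a sketch of what that looks like in practice (the program paths and task name below are made up for illustration), the five-minute database check becomes a one-line schedule:

```shell
# Linux: edit the crontab (crontab -e) and add a line like this,
# which runs the checker every five minutes:
*/5 * * * * /usr/local/bin/check-database

# Windows: the rough equivalent using schtasks:
schtasks /Create /SC MINUTE /MO 5 /TN "CheckDatabase" /TR "C:\Tools\CheckDatabase.exe"
```

The program itself then shrinks to “check the database once and exit,” and the operating system handles the rest.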

Programs aren’t cats. If a program isn’t doing active work or maintaining some in-memory state that’s required in order to respond to requests, then it shouldn’t be running. If all your program does is poll a resource on a regular schedule, then you’ve written a catnap that is better implemented with the OS-provided task scheduler.

Memory leak in ConcurrentQueue

I learned today that there is a memory leak in the .NET 4.0 ConcurrentQueue class. The leak could explain some trouble I was having in December with a program running out of memory. I didn’t know about this bug at the time, but reducing the sizes of several of my queues seems to have made the problem go away.

I know some of you are thinking, “A memory leak in .NET?  I thought the garbage collector was supposed to prevent that!”  The garbage collector collects garbage–memory that’s no longer referenced.  It can’t collect something that’s still being used.  Let me explain.

The ConcurrentQueue class uses an array as its backing store.  When an item is added to the queue, a reference to the item is placed in the array.  As long as an item is in the queue, there is an active reference to it–some part of the program is depending on that item to be good.  In garbage collector parlance, the item is “rooted,” meaning that the garbage collector can’t collect it.

When an item is removed from the queue, that item’s slot in the array is supposed to be cleared–set to null. Since the item was removed from the queue, it makes no sense to keep a reference to the item around.  And as long as the queue keeps the reference, the garbage collector can’t collect the object.

Problem is, the code that removes items from the queue doesn’t clear the item’s slot in the array.  The result is that stale object references remain in the queue’s backing array until one of the following happens:

  1. An item added to the queue overwrites that slot.
  2. There are no more references to the queue (i.e. it goes out of scope).
  3. The array is resized. This happens automatically as the number of items in the queue grows and shrinks.

There might be other ways, but those are all that I can think of at the moment.  The point is that items “live” in the queue longer than they absolutely have to, which is a potentially serious memory leak.

I say “potentially serious” because it’s not like the queue grows without bound. At least, it shouldn’t. If your program allows for unbounded queue growth, then you have a more serious problem than this memory leak.  That said, with the existence of this bug you have to assume that your queue will always have worst-case memory usage.  That is, for the purpose of computing memory consumption, you have to assume that the queue is always full, even when you know that all items have been removed.

This can be a real problem if, for example, you use a queue to hold input items and a separate collection to hold items that have been processed from the queue. Imagine a loop that does this:

    QueuedThing queued = GetNextItemFromQueue();
    ProcessedThing processed = ProcessThing(queued);
    AddProcessedItemToList(processed);

Your expectation is that, within the limits of the garbage collection interval, the amount of memory used is:

(sizeof(QueuedThing) * queue.Count) + (sizeof(ProcessedThing) * list.Count)

But with this bug, you have to design for the worst case memory usage, which is:

(sizeof(QueuedThing) * queue.Capacity) + (sizeof(ProcessedThing) * list.Count)

So if QueuedThing and ProcessedThing are the same size, then your worst case memory usage is double what it should be.

Microsoft says that the fix for this will be in the next version of .NET. Considering that .NET releases are typically every two years or so, that means at least another year before this is fixed.  Until then, the workaround is to be more careful with the way you use ConcurrentQueue (and BlockingCollection, which uses a ConcurrentQueue as its default backing store). In particular, you should be careful to limit the size of your queue and limit the scope in which the queue is active.
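
As a sketch of that workaround (the capacity and names here are illustrative), giving BlockingCollection a bounded capacity caps the size of the backing array, and therefore the worst-case footprint of any stale references:

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

class BoundedQueueDemo
{
    static void Main()
    {
        // boundedCapacity caps the queue at 100 items; Add blocks when
        // the queue is full instead of growing the backing store.
        var queue = new BlockingCollection<int>(boundedCapacity: 100);

        var producer = Task.Factory.StartNew(() =>
        {
            for (int i = 0; i < 1000; i++)
            {
                queue.Add(i);
            }
            queue.CompleteAdding();
        });

        int sum = 0;
        foreach (var item in queue.GetConsumingEnumerable())
        {
            sum += item; // stand-in for real processing
        }
        producer.Wait();
        Console.WriteLine(sum); // 499500, the sum of 0..999
    }
}
```

With the capacity bounded, the worst-case memory held by stale references is 100 items rather than however large the queue happened to grow at its peak.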

Coffee scoop and hillbilly

A few new carvings recently. Debra wanted a new coffee scoop, so I made her two of them. Here’s one. The other looks just like it.

The scoop is carved from basswood, and finished with something called “butcher block conditioner”–a mix of beeswax and food-grade mineral oil. I figured there wasn’t much sense in using an exotic wood for the scoop, as it’s going to get blackened from the coffee. The bowl holds about a tablespoon. The size comparison object is one of the new Presidential dollar coins.

This little hillbilly (about 3½ inches tall) is carved from a maple scrap.  Very hard wood.  I think I’ll stick to softer woods for my caricatures in the future and save the harder woods for things like dolphins and other forms that have fewer fine details.

Comparing ranges is harder than it looks

I’m developing a generic Range class in C# so that I can create comparison ranges.  The idea is that I can replace code like this:

    int MinValue = 22;
    int MaxValue = 42;
    ...
    if (val >= MinValue && val < MaxValue)
    {
    }

with this:

    Range<int> MyRange = new BoundedRange<int>(
        22, RangeBoundType.Inclusive,
        42, RangeBoundType.Exclusive);
    ...
    if (MyRange.Contains(val))
    {
    }

In isolation it doesn’t look like a win, but if you’re doing that range comparison in multiple places, or working with multiple ranges (some of which can have an open-ended upper or lower bound, or bounds that can be inclusive or exclusive), working with a Range type can save a lot of trouble.

Before we go on, let me introduce a little bit of notation.

First, the characters ‘[’ and ‘]’ mean “inclusive,” and the characters ‘(’ and ‘)’ mean “exclusive.”  So, assuming we’re working with integral values, the range [1,30) means that the numbers 1 through 29 are in the range.  Similarly, (1,30] means that the numbers 2 through 30 are in the range.

To specify “unboundedness,” I use the character ‘*’.  So the range [*,255] means all numbers less than or equal to 255.  And, of course, the range [*,*] encompasses the entire range of values for a particular type.  By the way, inclusive and exclusive are irrelevant when unboundedness is specified.  That is, [*,255] and (*,255] are identical ranges.

The Range type also allows me to do things like compare ranges to see if they overlap, if one contains the other, etc. And therein lies a problem. Consider the two byte ranges [0,*] and [*,255].

It should be clear that, since the range of a byte is 0 through 255 (inclusive), the two ranges are identical. The first range has a lower bound of 0, inclusive, and is unbounded on the upper end.  The second range is unbounded on the lower end and has an upper bound of 255, inclusive.  As programmers, we can tell by inspection that the two ranges are identical.

Unfortunately, there’s no way to write generic code that can determine that the ranges are equivalent.  Although the primitive types available in C# all have defined maximum and minimum values (Int32.MaxValue and Int32.MinValue, for example), not all types do.   And since I might want to have ranges of strings or other types that don’t have defined bounds, there’s no general way to tell the difference between a range that has no defined upper bound and a range whose upper bound is the maximum possible value for that type.  If I wanted that ability, the only way would be to pass maximum and minimum possible values for each type to the Range constructor–something that I’m not willing to do and might not even be possible.  What’s the maximum value for a string?

The inability to determine that an unbounded range is equivalent to a bounded range makes for some interesting problems. For example, consider the two byte ranges [1,*] and [0,255].  It’s obvious to programmers that the second range contains the first range. But how would you write a function to determine that?

    public bool Contains(Range<T> other)
    {
        // returns true if the current range contains the other range
    }

Since there’s no way to convert that unbounded value into a number, the code can only assume that the upper bound of the first range is larger than the upper bound of the second range. Therefore, Contains will return false.

Whether that’s an error is a matter of opinion.  Logically, having Contains return false in this case is the correct thing to do because “unbounded” is larger than any number.  But it seems like an error in the context of types like byte that have limited ranges, because there’s really no such thing as an unbounded byte.  Rather, byte has a maximum value of 255 and one would expect the code to know that.
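
To make that behavior concrete, here’s a rough sketch of such a Contains (ignoring inclusive/exclusive bounds for brevity; the class and member names are illustrative, not my actual design). A missing bound is modeled with HasLower/HasUpper flags:

```csharp
using System;

// A sketch only: "unbounded" on an end is represented by the
// corresponding Has* flag being false.
public class SketchRange<T> where T : IComparable<T>
{
    public bool HasLower, HasUpper;
    public T Lower, Upper;

    public bool Contains(SketchRange<T> other)
    {
        // If `other` is unbounded on an end where this range is bounded,
        // we must assume non-containment: without a known Min/MaxValue
        // for T, "unbounded" compares beyond any actual value.
        if (!other.HasLower && HasLower) return false;
        if (!other.HasUpper && HasUpper) return false;
        if (HasLower && other.HasLower && Lower.CompareTo(other.Lower) > 0) return false;
        if (HasUpper && other.HasUpper && Upper.CompareTo(other.Upper) < 0) return false;
        return true;
    }
}
```

With byte ranges, asking whether [0,255] contains [1,*] comes back false for exactly the reason described above, even though a human reading the types can see it should be true.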

As I said, it’d be possible to extend the Range class so that it knows the minimum and maximum values for the primitive types, and give clients the ability to add that information for their own types if they want.  That way, an unbounded upper value for an integer would get  Int32.MaxValue.   The code would then “do the right thing” for the built-in types, and clients could define minimum and maximum values for their own types if they wanted to.

But I wonder if it’s worth going to all that trouble.  Is it more sensible to explain the rules to programmers and expect them to define specific bounds (i.e. not use unbounded) if they think they’ll be working with ranges (as opposed to comparing values against ranges) of primitive types?