Let’s settle on a newline standard

We’re, what, 70 years into the “computer revolution”? By the late ’70s, we’d pretty much settled on one of two character sequences to denote the end of a line in a text file: either a single line-feed (LF) character, or a carriage-return/line-feed pair (CRLF). Well, there was the classic Macintosh, which used a single carriage-return (CR), but that’s essentially gone: the Mac these days uses LF.

The history of line endings is kind of fascinating in a geeky sort of way, but mostly irrelevant now. Suffice it to say that Unix and the systems descended from it used LF as the newline character, while DEC’s minicomputers and microcomputer operating systems (like CP/M and later MS-DOS) used CRLF. If you’re interested in the history, the Wikipedia article gives a good overview.

And so it persists to this day. To the continual annoyance of programmers everywhere. There are Linux tools that don’t handle the CRLF line endings and Windows tools that don’t handle the LF line endings. Everybody points fingers and seemingly nobody wants to admit that it’s a problem that would easily be solved if everybody could get together and decide on a single standard.

Having started out on early microcomputers running CP/M in the early ’80s, I actually used a teletype machine as an I/O device. That machine required a carriage-return to move the print head back to the leftmost position of the line, and then a line-feed to advance the paper one line. Thus, the CRLF line ending. Printers, too, required CRLF to start printing at the beginning of the next line. If you just sent an LF, you’d get something like this:

The quick red fox jumped
                        over the lazy brown dog.

Instead of:

The quick red fox jumped
over the lazy brown dog.

And if you just sent a CR, you’d get the second line printing over the first. (I suppose I could try to figure out how to do that with HTML/CSS in WordPress, or post an image, but I don’t think it’s necessary. I expect you get the idea. The result is a single physical line with overprinted characters.)
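For the curious, the print-head behavior described above can be modeled in a few lines of Python. This is only a toy sketch: the `teletype` function is a made-up name, not anything from a real library, and it assumes that a later character simply replaces an earlier one in the same position, whereas real paper would show both struck on top of each other.

```python
def teletype(stream: str) -> list[str]:
    """Render a character stream the way a simple print head would.

    CR moves the head back to column 0 without advancing the paper.
    LF advances the paper one row without moving the head sideways.
    """
    rows: list[list[str]] = [[]]  # each row is a list of printed characters
    row, col = 0, 0
    for ch in stream:
        if ch == "\r":
            col = 0                      # carriage return: head to left margin
        elif ch == "\n":
            row += 1                     # line feed: paper up one line
            if row == len(rows):
                rows.append([])
        else:
            line = rows[row]
            while len(line) <= col:      # pad with blanks up to the head position
                line.append(" ")
            line[col] = ch               # simplification: the newest character wins
            col += 1
    return ["".join(line).rstrip() for line in rows]


if __name__ == "__main__":
    first = "The quick red fox jumped"
    second = "over the lazy brown dog."
    for label, ending in (("LF only", "\n"), ("CRLF", "\r\n"), ("CR only", "\r")):
        print(f"--- {label} ---")
        print("\n".join(teletype(first + ending + second)))
```

Running it prints the staircase for LF only, the two clean lines for CRLF, and a single overprinted line for CR only.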

I’d sure like to see the industry settle on a single standard. I don’t have a strong preference. LF-only seems like the more reasonable standard, simply because it’s one less character. It’s not like many people talk directly to printers or teletypes anymore: that’s done through device drivers. At this point, the CR in the CRLF line ending is nothing more than a historical remnant of a bygone era. There is no particular need for it in text files.

Microsoft could lead this change pretty easily:

  1. Fund a development effort to modify all of the standard Windows command line tools to correctly handle both types of line endings on input, and provide an option for each command to specify the type of newline to use on output, with LF as the default.
  2. Modify their compiler runtime libraries to intelligently interpret text files with either type of newline, and to output LF-only newlines by default, with CRLF available as an option.
  3. Fund a “newline evangelism” group that advocates for the change, writes articles, gives talks, and provides guidance and assistance to developers who are making the switch.
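
To give a sense of how small the technical part of items 1 and 2 is, here is a minimal sketch in Python of "accept either newline on input, write LF by default, CRLF on request." The name `convert_newlines` is hypothetical, chosen for illustration; it is not a Windows or Microsoft API. I’ve assumed that a lone CR (the classic Mac convention) should be accepted on input as well.

```python
import re

# Match CRLF first so the pair is treated as one newline, then lone CR or LF.
_ANY_NEWLINE = re.compile(r"\r\n|\r|\n")


def convert_newlines(text: str, newline: str = "\n") -> str:
    """Accept any common line ending on input; emit one consistent ending.

    `newline` is the output convention: LF by default, CRLF as an option.
    """
    if newline not in ("\n", "\r\n"):
        raise ValueError("newline must be '\\n' or '\\r\\n'")
    return _ANY_NEWLINE.sub(lambda _match: newline, text)


if __name__ == "__main__":
    mixed = "first\r\nsecond\nthird\r"
    print(repr(convert_newlines(mixed)))            # LF output (the default)
    print(repr(convert_newlines(mixed, "\r\n")))    # CRLF output on request
```

Python’s own runtime already works roughly this way: in text mode, the built-in `open()` translates CRLF, CR, and LF to `"\n"` on reading by default, and its `newline` argument controls what gets written.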

It’d cost them a few dollars over a relatively short period of time, but it would save billions of dollars in lost time and programmer frustration.