FINDSTR: Line is too long

Today I tried to use the Windows FINDSTR command to find occurrences of a particular string in a large (33 gigabyte) text file.  Simple enough, right?

findstr /L "xyzzy" bigfile.xml

FINDSTR immediately started giving me errors:

FINDSTR: Line 408555 is too long.
FINDSTR: Line 432128 is too long.
FINDSTR: Line 801201 is too long.
FINDSTR: Line 927897 is too long.
FINDSTR: Line 939189 is too long.
FINDSTR: Line 939189 is too long.
FINDSTR: Line 939189 is too long.
FINDSTR: Line 1006538 is too long.
FINDSTR: Line 1579088 is too long.

I couldn’t imagine why it would tell me that the lines are too long.  Unfortunately, I’m not able to view the file in-place because after all this time there still isn’t a decent text file viewer that can handle a file that large.  I can get kind of close with less, although there are problems displaying non-US character sets in a Windows console program.

In any case, I wrote a program that scans the big file and copies the lines in question to a separate file.  From that I determined that the shortest of the lines listed above is about 3,500 characters long and the longest is about 25,000 characters.  And here’s the kicker:  running the same FINDSTR command on that extracted file results in no errors.
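
For anyone who wants to repeat the experiment, here is a minimal sketch in Python of that kind of extraction.  It is not the program I actually used.  It assumes that lines end at a newline byte and are numbered from 1, which may or may not match how FINDSTR counts them, and the function name extract_lines is just something made up for the example.  Lengths are reported in bytes, so a line with non-ASCII text can come out longer than its character count.

```python
def extract_lines(src_path, dst_path, wanted):
    """Copy the 1-based line numbers in `wanted` from src_path to dst_path.

    Returns a dict mapping each line number that was found to its length
    in bytes, not counting the trailing CR/LF.
    """
    wanted = set(wanted)
    last = max(wanted, default=0)
    lengths = {}
    # Binary mode: no decoding, and iteration splits on b"\n" only,
    # reading one line at a time instead of loading the whole file.
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        for lineno, line in enumerate(src, start=1):
            if lineno > last:
                break  # nothing left to find; skip the rest of the file
            if lineno in wanted:
                dst.write(line)
                lengths[lineno] = len(line.rstrip(b"\r\n"))
    return lengths
```

Calling extract_lines("bigfile.xml", "longlines.txt", [408555, 432128, 801201]) would write those three lines to longlines.txt (a hypothetical output name) and return their lengths, which is enough to check whether the reported lines really are unusually long.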

I also noticed that FINDSTR told me three different times that line number 939,189 is too long.

Obviously, “line is too long” is a catch-all message for a number of different errors. FINDSTR has some issues.  Some time ago, I said that FINDSTR was marginally useful.  After today, I’d say it’s even less useful than I thought it was then.

GNU grep for Windows, by the way, has no problems with the file.  The only reason I used FINDSTR is that I don’t have GNU grep installed on the server where the file lives.

Oh, and Microsoft still hasn’t fixed that idiotic file caching bug.