I recently needed to delve into the world of JSON (JavaScript Object Notation) to read some data from a particular Web site. For my purposes, the simple JSON reader provided by .NET worked just fine. The way it works is interesting: you call JsonReaderWriterFactory.CreateJsonReader, and it returns an XmlReader instance. That’s right, it converts the JSON to XML behind the scenes. Apparently there are some limitations in how it handles nested structures, but I didn’t encounter them. That’s useful thing #1.
I discovered useful thing #2 when my XmlReader threw an exception trying to parse the JSON I fed it. I originally thought that the problem was with the JSON-to-XML conversion. But then I fed the JSON to JSONLint. It turns out that the string “It\'s an error” contains an error: escaping the apostrophe is not allowed in JSON. Only a handful of characters can legally follow a backslash. It’s nice to know that the site was in error and not my JSON-to-XML converter. Either way, I still have to handle the error gracefully.
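To make the escaping rule concrete, here is a minimal sketch in Python rather than the .NET reader I actually used. The helper name try_parse_json is made up for illustration. It relies only on Python's standard json module, which rejects \' in a string, and it shows one way to report the problem instead of letting the exception escape.

```python
import json

def try_parse_json(text):
    """Return (value, None) on success or (None, error message) on failure."""
    try:
        return json.loads(text), None
    except json.JSONDecodeError as exc:
        # Report the problem instead of letting the exception propagate.
        return None, f"{exc.msg} (char {exc.pos})"

# \' is not a legal JSON escape. The legal ones are
# \" \\ \/ \b \f \n \r \t and \uXXXX.
bad = r'"It\'s an error"'        # raw string: the backslash is kept
good = '"It\'s not an error"'    # a plain apostrophe needs no escaping

print(try_parse_json(bad))       # value is None, with an error message
print(try_parse_json(good))      # the parsed string, with no error
```

The exact wording of the error message depends on the parser, so code that handles the failure should test for the exception rather than match on the text.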
I had hoped to use the Windows command FINDSTR as a substitute for grep. No such luck. FINDSTR has two problems that make it marginally useful at best. First, there’s no switch that corresponds to grep’s --only-matching (-o) option. If you specify --only-matching, then grep outputs only the text that matches the query expression rather than outputting the entire line that contains the match. FINDSTR lacks that option, making it useless for many of the things I do.
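For readers who haven't used the option, the following is a rough sketch of what only-matching output means, written in Python. The function name only_matching and the sample lines are hypothetical, and Python's regular expression syntax is not identical to grep's, so treat this as an illustration of the behavior rather than a replacement for grep.

```python
import re

def only_matching(pattern, lines):
    """Yield each non-empty match of pattern, not the whole line."""
    regex = re.compile(pattern)
    for line in lines:
        for match in regex.finditer(line):
            if match.group(0):          # skip empty matches
                yield match.group(0)

sample = ["see a.xml and b.xml", "nothing here", "c.xml"]
for hit in only_matching(r"\w+\.xml", sample):
    print(hit)
# prints a.xml, b.xml and c.xml, one per line
```

A line with two matches produces two output items, and a line with no match produces nothing.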
The other problem is very odd. Both grep and FINDSTR are line-oriented tools. But FINDSTR’s definition of a line is inconsistent when working with files whose lines end with just a line feed. For example, if I’m looking for all lines that contain the text “.xml”, I’d write this:
FINDSTR /R "\.xml" file.txt
The /R switch tells FINDSTR to treat the search string as a regular expression. I could have done a literal search in this instance, but I want to illustrate the error. FINDSTR correctly finds and outputs all of the lines that contain the string “.xml”.
What I really want, though, is just those lines that end with “.xml”. So the command would be:
FINDSTR /R "\.xml$" file.txt
FINDSTR doesn’t find any lines that end in “.xml” unless I convert the file so that it has CR/LF line endings. grep correctly handles both line-ending conventions. Since I can’t guarantee the format of the files I work with (I’m often working with files that I download with wget), FINDSTR is practically useless if I’m doing regular expression searches.
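The behavior I want is easy to state in code. This Python sketch (the function name lines_matching and the sample data are made up for illustration) splits on either LF or CR/LF before applying the pattern, so an end-of-line anchor works the same way for both conventions. It says nothing about how grep or FINDSTR are implemented.

```python
import re

def lines_matching(pattern, data):
    """Return the lines of data (bytes) that match pattern, for LF or CR/LF files."""
    regex = re.compile(pattern)
    text = data.decode("utf-8", errors="replace")
    # Split on \r\n or \n so that $ sees the true end of each line.
    lines = re.split(r"\r?\n", text)
    if lines and lines[-1] == "":
        lines.pop()                     # drop the empty piece after a final newline
    return [line for line in lines if regex.search(line)]

unix_file = b"a.xml\nb.txt\nc.xml\n"
dos_file = b"a.xml\r\nb.txt\r\nc.xml\r\n"
print(lines_matching(r"\.xml$", unix_file))   # ['a.xml', 'c.xml']
print(lines_matching(r"\.xml$", dos_file))    # ['a.xml', 'c.xml']
```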
My advice: download GNU Grep for Windows.