Power carved tree ornaments

The latest issue (issue #57, Holiday 2011) of Woodcarving Illustrated Magazine has a pattern for a stylized bird that makes a great little tree ornament. I thought it would also be a good way to learn how to use my new Foredom power carver. So I spent some time with the bandsaw, cutting blanks from different types of wood, then took them over to the power carver and started shaping.

From left, the woods are mesquite, Osage orange (also known as bois d’Arc or “bodark”), mahogany, black cherry, and cedar. Click on the picture for a larger image.

I also cut one from mimosa, but already gave away the result. I gave it to the guy who gave me the wood. Fortunately, I have more of that.

The figures are about four and a half inches from beak to tail, and stand a little over two inches tall.

With the exception of the cedar, none of these woods would be considered easy to carve with edged tools. The Osage orange, though, is especially hard. I used up two sanding sleeves carving that one. But it sure is beautiful. And, yes, there’s a big crack in it. Adds character. I’m not going to throw out a perfectly good piece of wood just because there’s a crack or two in it.

I’m not about to give up my knives, but I am enjoying the power carver. Still trying to master the tool, though, and having a little trouble with symmetry. and with finishing. The pictures reveal some pits and scratches that I should have smoothed before I applied the finish. I might have to start taking pictures when I think I’m done. The pictures reveal things that I just don’t see when looking at the carving in my hand.

Debra says she wants one bird of each kind of wood I use. I wonder if she realizes how many different birds she’s going to get. I still have a piece of mulberry that I need to carve up, and there are a few other types in my stash. I’m sure there’s some walnut, at least two different types of oak, and I’ve forgotten what all else.  She might end up with a dozen birds.

 

Designing a better text file viewer

Last time, I discussed existing text file viewers and why I find them inadequate. It’s disappointing that GNU less is the closest thing I’ve found to a good text file viewer for Windows. Functionally, less is almost perfect. For sure it blows away all of the other file viewers I’ve seen. Those other tools have some good ideas, but they don’t handle large files efficiently. In this article, I’ll describe how to improve the file handling, which should make it possible to write a GUI-based file viewer that has the features of less, and better performance.

The discussion below starts very simply so as to illustrate the problems and possible solutions. The discussion ultimately converges on what I think is the most reasonable way to improve the speed of navigation within a large text file.

Note also that I’m assuming a 64-bit system, where pointers occupy eight bytes and the program has access to more than two gigabytes of memory.

Working with large files

If a file will fit into memory, displaying it is very easy. I can load the file into an array of lines and display lines on the screen as required. Allowing for line wrap adds a small bit of complexity, but it’s quite manageable. For the moment, forget that loading a large file can take quite a bit of time. At 100 megabytes a second, loading a gigabyte takes 10 seconds–an eternity if you’re a user waiting to see the file.

An array is the simplest possible data structure that lets you directly index a line by number. In most programming languages, that array will require eight bytes per line for a pointer to the actual string. The size of file you could display with this program would be limited to available memory. More specifically, if the file’s size plus 8 * LineCount is smaller than available memory, you can view the file.

Note that the line index isn’t strictly required. You could do in memory what less does on disk: start at the beginning of the file and sequentially scan it to parse out the lines. Doing that in memory would be significantly faster than reading from disk (after the file’s in memory, of course), but it would still take a very long time on a sufficiently large file.

Nobody has infinite memory, so the above won’t work very well. Typical memory on a developer’s computer these days is about eight gigabytes, so to view a very large file, you need a way to represent the important parts of a file using as little memory as is reasonably possible.

You could, if you wanted, store only the index in memory and leave the actual text of the file on disk. You’d still have to build the index, which would take just as much time as loading the entire file into memory, but you could work with a much bigger file. That is, if your average line is 80 bytes (including the line terminator), the in-memory representation of the file plus index would be 88 bytes per line. But if you’re not keeping the text, your cost is only eight bytes per line. Nothing like a 91% savings.

Such a design would likely work very well from the user’s perspective. After the wait for loading, that is. If the user asked for a particular line, the program could position to that line in the file, read it, and display a page full of data. Going to a particular line would be just as fast as positioning into the file by percentage.

Eight bytes per line is hardly free, though. My 30 GB test file has about 400 million lines in it. At eight bytes per line, the line index would occupy 3.2 gigabytes. The maximum sized file you could load on your typical developer machine would be between 60 and 70 gigabytes, and it would take about ten minutes to load the line index.

Reducing the line index size

One way to save memory would be to have the line index store the position of every other line. That would cut the size of the line index for any particular file in half. If you stored the position of every 16th line, you could make the line index 1/16th the size. You can pick any number N for the line interval, and the index size would be correspondingly reduced.

Of course, the larger N is, the more searching the program has to do for any particular line. If N is 16, then, on average, the program will have to search 8 lines of data to find the starting point. If line lengths are relatively short, this isn’t a problem at all. But if some lines are very long, the search time to find an arbitrary line could be several seconds.

Another way to reduce the size of the line index is logically divide the file into blocks of some size. Imagine, for example, that you view the file as a sequence of four-gigabyte blocks. For each block, you store its start position (offset in the file), and the line positions are stored as offsets from that starting position rather than as offsets from the start of the file. Since four gigabytes can be represented by 32 bits (four bytes), the size of each line number entry is only four bytes rather than eight as in the naive index. Your index size is cut in half, and you still have every line indexed. The only cost is a small amount of complexity and minimal runtime overhead to calculate a file position given a block offset and line offset. The seek and read time swamps that overhead–it wouldn’t be noticed.

You can do even better if you use 16-megabyte blocks. 16 megabytes is 24 bits, meaning that the line offsets can be stored in three bytes each, saving you 25% over the index with four-gigabyte blocks. There is another small runtime penalty for working with three-byte quantities, but again it wouldn’t be noticed in comparison with the disk seek and read time. Altogether, we have a 62.5% decrease in memory usage over the naive index. There’s a small amount of memory overhead involved in working with the smaller blocks, but it’s not unmanagable. Each four-gigabyte block needs 256 16-gigabyte blocks, but the savings per index entry more than offsets that.

Note that the size of the index depends entirely on the number of lines in the file. A one-gigabyte file that has 250 million lines (very short lines) would require almost two gigabytes for the naive line index. The “optimized” line index cuts quite a bit off of that, but it would still require about 750 megabytes. On the other hand, an 8-gigabyte file that has 100 million lines (about 86 bytes per line) would only need about 300 megabytes for the line index. Explaining to users why your program can load an 80-gigabyte file with no trouble, but chokes on a 10-gigabyte file would be rather awkward.

Indexing blocks

Any scheme that indexes every line is going to suffer when there are many lines, because the index will occupy a very large percentage of the total file size. And any scheme that indexes every Nth line will only reduce memory usage by a factor of N. It’s also going to suffer when lines are long, because searching for lines that follow a known line number can take an arbitrarily long time. Also, indexing every Nth line still puts you at the mercy of the number of lines. Small files with many lines will still require more index memory than large files with fewer lines.

The solution is to combine the approaches: divide the file into smaller blocks as in the blocked line index, but only store the line number at the start of the block. The size of the index is then dependent entirely on the size of the file rather than on the number of lines. In addition, we can place an upper limit on the amount of time it takes to locate a line in a block. This scheme solves both problems, at the cost of a little extra processing whenever a block is read.

Suppose we use a block size of 64 megabytes (2^26 bytes). A one-gigabyte file (2^30 bytes) would require an index with 16 entries. The number wouldn’t be less than 16 entries, but there would probably be slightly more because the line lengths aren’t uniform and lines wouldn’t pack exactly into the 64 megabyte block size. Pathological cases could require up to 32 blocks for a gigabyte, but I suspect those would be pretty rare. With a little work, I could flag the start of a block as the continuation of the previous block, but it’s unlikely that the added complexity would be worthwhile.

The index entries would be larger than the line index entries, but there are a lot fewer of them. Figure eight bytes each to store the file position and the line number, and four bytes for the block length. So figure 20 bytes per block for the index. If it requires 32 blocks per gigabyte (the absolute worst case), you’re talking 640 bytes per gigabyte. The best case is 320 bytes, and 512 bytes would easily handle all but the most pathological cases. So figure 512 (2^9) index bytes for every gigabyte (2^30 bytes) indexed.

The size of the index, then, is roughly Filesize/(2^21). So a terabyte file (2^40 bytes) would require an index of 2^19 bytes, or half a megabyte. A petabyte-sized file (1,024 terabytes, or 2^50) bytes, would only require half a gigabyte for the block index. An exabyte-sized file (2^60 bytes) would require 512 gigabytes for the index, but I figure by the time we’re working with files that large, we’ll have a lot more memory in our computers, and disk performance will be good enough that we can further increase the block size. For now, I’m happy to know that, if I were to write this, I could handle a petabyte-sized file in less than a gigabyte of RAM.

A larger block size would reduce the in-memory index size further, but there are tradeoffs that I’ll discuss below

There is one issue that the discussion above doesn’t address: lines that are longer than the buffer size. There are a couple of ways to handle that situation, the easiest being to automatically split those long lines. So if a line is 65 megabytes long, the program treats it like one line that’s 64 megabytes, followed by a second line of one megabyte. Another way is to flag a block as a continuation block. That is, store the same line number as the previous block, but set a flag to say that it’s a continuation of the previous block. There are other possibilities, none of which are particularly difficult to implement. I’ll wave my hand over the issue and call it solved.

Reviewing the requirements

Remember that my simplified requirements are:

  1. Handle arbitrarily large files.
  2. Show the user what he wants to see as quickly as possible.

If you ignore the load time, the line index described above meets both of those requirements. The index design I described above can handle a petabyte-sized file on commonly available hardware, and once the index is constructed the program would be able to show the user any line in the file almost instantly. The algorithm is very simple. The block index contains the line number at the start of each block. So the program can quickly determine in which block the requested line resides, load that block, parse the lines from it to create a per-line index for that block, and then display the line in question. A standard developer machine should be able to load and parse 64 megabytes in less than a second.

I said above that a larger block size would reduce the memory requirements, but that there are tradeoffs. The tradeoff is response time. If the user wants to position to a line that is not in the current block, the program has to load the relevant block and parse it to find the line positions in that block. A typical developer machine these days can do random reads at a rate of about 100 megabytes per second. Loading a 64 megabyte block, then, will take perhaps 700 milliseconds, leaving 300 milliseconds to parse the block and display the first page of data. If I increased the block size to 128 megabytes, it would take twice as long (two seconds) to position within the file. One second is, to me, on the edge of too slow. A two-second response time is unacceptable. If I/O speeds increase, I’m happy to increase the block size. On the same note, I might have to reduce the block size if it takes too long to read and parse 64 megabyte blocks.

The problem I’m faced with is the initial load time, which fails the second requirement. I know that I wouldn’t use the program if I had to wait five minutes or more for it to index my 30 GB file.

At this point it’s important to remember why we’re constructing an index at all. It serves two purposes: to let the user position into the file based on the line number, and to display line numbers if the user wants to see them. Being able to work with line numbers is essential, but it’s not the most important feature of a text file viewer. As we saw with less, there are many things you can do with a file viewer that don’t involve line numbers at all.

Having to wait for the file to be indexed is especially annoying when all you want to do is look at the first part of the file. Very often, that’s what a user wants to do: start at the beginning of the file and read through it sequentially. It’s unreasonable to make the user wait for all the line numbers to be computed before showing him the first part of the file.

Mimicking less

What if, when the user first started the program, it immediately read the first block of the file and presented it to the user? That’s what less does. It will even display the line numbers immediately. That’s trivial to do, of course, since reading the first part of the file, calculating line numbers for that portion, and displaying the first page is very fast. Any file viewer should do that, at minimum. Remember: show the user what he wants to see as quickly as possible.

But you can improve on less just a little bit by reading the first block (64 megabytes) of the file and parsing it to build an index of the lines in that block. Whenever the user wants to jump to a line that’s within that first block, the program already knows where that line is and has it in memory! This increases the memory footprint a little bit, as it requires that we keep track of lines in a block, but only for a single block. The worst case would be a block that contains nothing but blank lines, which would mean 64 million entries in the line index for that block, or about 128 megabytes. Not small change by any means, especially compared to less, but quite acceptable on an 8 gigabyte machine.

If the user is paging sequentially through the file, the program continues to serve lines by looking them up in the index. When the user pages beyond the part of the file that’s already been indexed, the program loads the next block, indexes it, discards the previous block, and continues. As long as the user is paging sequentially through the file, this program acts just like less, except that it’s building a persistent block index.

An obvious optimization would be to keep a few blocks of the file in memory in order to reduce the ping pong effect that would occur when the user is displaying the end of one block and the beginning of the next. Assume that this is done. I wave my hand over the technical details of the implementation because it’s not pertinent to the discussion at hand, and it’s not a difficult thing to do.

If the user then asks for the millionth (or ten millionth) line, the program again acts just like less: it makes the user wait until it has parsed enough of the file to locate the millionth line. It will still take just as long to find the millionth line as less takes, but the program is doing one other very important thing: it’s building and saving the block index.

That’s important for two reasons. First, as you saw with less, if you jump to the ten millionth line and then ask for the five millionth line, less has to start over at the beginning of the file and parse it to find line number 5,000,000. Granted, that’s pretty quick on modern hardware because the operating system will cache large parts of the file, but if you go to line 100,000,000 and then 50,000,000, you’re going to wait a long time. This proposed new program doesn’t suffer from that. Since it’s created the block index, it knows which block contains line 5,000,000 (or 50,000,000), and can jump immediately to that block.

The second reason it’s important is similar. With less, if you go to line 100,000,000 and then back to the beginning of the file, and then want line 100,000,001, the less has to scan the entire file again–all 100 million lines that it had scanned before. As I said previously, less knows where it is, but doesn’t know where it’s been. Our program, though, knows where line 100,000,000 is, and can immediately start searching for the next line at that position.

The program would, of course, allow quick positioning to the end of the file or to any point within the file by percentage, just as less does. It would also make the user wait (with the opportunity to cancel the wait) if line number display is turned on but the line numbers need to be computed.

In the above discussion, the program mimics less to begin with. It can’t do anything else. But once it’s seen a particular line in the file it knows what block that line is in and can go back to it very quickly, without having to sequentially scan through the file from the beginning.

As useful as such a thing is, it would be only a small incremental improvement over less. It’s probably worth doing and if such a program existed and provided all the other features of less, I’d use it. But it’s not enough of an improvement to make me want to write it.

Doing better than less

The only seriously annoying thing about less is that I have to wait for line numbers to be computed when I ask for a specific line, or when I have line numbers turned on and I position into the file using %. Whereas line numbers are important, I hate having to wait. The program described above reduces waits after the first time it sees a line, but that initial wait time is a serious annoyance. At 100 megabytes a second, it would take five minutes to compute line numbers for my 30 GB file.

If the user is paging sequentially through a text file, there’s no perceptible delay from one page of the file to the next. The only time he has to wait is when he wants to display a line number that the program hasn’t already seen.

So what would happen if the program jumped ahead of the user? What if, when the program starts, it reads the first block of the file, displays it to the user, and then keeps reading, building the block index as it goes? Given sufficient time, the program would finally reach the end of the file and the block index would be fully constructed. What if the program could do that in the background while letting the user view the file as before?

Doing that wouldn’t be especially difficult. Any developer machine today has multiple processor cores, so a background thread that’s scanning the file isn’t going to interrupt the responsiveness of the UI. Modern programming environments make working with multiple threads much easier than they were in the past, so implementing this improvement would be relatively easy, and the result would reduce or completely eliminate the user’s need to wait for indexing. If, for example, the user started the program and spent three minutes paging through the first part of the file, the program would have had those three minutes to index a large part of the file. Even at 50 megabytes a second (half of what you would expect from a modern developer machine), the program can index three gigabytes a minute. While the user is slowly paging through the file, the program is reading ahead, indexing the file. The result should be much smaller wait times–even the first time through the file.

The user would still be able to position to any point in the file, just as before. Again, the only time he’d have to wait would be to display correct line numbers. And with background indexing of the file, even that requirement can be relaxed.

Estimating line numbers

My 30 GB test file has an average line length of about 84 bytes, and the contents of the file are reasonably uniform. Line 100,000,000 of the file is, to a fair degree of accuracy, at about position 8,400,000,000 in the file. The actual number reported by less is 8,393,544,127, meaning that my calculation was off by 6,455,873 bytes or .07%. Then again, my average line length estimate was “about 84 bytes.” The program would have a much more accurate value.

If the program is reading and indexing the file in the background, it can keep a running average line length so that when the user asks for a line that hasn’t been read, the program can estimate the position of that line number and jump to that position in the file. Obviously, the line number display would have to indicate that the numbers shown are approximate. Would that be good enough?

It would be for me. Very often, I just want to see what’s “around” line 100,000,000, or I know there’s a particular bit of text on or near that line. If the program can put me close to that line, I can use the search facility to quickly find the exact line I’m looking for by searching in the vicinity.

The program continues to index lines in the background, and periodically updates the line number estimates. When the program gets to the block that contains the current position, it can update the display with the actual line numbers.

This is far from a perfect solution. For example, an error of .07 percent in a petabyte-sized file is a huge chunk–on the order of a terabyte. The error would be much greater if the first part of the file (the part already scanned) is not representative of the rest of the file. The program can use various adaptive techniques in an attempt to reduce the error, but nothing is going to be perfect.

The program described above would be a tremendous improvement over the venerable less, provided that it maintained all the existing functionality that makes less so useful. Adding the block index, background file scanning, and estimated line numbers are improvements that should be obvious to anybody who thinks about file viewers. Several of the file viewer tools I’ve seen implement some of what I described, but in fatally flawed ways. And that bothers me.

Background file scanning is an obvious improvement, and one that a few other programs already have, although some implement it stupidly.

The line index, confuses me. The designers of other file viewers had to realize that the line index limited the size of file their programs could handle. It looks like they gave it some thought, as none of the programs seem to use eight byte line indexes. Most appear to use four byte indexes, meaning that they’re doing something to reduce memory consumption. But even four bytes per line adds up when there are hundreds of millions of lines in the file. Why nobody thought of keeping just a block index and re-parsing blocks as required is a bit of a mystery. Or maybe they did think of it, and discarded the idea. That would be an even bigger mystery.

The one thing that doesn’t confuse me, but concerns me nonetheless, is the insistence on knowing the exact position of a line before allowing the user to position to that line. I can’t be the first person to think of using estimated line numbers.

For more than 25 years, hex file viewers have loaded the first page of a file immediately and have had the ability to position to any point in the file instantly. I know this to be true because I wrote such a program in 1983. The original version could handle a file up to four gigabytes in size, using minimal memory to do it. I had to wait about 15 years before I could prove that, since when I originally wrote the program even a 10 megabyte hard drive was a luxury that most didn’t enjoy. To be fair, a hex file viewer is a much simpler program than a text file viewer because there are no “lines” to worry about. As far as the hex file viewer is concerned, a file is just an array of bytes that’s rather slow to access.

My insight, and one that I have to believe others have had, is that a file is a file. It’s our interpretation that makes it a text file, a database file, or whatever. The only thing that prevents a text file viewer from instantly positioning to a particular line is not knowing exactly where that line is in the file. Realizing that, the designer of a file viewer make a choice: force the user wait for the line numbers to be calculated, or display text at the estimated line position. As a user, I’m not going to be completely happy with either (I want it now!), but I’m much happier to be given something rather than having to wait an eternity for the program to calculate line numbers. If other designers thought of this (and I have to believe that they did), they made the other choice. They elected not to potentially confuse users by showing incorrect line numbers.

And I agree that there’s potential for user confusion here. If the user asks for line 100,000,000 and the program puts him there instantly, he’s tempted to believe that he really is at line 100,000,000. That’s especially true if the user doesn’t realize that calculating line numbers takes time. But that user confusion is mitigated by two things:

  1. Most people who view these huge text files are programmers who will understand the issues.
  2. A simple popup message explaining that the line numbers are estimates.

It’s my contention that positioning to what the program thinks is the correct line and allowing the user to proceed from there is preferable to making the user wait a potentially very long time (and five seconds is a long time when you’re trying to get things done) before he can do anything at all.

Some thoughts on implementing the user interface

One of the problems I struggled with in working out the solution I described above is how the vertical scrollbar behaves. The horizontal scrollbar turns out to be easy to handle, but the vertical scrollbar required a bit more thought. I also initially thought that the optional line wrapping feature would complicate things. It turns out that if we change the way we think about those two features, their implementations become much easier.

Absent line wrapping, the horizontal scrollbar’s maximum value is set to the length of the longest line so far encountered in the file. It’s as simple as that. In the simplest implementation, the length is updated by the background process that’s scanning the file. However, it could also be updated when the foreground process positions to a point in the file that hasn’t yet been scanned. Whenever a previously unseen line is parsed, the scrollbar limit is updated if necessary. There’s a potential problem if a single line exceeds the scrollbar’s capacity (2^31 in Windows), but that can be handled by setting a limit on line length, or by using a the scaling technique that I describe below.

When line wrapping is turned on, the horizontal scrollbar is disabled or hidden, since there’s no need to scroll horizontally when the text wraps to the window width.

At first thought, the vertical scrollbar seems like a huge problem. If the interface uses it to reflect the line position in the file, then the scrollbar limits will have to be updated continually as the file is being scanned. In addition, using the scrollbar in that way would limit the number of lines that could be accurately represented to the maximum value accepted by the scrollbar. Using the standard Windows scrollbar, that limit is 2^31, meaning that you couldn’t work with a file that has more than two billion lines. Well, you could, but the code would be difficult.

If, instead, we treat the vertical scrollbar as representing the position as a percentage of the file’s content length, things are greatly simplified. If the file is less than two gigabytes, then the scrollbar limit is the file’s length and the scrollbar thumb position is equal to the exact byte position in the file. If the file is larger than two gigabytes, then we can use a constant multiplier for the scrollbar thumb position. The math is quite simple.

The scrollbar small change value is always 1. If the user clicks on a scrollbar arrow, the window scrolls one logical line, a “logical line” being defined by the state of the line wrapping flag. Similarly, the large change is however many logical lines fit in the window at its current size, using the currently selected font. Simple and effective.

One thing that always bothers me about Notepad is that if you turn on line wrapping in a very large file, the program takes a long time to re-compute the wrapping, and ends up moving you to the beginning of the file. Other editors and file viewers do the same thing, and it always struck me as kind of crazy that they would. It appears that the only reason they do this is so that they can make the scrollbar reflect the number of logical lines in the file–that is, the number of display lines it takes to show all of the wrapped lines in the document. It’s nuts.

There’s no other reason I can think of that going from horizontal scrolling to line wrapping would cause such a delay. Certainly, there’s no reason why the program can’t just wrap the current page almost instantly. The top line remains the same, and it’s trivial to line wrap that line and the following lines to fill the page. What comes before and after the current page is irrelevant.

If we think of the scrollbar position as the current position in the file, irrespective of line numbers, then all that complexity goes away.

Another benefit of viewing the scrollbar in this manner is that the user can grab the thumb and pull it down to, say, the halfway point. The program will behave as though the user asked to position at the 50% point in the file–just as if, in less, the user had entered 50%. Simple, not confusing, and very effective.

Will I write it?

If this program existed today, I would use it. It’s a significant improvement over less, which I consider the best text file viewer currently available. The question is whether it’s enough of an improvement for me to develop it. Writing this file viewer would entail a lot of work. The algorithms themselves aren’t especially difficult to implement, but it would take some time to get everything right.

For now, it’s just an idea. I have other fish to fry at the moment, but I sure do want this tool.

Large text file viewers

I’ve evaluated a lot of text file viewers for Windows, looking for one that has a decent user interface, a small set of “must have” features, and that can handle very large files.

One of the problems I have when looking for “large text file” viewers is that people have different definitions of “large.” Some programmers proudly state that their programs handle a 10 megabyte file with no problem. Others tout their ability to handle files that are “hundreds of megabytes” in size. And there are plenty that use one gigabyte as their benchmark. Granted, a gigabyte is a lot, but I regularly work with files that are tens of gigabytes. For example, the file I’ve been using recently to test file viewers is 30.83 gigabytes (33,100,508,272 bytes).

Many file viewers that claim to work with files “of any size” forget to add the fine print that says, “as long as it will fit into memory.” Many of those do indeed work on a 10 gigabyte file, provided you’re willing to wait five minutes for the entire file to be loaded into memory. I’ve also found that many of these tools use an inordinate amount of memory. One viewer, for example, requires about 4 gigabytes of memory in order to display a 2 gigabyte file. Why that is isn’t clear, but it certainly limits the size of file that can be viewed.

Of those viewers that don’t choke on my 30 gigabyte file, not one that I’ve tried to date has the combination of features and performance that I desire. A few come close, but fall down in the user interface or, more importantly, in quickly responding to my requests. I wrote a little about this back in April in my .NET Reference Guide. See Viewing Large Text Files. In the article, I identify many of the tools I tested, and took a closer look at three of them.

I’ve thought about this particular issue quite a bit over the years, and have always stopped short of sitting down and writing my own viewer. Why? Because it really is a difficult problem, and I have other more pressing things to do. There are user interface and implementation tradeoffs that you have to make, and although I think I could do a better job with those tradeoffs than others have done (at least, I’d be happier with my implementation of the tradeoffs), I’m not convinced that the improvements I’d make would justify the time I spent on it. It is interesting, though, to think about the problems and possible solutions.

Requirements

In the discussion that follows, I’m going to assume a minimal graphical user interface–something about on the level of Windows Notepad without the editing capabilities. I expect the following features, at minimum:

  • Intelligent handling of Unicode big-endian, little-endian, and UTF-8.
  • Ability to specify the character set.
  • Quick (“instantaneous”) load time, regardless of file size.
  • Ability to handle arbitrarily large (up to 2^64 bytes) files.
  • Vertical and horizontal scrollbars.
  • Standard movement keys (line up/down, page up/down, start/end of file).
  • Optional word wrapping.
  • Optional line number display.
  • Jump to specified line number.
  • Search for specified text.

I define “arbitrarily large” as 2^64 bytes because the most common file systems today (NTFS and the various popular Linux file systems) use 64-bit file pointers. Whenever I say “arbitrarily large file,” assume that I mean “up to 2^64 bytes.” Although I’ll limit my discussion to files that large, there’s no fundamental reason why the techniques I describe can’t apply to larger files.

One might rightly ask why I’m even thinking about files that large. It seems the height of folly to construct such big files. I tend to agree, in theory. But in practice it turns out that creating one huge XML (or some other text format) file is the easiest thing to do. The computer will be reading the data sequentially whether it comes in as one file or multiple files, but writing the program to create, manage, and transmit multiple files in order is more complicated than writing a program that handles one monolithic file. I regularly see files that are tens of gigabytes in size, and others I know talk of hundred-plus gigabyte files. I have no doubt that in ten years we’ll be talking about multi-terabyte (and larger) text files. It’s only when we want to read the files at arbitrary positions identified by line numbers that things get difficult.

Everything else is optional, although regular expression searching and a text selection and copy to clipboard feature would be really nice to have. Those aren’t particularly challenging to implement.

The character set handling is very important, although not especially difficult to implement. Notepad, for example, does a credible job of determining the text encoding, although it does guess wrong from time to time. The program should give the user the opportunity to specify the encoding if things look wrong or if he knows that the file is using a particular encoding or code page.

In reality, there are two major requirements that drive the design of the program:

  1. Handle arbitrarily large files.
  2. Show the user what he wants to see as quickly as possible.

Before we get started talking about how to meet those requirements, I think it’s important to take a closer look at what I consider the best file viewer program available today.

Viewing files the Unix way

The GNU program less, which is derived from the Unix utility of the same name, is a very effective file viewer for looking at arbitrarily large files. I’ve tested less with files larger than 100 gigabytes, and it handled them just fine. I don’t know what it would do with a terabyte-sized file, but I imagine it wouldn’t have any trouble. less is solid, uses very little memory, and works much better than many file viewers that use hundreds of megabytes of memory and don’t gracefully handle large files. It’s rather humorous and faintly depressing to see a 25-year-old command line tool written to use minimal memory outperform modern GUI tools that use hundreds of megabytes of memory. Anybody who’s working on a file viewer should study less very carefully.

less has a few features that give some ideas of how to handle very large files. One of the things that people find surprising is that less can position to any point in an arbitrarily large file very quickly. If you press > (or enter one of the other end-of-file commands) to move to the end of the file, less goes there immediately and shows you the last page full of lines from the file. It’s almost instantaneous. Similarly, entering 50% at the command prompt will take you to the halfway point in the file. Instantly.

It shouldn’t be surprising that less can position to any point in the file. After all, it’s trivial to seek into the file, read a buffer full of data, find the start of the first line in that buffer, and display it. What’s surprising to me is that so many authors of file viewers seem to think that positioning by percentage is difficult or not useful. They insist on doing things by line number which, as you’ll see, is the wrong way to think about the problem.

less can work with line numbers, too. For example, entering 50g will go to the 50th line in the file. You’ll find, though, that if you enter 1000000g, the program appears to lock up. What it’s doing is parsing the file, looking for the millionth line. And the only way to find the millionth line is to find all 999,999 lines that come before it. The program hasn’t locked up, by the way; pressing Ctrl+C will return the program’s command prompt.

The command -N will turn on the line number display. When that mode is enabled, less won’t display the results of a positioning command until it has computed the line numbers. So if you enable line numbers and then enter 50% to go to the halfway point, less will display this message:

Calculating line numbers... (interrupt to abort)

You can wait for the line numbers to be calculated, or you can press Ctrl+C to interrupt. If you interrupt, less turns off the line number display and immediately displays the page of text that you asked for.

If you ask for the millionth line and wait for it to be located, less knows where you are. If you then ask for line 1,000,100, less can find it trivially because the program knows where the millionth line is, and all it has to do is search forward 100 lines. If, however, you then ask for line 500,000, less doesn’t know where the line is. The program has to start over at the start of the file and parse the first 499,999 lines to find the position you asked for. It’s possible, by the way, that the program will search backwards. I don’t know how it’s implemented. In either case, less knows where it is in the file, but it doesn’t know where it’s been.

The above is not meant as criticism of less. The program is incredibly useful and works very well within its limitations. The program was designed to work on computers that are very primitive by today’s standards. And yet it often outperforms modern programs designed to take advantage of computers that were inconceivable 25 years ago. Granted, less has been updated over the years, but its basic design hasn’t changed.

If nothing else, programmers who are writing file viewers should study less because it does handle arbitrarily large files–something that most file viewers don’t do. In addition, the program has features that every file viewer should have but most don’t. In general, if less can do it but your program can’t, then there’s a problem with your program. Similarly, if your program doesn’t add something that less doesn’t do, or do something better than less does it, then what’s the point of writing it?

There are two major things I’d like to improve in less: character set handling, and speed of locating lines. There are some smaller features that I wish less had and that I’d add were I to write a file viewer, and I prefer a GUI, but those are relatively minor issues. Were I to embark on actually writing a file viewer to improve on less, my motivation would be speed and character set handling.

Character set handling can be tricky, but it’s a known problem with good solutions. Improving the speed of locating lines, too, doesn’t involve any heavy lifting. Several other file viewers have solved it to various extents, although their solutions are, in my opinion, fundamentally flawed in several ways. In particular, the two programs (other than less) that could view my 30 GB file both appear to create an in-memory line index, which puts serious limitations on the size of files they can view. A small file that contains a lot of very short lines requires more memory than a large file that has long lines. A terabyte-sized file with lines that average 80 characters would have about 13.7 billion lines. Naively stored, that index would require more than 100 gigabytes. With a little thought, you could reduce the memory requirement by half, but 50 gigabytes is still more memory than most servers have, and six times as much memory as a typical developer machine.

I’ve given this problem some serious thought, and I think I’ve come up with a workable solution that improves on less while using a relatively small amount of memory. I’ll explain my approach in my next posting.

Dear Programmers: Google is your friend

Stack Overflow is the best programmers’ resource to hit the Internet in quite some time. Online help forums for programmers are nothing new, but this one works better than anything else I’ve seen. I’m continually impressed by the variety and quality of the content there.

I’m also amazed by some of the questions. For example, this was posted today:

I work on my graduation thesis connected with image compression and I’m looking for algorithms, which use some mathematical methods (i.e. Discrete Cosine Transform) to achieve maximum compression ratio in minimum time and with minimal losses of quality.

Thank you in advance.

I find it difficult to believe that somebody who’s about to graduate from college doesn’t even know where to start researching his final project. As one commenter put it, “You really should be embarrassed that you’re asking for help googling for your graduation thesis.” His instructor should be embarrassed, too. In fact, the college should be embarrassed that they’re about to graduate a complete moron.

It really is amazing how many Stack Overflow questions can be answered by just typing the question into Google. For example, somebody asked today about using TBB for non-parallel tasks. The question had something to do with parallel processing, so out of curiousity I did a Google search for “TBB,” the first result of which was a link to Intel’s Thread Building Blocks library. Less than two minutes later, I had the answer. I guarantee that it took the person who asked longer to post the question than it took for me to find the answer, and I didn’t even know what TBB was!

Another one. Somebody asked how to force Windows to reboot into safe mode. I’d never needed to do that, so I didn’t know how. But a quick Google search turned up this duplicate question, which contains the answer to the question.

I’m unable to find any data that says what percentage of questions on Stack Overflow are closed as duplicates. It looks to me to be in the single digits, meaning that it probably isn’t a huge problem. Plus, duplication isn’t necessarily bad. As Jeff Atwood points out in Dr. Strangedupe: Or, How I Learned to Stop Worrying And Love Duplication, some duplication is okay. Search isn’t perfect, and it’s quite possible to ask semantically identical questions that are syntactically very different. But in many cases, including the two that I pointed out above, a quick Google search revealed the answer much more quickly than I would have obtained it by posting a question and waiting for somebody knowledgeable in that topic to respond.

Paco

Paco is from a pattern created by a wood carver named Javo Sinta. (Alternate link, which might work better: Javo Sinta Woodcarving.) This carving is five inches tall, including the base.

PacoMy version is carved from mesquite.

I cut this out on the band saw a few months ago and started carving it with a knife, but then got sidetracked. I also got a little frustrated because I made a mistake in the way I cut it out. Today I finally got some time to work with the new power carver that Debra bought me for my birthday, and thought I’d finish up Paco.

I made quite a few mistakes on this piece, but I figure it’s not too bad for my first time with the power carver. I’ll definitely have to make another one when I get better with this new tool.

More pictures in the gallery.

Christmas ornaments

I finally finished the ornaments for the annual carving swap. This year I elected to participate in the smaller group of 12, rather than in the larger group (20 to 25). Here’s the whole batch:

Clicking on the picture will give you a much better (larger) view.

I think they’re all cute, but this one’s my favorite:

The carvings are all between two and two and a half inches tall. The wood is mesquite from the back yard. I did all of the carving with a knife (no power tools). The hat, eyes, nose, and tongue are painted with acrylics, and the entire carving is given a coat of orange oil and beeswax.

I hope the others who are participating in the swap enjoy them. I’m a little worried that I inadvertently joined the Santa Swap. So far I’ve received seven ornaments, and all of them are Santas. I suppose this could be a Santa Pup.

 

Installing Windows 8 Developer Preview

I’m setting up a computer that I can use to do some work from home. I’m still running Windows Server 2008 at the office because I’m slow to embrace change on my main development machine. But this new machine is going home and I want to explore what’s new in Windows.

I figured I’d skip Windows 7 and jump right to Windows 8. So we downloaded the Windows 8 Developer Preview. Then I hit a snag. The ISO file is 4.8 gigabytes in size. It requires a dual-layer DVD. We don’t have a drive capable of writing a dual-layer DVD.

At the bottom of the download page there’s a section titled, “How to install the Windows 8 Developer Preview from an ISO image.” In it, they mention the possibility of installing from a USB memory stick.

A quick trip to Fry’s got me an 8 gigabyte USB memory stick for less than $10. That’s step 1. The next step is getting a bootable image onto the thing.

Although Windows 7 can mount an ISO device image as a drive, Windows Server 2008 (and Windows Vista) don’t have that capability. However, there are third party tools that can do it. I downloaded and installed Virtual Clone Drive, and then told it to mount the Windows 8 ISO as my drive H. Then I followed these instructions to create a bootable image on the USB memory stick.

Be forewarned: it takes a very long time to copy 4.8 gigabytes to the USB stick. I don’t know quite how long. I started it last night and it was still running when I left for home about 30 minutes later. But it was done when I got in this morning.

With a loaded USB memory stick, all I had to do was tell the other machine’s BIOS to boot from the USB device. I restarted the computer and, like magic, the Windows 8 installer started reading files from the USB port.

I’m looking forward to playing with this new version of Windows. More when I know more . . .

The white board inquisition

A lot of interviewers will have a prospective developer do a code writing exercise in which the candidate is given a series of increasingly difficult small problems to solve in code. Often, this exercise is done on a white board, although with projectors and desktop sharing widely available now, many companies are moving to having the candidate compose at the keyboard.

I participated in this back when I was interviewing programmers, because it was expected. I always found the practice uncomfortable (more so when I was the interviewer rather than when I was the interviewee), and referred to it as “the white board inquisition.” Over the years, I’ve questioned the effectiveness of this interview technique.

We would typically start with something simple. For example, we’d ask a candidate to write the equivalent to the C strlen function. This was kind of a warm up to get him accustomed to writing code on the board, to explain the rules of the test, and to see if he had even the slightest idea about programming. The code itself is very simple:

    int strlen(char *s)
    {
        int count = 0;
        char *p;
        for (p = s; *p != '\0'; ++p)
        {
            ++count;
        }
        return count;
    }

I found it surprising how many programmers couldn’t just whip that out, although I’m not sure where they had difficulty. It was obvious to me, even 15 years ago, that many of the candidates were very uncomfortable writing code on the white board, and I can understand why. I code by typing. The output of my “coding brain” is wired to my typing ability. I don’t write code, and I certainly don’t do it on the white board. I type code into my IDE. Furthermore, I’m highly dependent on my tools. Things like automatic indentation, Intellisense or other types of autocompletion, edit-time compilation to show errors, and other productivity enhancing features are essential for me to code effectively and efficiently. Even something as simple as strlen is painful to type without an IDE, and excruciating on a white board.

After strlen, we’d give a few more difficult problems. As I recall, reversing a string, determining if a string is a palindrome (which is really just the string reversal with a little twist), binary search, and a simple sort were among the exercises we’d give.

One of the more interesting things I found was that younger programmers would try to solve the binary search recursively, and the older programmers would supply an iterative solution. Almost every recursive attempt failed.

Of all the candidates I interviewed, one aced the white board inquisition. That person knew his computer science and could regurgitate code on the white board at an astounding rate. I’m pretty sure he could have reproduced a working balanced binary tree implementation in a few minutes. He couldn’t write a working program from a design specification, though, as we found out after we hired him. He knew how to implement common algorithms and data structures, but he had no concept of how to put those pieces together in order to make a useful program.

We also hired a few people who couldn’t do much more than strlen, but who did very well in other aspects of the interview. With one exception, those people turned out to be much better programmers than the ones who did well on the inquisition. The reason, I think, is because although they couldn’t write a binary search on the white board, they knew how and when to use the library-supplied binary search function. They could interpret a design specification and come up with a working program. They knew the libraries and how to use them.

In my experience, anything beyond a few very simple exercises is a waste of time. The white board inquisition could tell me if the candidate understood the fundamental mechanics of writing code, but beyond that didn’t give any indication if he could write a real, working program.

So absolutely do something like FizzBuzz on the white board to weed out the candidates who can’t write code at all, but if you want to get an idea of a programmer’s real abilities, come up with a more complete exercise (something that can be done in an hour or so), sit the programmer down at a computer that has Visual Studio or whatever development environment you expect him to be working in, and ask him to solve the problem. That will tell you if he can use the tools, if he understands the libraries, and if he can solve a real problem rather than provide a solution to some annoyingly tedious puzzle.

First rule of optimization: Don’t

A recent question on Stack Overflow reads:

Working with legacy code, I’ve found a lot of statements (more than 500) like this:

bool isAEqualsB = (a == b) ? true : false;

Does it make any sense to rewrite it like this?

bool isAEqualsB = (a == b)

Or will it be optimized?

When you see a question like this, you have to wonder what’s going through this programmer’s mind. There are several things wrong with the thought process that led to this question.

First, this is legacy code, which usually means that it’s older code written by somebody who is no longer around and that nobody else understands. It’s orphaned code. Most likely it’s also working code. The new programmer assigned to the code is either tasked with making a modification, or he’s just trying to understand it.

Second, the programmer is worried about optimizing something that he has not shown to be a bottleneck. Even if neither the C# compiler nor the JIT compiler optimizes the code, it’s highly unlikely that making the proposed change will have a significant effect on the program’s performance. This is especially true if the variables being compared are not primitive types. If they’re strings or other reference types (or value types that have an overloaded Equals operator), then the function call to do the comparison will overwhelm any possible performance gain from restructuring the code.

Making changes to working code–especially code that you don’t understand–is a risky business. Sure, this looks like a straightforward modification, although I doubt that the actual variable names are a and b. You might even be able to create an editor macro that will make the change globally in the project. But it still takes time to develop the macro, run it, make sure it compiles, and test the code to ensure that it still works. And the risk of making an error along the way, although small, is non-zero.

There is a small readability benefit to be had. The second form of the expression above is easier to read than the first. However, that benefit does not offset the time and risk associated with making the change. The time you put into making this change will never be repaid–not in readability and not in performance. It’s a waste of time.

It’s interesting to note that the C# compiler does not optimize the first expression. That is, the IL from the first expression is:

  IL_0000:  ldarg.0
  IL_0001:  ldarg.1
  IL_0002:  beq.s      IL_0007
  IL_0004:  ldc.i4.0
  IL_0005:  br.s       IL_0008
  IL_0007:  ldc.i4.1
  IL_0008:  stloc.0

And for the second expression:

  IL_0009:  ldarg.0
  IL_000a:  ldarg.1
  IL_000b:  ceq
  IL_000d:  stloc.1

Looking at the second block of code, you might think that the evaluation doesn’t require any branches. Whereas it’s true that there are no branching instructions in the generated IL, there is a branch by the time you get to the native code. It’s hidden in the ceq instruction.

The .NET Boolean type (the C# bool is just an alias) is an 8-bit value that stores 1 for true and 0 for false. This is different from C, in which any non-zero value is considered to be true. When the ceq instruction is compiled, it generates assembly language code that’s equivalent to:

    cmp ax,bx
    jz equal
    mov al, 0
    jmp done
equal:
    mov al, 1
done:

The JIT compiler probably generates a slightly more efficient version of the code than that, but there is a branch. There has to be.

Lessons to learn from this:

  • Restrict your optimization efforts to things you know are performance bottlenecks.
  • Modifying code is a time consuming and risky business. Be sure that your modifications are worth the cost of making them.

 

When theory meets practice

A fairly common problem in programming is selecting the “best” items from a large group. For example, I have an application that makes video recommendations based on user input. The program takes the user’s input (typically the name of a song, an artist or a band), matches that up with our index of videos, and then does some calculation to come up with a relevance score for each video. The details of that calculation are interesting, but will have to be covered another time.

When all the calculation is done, I’ll have a list of perhaps two million possible video recommendations, and I need to select the top 200 (based on relevance score) to present to the user. There are several ways to do this, as described in Selecting k smallest or largest elements.

In the discussion below, n is the total number of items in the list and k is the number that I want to select. Using the numbers above, n would be equal to 2,000,000, and k would be 200.

When I first wrote the code, I used the simple heap selection algorithm. In pseudo code, it looks like this:

Create an empty min heap
Add the first k items from the array to the heap
for items k+1 to n
    if (item > smallest item on the heap)
        Remove smallest item from the heap
        Add new item to the heap

That algorithm performs quite well with our data. It is somewhat sensitive to order. If the recommendations in the items array were sorted by relevance score, this would perform very poorly because every item would have to be added to the heap, and all but the last k items would be removed from the heap. Fortunately, it would be highly unlikely to see such an ordering in our data.

I was discussing this problem with a friend recently who suggested that my algorithm was less than optimal and that I should look into a better approach.

In big O notation, that heap selection algorithm’s complexity is O(n log k). Adding an item to a heap of size k takes up to log(k) operations, as does removing an item. And since in the worst case every item gets added to and removed from the heap, there will be n insertions and n removals. That’s 2*(n log k), but constant factors are removed in big O notation, so we’re left with O(n log k).

There are at least two algorithms listed in the Wikipedia article that have better theoretical running times than my simple heap selection technique. One technique involves turning the array of two million items into a max heap and then doing a breadth-first traversal of the first k nodes. Turning an array into a heap is an O(n) operation, and the breadth-first search is O(k log k), giving us a complexity of O(n + k log k).

The other interesting algorithm is called QuickSelect. It uses the Quicksort partitioning algorithm to partition the array in linear time. QuickSelect is based on the idea of finding the median element in an array, which you can do in linear O(n) time. Given the median element, partitioning an array such that all items less than the median are to the “left,” and all items greater than or equal to the median are to the “right” is also an O(n) operation. That’s not exactly how QuickSelect works, but that’s the idea behind it. I should note that QuickSelect has expected linear time. Its worst case, though (when data is ordered or partially ordered), is really bad.

So in theory, the fastest algorithm should be QuickSelect, followed by the breadth-first heap search (which I call BFSHeap), followed by the simple heap selection algorithm (HeapSelect). I thought I’d do some tests to see how well the theory reflects reality.

The BFSHeap turned out to be a bad idea. Although the theoretical complexity analysis tells us that it should be faster than HeapSelect, the O(n) time to build the heap has a huge constant. Whereas the time to build the heap is proportional to the number of items, that “proportion” is very large. It takes more than twice as long for BFSHeap to turn the array into a heap as it does for QuickSelect to select items. Because of that, I didn’t include BFSHeap in my detailed analysis. It’s interesting to note, though, that BFSHeap performs better than HeapSelect when k exceeds about 8% of n. But it’s still much slower than QuickSelect.

QuickSelect, like QuickSort, can be inefficient if the data is sorted or reverse-sorted, and is also sensitive to ordered (or partially ordered) subranges. A simple median of three (or, better, median of five) pivot selection reduces or eliminates those cases. I used a median of three in my tests. Median of five will give slightly better average performance, and much better worst case performance.

HeapSelect, as I noted above, also performs poorly with ordered data, although it’s not as sensitive to partially ordered subranges as is QuickSelect.

Here’s the data for those of you who don’t want to read the rest of this note. In short, HeapSelect is faster if you’re selecting up to about 1% of the items in the list. QuickSelect is faster beyond that, and hugely faster before you get to 10%. HeapSelect has the disadvantage of requiring O(k) extra space, but when k is small, that extra space won’t normally be a problem.

Time (milliseconds) to select k items from an array of 2,000,000
20 200 2,000 20,000 200,000
QuickSelect (random) 24.36 25.66 25.18 25.85 28.54
HeapSelect (random) 4.97 5.23 7.34 24.49 139.06
QuickSelect (sorted) 15.18 15.02 14.97 14.95 14.97
HeapSelect (sorted) 107.91 196.84 235.28 293.36 330.93
QuickSelect (reverse) 7.81 7.73 7.63 7.65 7.60
HeapSelect (reverse) 5.11 4.92 5.07 6.23 22.81
QuickSort (random) 356.90 357.95 357.96 358.24 358.01
QuickSelect1 (sorted) 153.58 1,520.43 n/a n/a n/a
QuickSelect1 (reverse) 80.56 791.03 n/a n/a n/a

All of my tests were written in C# and run in release mode with the 64-bit runtime in .NET 4.0. Each test was run 100 times after an initial run to eliminate JIT times, and the time reported is the average of those 100 runs.

The QuickSort is included for comparison. Obviously if you sort the data then taking the top k items is trivial. But it’s expensive. In the average case, QuickSelect was more than 10 times as fast as QuickSort.

As noted, QuickSelect is an O(n) algorithm. Given an array of size n, it will take the same time to run, regardless of how many items you want to select from the array. Selecting 200 items from an array of 2,000,000 takes just as long as selecting 200,000 items from the same array. My QuickSelect implementation running on the configuration I noted above runs at about 13 milliseconds per million items. Selecting from an array of 100,000 items takes about 1.3 milliseconds. With 2,000,000 items, it takes about 26 milliseconds. The times are very consistent across a large range.

QuickSelect1 is a modified QuickSelect without the median-of-three pivot selection. I included the timings to show how poorly a naive QuickSelect implementation will perform when presented with ordered data. I only included timings for selecting 20 and 200 items, as it’s obvious where things are heading: to select 2,000 items would take 10 times as long as it took to select 200 items. QuickSelect with a median-of-three pivot selection reduces the likelihood of running across these pathological cases, but it’s still possible to hit them in real-world data. Check out median of three killer for more information. A median-of-five pivot selection would be better.

HeapSelect is O(n log k) in the worst case, meaning that the more items you want to select, the longer it will take. Selecting 200 items will be a lot faster than selecting 200,000 items. How much longer is an interesting calculation. Based on the theoretical worst-case running times, the difference should be log(200000)/log(200), or somewhere in the neighborhood of about two and a half times as long. But that’s only true in the worst case–when the data is in sorted order. The real running time is much more skewed.

The running time of HeapSelect is dominated by the number of items added to the heap. When the data is unordered, the actual number of items added to the heap is proportional to k–the number of items to be selected. When k is small in comparison to n, the number of items added to the heap is also small. For example, here are some sample values, again run against an array of 2,000,000 items. m is the number of items actually added to the heap.

k m m/k
2 29 14.5
20 249 12.45
200 2,038 10.19
2,000 15,831 7.915
20,000 112,093 5.605
200,000 660,508 3.303

When selecting 200 items, only about 2,000 items are added to the heap. When selecting 200,000 items, 660,000 items are added to the heap: about 300 times more items. This explains why the actual running times are so much different from the theoretical running times. It doesn’t take 300 times as long to select 200,000 items, but it does take 25 times longer.

Note also that when k is 2, m is about 15 times k. As k increases, m also increases, but not in proportion. When k reaches 200,000, m is only about 3 times as large. And, of course, when k is the same size as the array, m == k.

In my initial tests, HeapSelect used a generic BinaryHeap collection class similar to this one. In an attempt to level the playing field (after all, QuickSelect was hand-coded), I moved the relevant portions of that BinaryHeap collection in-line so that I was operating on the items array directly. The result is between three and four times as fast as the original BinaryHeap. This resulted in HeapSelect being faster than QuickSelect when k is up to about one percent of n. With the original version QuickSelect is faster when k exceeds one-tenth of one percent of n.

The primary reason for the difference in run time is that the custom heap code uses an array rather than a generic list, and there is minimal error checking. In addition, I was able to combine the “remove lowest and add new” sequence into a single “replace” operation–something that the generic BinaryHeap implementation didn’t supply.

The ideal selection algorithm, would be a hybrid that uses the best approach based on how many items are being selected. I haven’t seen such a thing. There is a hybrid, Introselect, based on David Musser’s Introsort, which gives better performance than QuickSelect if the data is ordered, but it’s unclear to me whether it would perform better than HeapSelect when k is a very small in comparison to n.

HeapSelect, by the way, has a couple of other advantages over QuickSelect and similar algorithms. First, it doesn’t modify the array. QuickSelect moves things around, which might be a bad thing. You can of course make a copy of the array and use QuickSelect on that, but then you’re using O(n) extra space. If you can’t modify the input array, then HeapSelect becomes much more attractive.

On a similar note, HeapSelect works if you don’t know the total number of items to be examined, or if you can’t hold the entire list of items in memory. Say, for example, you want the top 200 items from a file of a several billion records–more records than will fit in memory. HeapSelect shines in this situation because it can start at the beginning and go through the file sequentially. When it reaches the end of the file, the heap will contain the top 200 records. Another example is selecting the top n things that you see over time, for example records coming in over a network stream. With HeapSelect, you don’t need to keep track of every item–just the top k that you’ve seen so far.

The lesson here is that, whereas it’s important to know the theoretical complexity of the algorithm that you’re using, actual running times vary based on the implementation and on the nature of the data. In this case, the “inferior” HeapSelect outperforms QuickSelect when the number of items to select is small in proportion to the total number of items. QuickSelect is a better general-purpose selection algorithm and the one I would select if I could only have one, but HeapSelect is clearly better for a large number of real-world situations. It’s common to want the top .01 percent of items from a large list, and HeapSelect is four or five times faster than QuickSelect in that case.

It’s also important to remember that big O notation ignores constant factors. O(n) means that the running time will be proportional to n: for every order of magnitude change in n, the running time will increase by an order of magnitude. An O(n log n) algorithm is, theoretically, less efficient than an O(n) algorithm. However, big O doesn’t say, what the “proportion” is. For example, the real running time of that O(n) algorithm might be 100,000 * n whereas the real running time of the O(n log n) algorithm is 10 * n * log(n). The O(n log n) algorithm will be faster than the O(n) algorithm until 10 * log(n) exceeds 100,000.

So know your algorithms, but also know your data. When possible, select the algorithm that will give you the best performance for your expected application rather than the algorithm that gives the best overall performance.

Source for QuickSelect and HeapSelect are below.

QuickSelect source

static int QuickSelect(int[] items, int k)
{
    return QuickSelect(items, 0, items.Length - 1, k - 1);
}

static int QuickSelect(int[] items, int left, int right, int k)
{
    // get pivot position
    int pivot = Partition(items, left, right);

    // if pivot is less than k, select from the right part
    if (pivot < k) return QuickSelect(items, pivot + 1, right, k);

    // if pivot is greater than k, select from the left side
    else if (pivot > k) return QuickSelect(items, left, pivot - 1, k);

    // if equal, return the value
    else return items[pivot];
}

static int Partition(int[] items, int left, int right)
{
    int i = left;
    int j = right;

    var Swap = new Action<int, int>((l, r) =>
        {
            int temp = items[l];
            items[l] = items[r];
            items[r] = temp;
        });

    // pick the pivot point and save it
    int pivot;
#if MEDIAN_OF_THREE
    // Median of three optimization improves performance in general,
    // and eliminates worst-case behavior for sorted or reverse-sorted data.
    int center = (right + left) / 2;
    if (items[center] < items[left])
        Swap(center, left);
    if (items[center] < items[right])
        Swap(center, right);
    if (items[left] < items[right])
        Swap(left, right);
    // median of [left], [middle], [right] is now at [left]
#endif
    pivot = items[left];

    // until the indices cross
    while (i < j)
    {
        // move the right pointer left until value >= pivot
        while (items[j] < pivot && i < j) --j;

        // move the right value to the left position
        // increment left pointer
        if (i != j)
        {
            items[i++] = items[j];
        }

        // move the left pointer to the right until value < pivot
        while (items[i] >= pivot && i < j) ++i;

        // move the left value to the right position
        // decrement right pointer
        if (i != j) items[j--] = items[i];
    }
    // put the pivot holder in the left spot
    items[i] = pivot;

    // return pivot location
    return i;
}

HeapSelect source

 

// Used to keep track of total heap additions
static int NumHeapAdds = 0;
static int[] HeapSelect(int[] items, int k)
{
#if !CUSTOM_HEAP
    var heap = new BinaryHeap<int>();
    for (int i = 0; i < k && i < items.Length; ++i)
    {
        heap.Insert(items[i]);
        ++NumHeapAdds;
    }
    for (int i = k; i < items.Length; ++i)
    {
        if (items[i] > heap.Peek())
        {
            heap.RemoveRoot();
            heap.Insert(items[i]);
            ++NumHeapAdds;
        }
    }

    return heap.ToArray();
#else
    int[] resultHeap = new int[k];
    int heapCount = 0;

    var Insert = new Action<int>((newItem) =>
        {
            int i = heapCount;
            resultHeap[heapCount++] = newItem;
            while (i > 0 && resultHeap[(i - 1) / 2] > newItem)
            {
                resultHeap[i] = resultHeap[(i - 1) / 2];
                i = (i - 1) / 2;
            }
            resultHeap[i] = newItem;
        });

    var ReplaceRoot = new Func<int, int>((newItem) =>
        {
            int rslt = resultHeap[0];
            int tmp = newItem;
            int i = 0;
            while (i < heapCount / 2)
            {
                int j = (2 * i) + 1;
                if ((j < heapCount - 1) && (resultHeap[j] > resultHeap[j + 1]))
                {
                    ++j;
                }
                if (resultHeap[j] >= tmp)
                {
                    break;
                }
                resultHeap[i] = resultHeap[j];
                i = j;
            }
            resultHeap[i] = tmp;

            return rslt;
        });

    // put the first k items on the heap
    for (int i = 0; i < k && i < items.Length; ++i)
    {
        Insert(items[i]);
        ++NumHeapAdds;
    }

    // and then check the rest
    for (int i = k; i < items.Length; ++i)
    {
        if (items[i] > resultHeap[0])
        {
            ++NumHeapAdds;
            ReplaceRoot(items[i]);
        }
    }

    return resultHeap;
#endif
}


Whittle Pup takes five

Debra and I went to Zilker Botanical Garden on Sunday, and took Whittle Pup along. He got tired after wandering around for a while and asked to take a break with his frog buddies. The frogs didn’t show up, but Pup had a good time hanging out in the pond.

(click for larger image)

When asynchronous calls complete synchronously

The .NET Framework has a rich asynchronous programming model (APM) that’s about as close to “fire and forget” as you’re likely to want. True ”fire and forget” usually ends up being “fire and forget until it crashes,” so you want notification when an asynchronous task completes.

There are other asynchronous design patterns in .NET, including an event-based asynchronous pattern, the BackgroundWorker, and the Task Parallel Library, but the pattern I linked above has been around since the beginning and is still quite useful.

I call this the “callback design pattern.” The idea is that you tell the system, “perform this operation on a separate thread and call me back when it’s done.” In the example below, the MyThing class has BeginTask and EndTask methods that operate as recommended by the APM.

void DoSomething(MyThing thing)
{
    IAsyncResult ir = thing.BeginTask(TaskDoneCallback, thing);
    // Not doing anything with ir, yet.
}

// This method is called when the operation has completed.
void TaskDoneCallback(IAsyncResult ir)
{
    var thing = (MyThing)ir.AsyncState;
    var rslt = thing.EndTask(ir);
    // do something with the computed result
}

The idea here is pretty simple. The call to BeginTask starts the operation executing on a separate thread. When the operation is done, it calls TaskDoneCallback (still executing on that other thread), which can then process the result as required.

The above works great. Let me add a new wrinkle to make it more interesting.

Imagine now that you want the asynchronous task to run continually so that it can generate data whenever it’s available. What you want is for the callback to start a new asynchronous task, like this:

void TaskDoneCallback(IAsyncResult ir)
{
    var thing = (MyThing)ir.AsyncState;
    var rslt = thing.EndTask(ir);

    // Start a new async task
    thing.BeginTask(TaskDoneCallback, thing);

    // now do something with the computed result
}

Here, the callback function gets the results of a task, starts a new asynchronous task, and then processes the results returned by the first task.

Again, that all works great. You can call BeginTask once in your main program, and that task gets performed forever until you cancel it. Assuming, of course, that you’ve added the cancellation code.

Okay, so it works great most of the time. There’s an edge case in which this can lead to unbounded stack space usage.

You see, there’s the possibility that an asynchronous call could complete synchronously. That is, something in the runtime environment determines that there’s no need (or perhaps no ability) to spin up a new thread so that it can execute the task asynchronously. Instead, it will execute the task on the calling thread. When that happens, TaskDoneCallback is executed on the calling thread. If this happens repeatedly, then the main thread begins looping and consuming stack space. It behaves as though you’d written this:

void BeginDoTask()
{
    TaskDoneCallback();
}

void TaskDoneCallback()
{
    BeginDoTask();
}

That, as you well know, will blow the stack in short order.

The solution to this problem is kind of ugly:

void AsyncLoop()
{
    for ( ; ; )  // Infinite loop!
    {
        IAsyncResult ir = thing.BeginTask(TaskDoneCallback, thing);

        // see if it completed synchronously
        if (!ir.CompletedSynchronously)
            break;

        this.InvokeEnd(ir);
    }
}

void TaskDoneCallback(IAsyncResult ir)
{
    if (ir.CompletedSynchronously)
        return;

    this.InvokeEnd(ir);

    this.AsyncLoop();  // Start another async operation
}

void InvokeEnd(IAsyncResult ir)
{
    var thing = (MyThing)ir.AsyncState;
    var rslt = thing.EndTask(ir);

    // now do something with the computed result
}

If the method executes asynchronously, then things work exactly as I showed earlier. But if the method completes synchronously, the callback function executes immediately and the AsyncLoop loop calls InvokeEnd to process the data.

There is one other difference of note in this code: it processes the computed result before making the next asynchronous call. The result is if the “do something with computed result” takes any significant time, the program could lose data. This is a good argument for making your processing take as little time as possible. If it’s going to take more than a few milliseconds, you probably want to queue the item for processing by another thread.

In truth, I’ve never seen this infinite recursion problem happen, and off the top of my head I can’t imagine a real-world scenario in which it would be a problem. I can, however, contrive some cases. Although I can’t see myself writing code that would suffer from this problem (basically, writing a tight loop as a chain of asynchronous calls), I can imagine others doing it.

I’d put this one far down on the list of things that can go wrong with your asynchronous code. If you’ve written code that works as I described initially, then I wouldn’t worry about changing it to the new model unless all the other bugs in your program are fixed.

Pie server in mesquite

This is a pie server, carved from a piece of mesquite I picked out of the firewood pile. Total length is about ten inches. The blade is just a little over 2-1/4 inches at its widest point.

I mentioned a week or so ago that I was having difficulty cutting a straight line with the bandsaw. I still need a bit more practice, but the biggest help was getting a better blade. I was using a cheap 1/4″, 6 tooth-per-inch blade that I picked up at Lowe’s. I broke that blade on Saturday and replaced it with a 3/16″, 4 TPI blade that I had hanging on the wall. I thought that blade was bad, because I was having trouble with it. But my troubles were apparently due to the bandsaw being misaligned, and I fixed those problems last month.

That 3/16″ blade wasn’t cheap ($25 or $30), but it makes a huge difference. Not only does it cut straighter, it cuts cleaner and faster, even through a piece of mesquite that was 6″ thick. No more cheap blades for me.

I cut the rough shape of this pie server out on the bandsaw, used a knife to clean up the cuts a bit and shape the handle, then used a belt sander clamped upside-down on the bench to flatten the blade. The rotary tool finished shaping the handle, and then it was a bunch of hand sanding up to 600 grit before finishing with Howard Butcher Block Conditioner.

The cracks radiating from the knot were fairly deep. I filled them with super glue, sanded it smooth, and repeated the process twice more to get all the cracks filled in. I’m rather happy with the result.

 

Christmas dog

It’s Christmas ornament time again. I’ll be making quite a few this year, including a dozen or more of these little dogs.

Like most of my little dogs, this one is just a little over two inches tall. The wood is mesquite. The hat, nose, and tongue are painted with acrylics and the entire piece is finished with Howard Feed ‘N Wax (a mixture of orange oil and beeswax).

This particular dog is the prototype. It should have a little metal loop at the top so it can hang from the tree, but I didn’t drill the pilot hole deep enough and the brass screw eye I tried to insert broke off. I’ll be sure to drill a deeper hole in the rest of them.

Getting new videos from YouTube

YouTube has a very rich API that developers can use to get information about videos, post new videos, etc. Lots of Web sites use this API to do all kinds of wonderful things with YouTube videos.

So suppose you wanted to create a little application that works like an “all news” channel. But instead of just showing you CNN or MSNBC or whatever channel, it gathers news from lots of different channels and aggregates them. You then have an “all news all the time” channel that shows lots of different stories and will give you different viewpoints on the same story. Fox news, for example, will have a much different take on a story than will CNN or Al Jazeera.

The YouTube API gives you at least three ways to get new videos from a particular YouTube user. The simplest is to do a search and filter it by the YouTube author name. For example, this query:

http://gdata.youtube.com/feeds/api/videos?orderby=published&max-results=50&author=AssociatedPress&v=2

will give you the 50 most recent videos that were posted by the YouTube user “AssociatedPress”.

There’s a problem, though: the results in that feed will be delayed sometimes by more than 15 minutes. If I go to the AssociatedPress page on YouTube, I’ll often see videos there that do not show up in the feed returned by that query.

You can get up-to-date results by querying the individual user:

http://gdata.youtube.com/feeds/api/users/AssociatedPress/uploads?orderby=published&max-results=50&v=2

Using that query, I’m able to get the most recent videos. Anything that shows up in the YouTube page for that user also shows up in the results I get back from the query.

But that’s just one source! If I want my news channel to get data from 50 different sources, I’d have to poll each one individually. That’s not terribly difficult, but it takes time. YouTube places limits on how often I can poll for videos. To keep from being throttled, you have to wait at least 15 seconds (a minute or longer if you don’t have a developer API key) between requests to the API. So getting the latest videos from all 50 sources will take somewhere between 12 and 50 minutes. Perhaps longer.

Not many sources post a video every 12 minutes, so there’s a lot of waste involved in polling all the time. And 50 minutes is way too long to wait for an update. 50 minutes? It’s not news anymore!

There is a third way. Go to your YouTube account and subscribe to the sources that you’re interested in. According to the documentation, you can subscribe to up to 2,000 different sources, and whenever you go to your subscriptions page you’ll see the most recent videos from those sources.

The really nice thing is that you can then do a single query to get the most recent videos from all your sources:

http://gdata.youtube.com/feeds/api/users/YourUserName/newsubscriptionvideos?orderby=published&max-results=50&v=2

Replace YourUserName in the URL above with the name of the user who subscribed to the sources you want to see.

With this, you can poll YouTube once every five minutes and get the most recent videos from all of your sources.

Another benefit is that you don’t have to change your program if you want to add or remove sources. All you have to do is log in to YouTube and edit your subscriptions.

Reduce your bandwidth and get your videos more timely. Sounds like a great deal to me.

Fun with the bandsaw

Browsing in the store the other day, Debra showed me a set of bamboo toast tongs that she liked. I suppose I knew such things existed, but I never considered using them. I do, however, like a challenge.

So yesterday I pulled out the bandsaw (now on wheels, so it’s easy to move around) and experimented with a piece of scrap lumber. Sure enough, if I cut it right the wood would flex and not break.

I ended up making two sets of tongs. The set on top is made from mesquite and is nine inches long. The other set is mahogany, a little over eight inches long. The thing at the bottom of the picture is a simple spatula made from mesquite.

I’m beginning to believe that it’s impossible to cut a straight line with a bandsaw. At least, I haven’t figured out how, yet. Maybe I just need more practice.

The tongs are easy to make. I’ve made a simple pattern for a set that’s nine inches long. The hole in the center is 3/4″ in diameter, and the arms are 3/16″ wide. I made these from wood that’s 3/4″ thick.

Tape or glue the pattern onto a piece of wood. Then, clamp a piece of wood to the back side and drill the 3/4″ hole at the top. The second piece of wood is clamped there to keep the piece from splintering when you drill through it.

After you’ve drilled the hole, cut the outside pattern first. Then cut the inside. If you cut the inside first, the wood tends to flex a bit when you’re running it through the bandsaw to cut the outside.

If you want to create more than one set of tongs, you can save some wood by laying them out interleaved. I’ve created layouts for two sets and three sets. If you want to do more, just continue the pattern.

The Greatest Show on Earth

I just finished Richard Dawkins’ book The Greatest  Show on Earth, in which he explains the evidence for evolution. In fact, that’s the subtitle: “The Evidence for Evolution.”

I’ll mention here that I’ve long been a “believer” in evolution, although it turns out that my understanding of the theory and knowledge of the evidence were sadly lacking. Not to worry. The first few chapters corrected both of my errors.

The first part of the book explains the large concepts on which the theory of evolution is based: non-random natural selection of random mutations over a very long time. Evolution, as Dawkins points out multiple times in the first few chapters, is a slow process. There are no “revolutionary” changes over short time periods. It takes thousands of generations to evolve significant change, and even longer to evolve a new species.

After explaining the basic concepts, Dawkins begins on the evidence: how we know this is what has happened. As it turns out, the evidence for evolution is much stronger than I thought it was. We have radioactive clocks, DNA, tree rings, many different experiments, the fossil record, and many other sources of evidence, all of which agree on the fact of evolution. There might be disagreements on some particulars, but all of the evidence points to the conclusion that life on this planet evolved from one common ancestor.

Something that’s long bothered me about others’ reactions to science is their assertion that “it’s only a theory.” As a result, I was pleased to see that the title of the first chapter is ”Only a Theory?” In it, Dawkins’ explains exactly what a scientific theory really is.

The Oxford English Dictionary, as Dawkins points out, gives two definitions of the word “theory:”

  1. A scheme or system of ideas or statements held as an explanation or account of a group of facts or phenomena; a hypothesis that has been confirmed or established by observation or experiment, and is propounded or accepted as accounting for the known facts; a statement of what are held to be the general laws, principles, or causes of something known or observed.
  2. A hypothesis proposed as an explanation; hence a mere hypothesis, speculation, conjecture; an idea or set of ideas about something; an individual view or notion.

The “disconnect” when uneducated people say, “It’s only a theory,” is that they’re using the second definition of “theory”–the common usage that we hear every day. To them, a theory is an hypothesis or just idle speculation. A scientific theory, on the other hand, is much more rigorously defined. A scientific theory is the first sense of the word, above.

The theory of evolution is much more than idle speculation. It is an explanation of observed phenomena that has been confirmed by further observation and experiment. It is accepted by all serious scientists as accounting for the known facts.

The theory of evolution is no more and no less of a “theory” than the theory of gravity, the theory of continental drift, or Einstein’s special theory of relativity. All of these are based on observation and experiment, and serve as an explanation of the known facts.

One very important feature of a scientific theory is that it be disprovable. If you can find evidence that contradicts the theory, then at least part of the theory is incorrect. We’re seeing this right now with the recent experiments at CERN, where experimental results seem to show that some subatomic particles can travel faster than the speed of light, which contradicts Einstein’s special theory of relativity.

So far, all of the observations and experiments agree with evolutionary theory. There is no credible evidence to support any competing theories, and no evidence to contradict the idea that life on this planet evolved over billions of years, from very simple forms to the many and varied forms of life that you see today.

Dawkins spends some time (I think too much time) beating on what he calls “history deniers:” those people who, for one reason or another, deny the evidence for evolution. I’ll grant that I was amused by it early on in the book, but by the time I got to the end I was tired of hearing about it. There’s a fine line between showing how “competing ideas” don’t match the evidence, and ridicule. I think Dawkins crosses that line, and in my mind it detracts from the otherwise excellent explanations that he provides.

I sympathize with Dawkins’ desire to correct the collective ignorance that people cling to so dearly. If I didn’t run across it every day, I’d find it inconceivable that so many people deny the fact of evolution. It happened. It’s still happening!

I highly recommend The Greatest Show on Earth. If, like me, your understanding of the theory is based on the watered down explanation you got in your high school science class and the vehement denials that you get from others, then you’ll undoubtedly learn something from reading the book. It sure opened my eyes, giving a much better understanding of what the theory says happened, and the evidence that scientists use to back up that explanation.

 

MSDN “Cannot Service Request” error

As I noted in my previous post, I’ve been having some trouble accessing MSDN information from my browser. I thought I’d solved the problem, but it came back this morning. I had been viewing pages on MSDN for a couple of hours while working, and at some point I clicked on a link and got the “Cannot Service Request” page again.

This happened on Chrome, and when I went to Internet Explorer to see if I could get to the page, IE wouldn’t display it either. Firefox, again, had no trouble.

My solution for IE, then, was to go into Tools -> Options, and delete the browsing history. I just deleted everything. Restarted the browser, and I was able to visit MSDN again.

With Chrome, I thought I’d try to narrow down the problem. I first went into Options and selected “Empty the cache.” That didn’t solve the problem. So I checked “Empty the cache” and “Delete cookies and other site and plug-in data.” That worked!

Apparently, something in the MSDN site is setting a cookie or storing other site-specific data that at some point causes the JavaScript to throw up its hands.

This is almost certainly a bug with the MSDN JavaScript. But at least now I know how to treat the symptom. It ticks me off, though, that I have to delete all my cookies and site-specific data in order to keep using the site. Maybe next time the problem occurs, I’ll see if I can delete just the MSDN-specific stuff.

 

Can’t get help!

08/16/2011 – Problem solved. See below.

I don’t know why, but about a week ago I started getting a page that says “Unable to service request” whenever I try to get to Microsoft’s online documentation. At first it was just the MSDN documentation. Now it’s most any page that start with http://msdn.microsoft.com/.

This problem is very strange. If I try to visit the site with Google Chrome or Internet Explorer, I get the error. No problem with Firefox, though. It’s interesting to note that I haven’t fired up Firefox in many months. I wonder if my Chrome and IE installations are somehow corrupt?

I can access MSDN from other computers in the office, using IE or Chrome. And I can access it from my machine using curl, wget, and my own custom download program. It’s just IE or Chrome on my computer.

Any insight into the problem? I’m stumped. I suppose I could reinstall Chrome and see if that solves the problem.

Update 09/14: For Internet Explorer 8, I just had to turn on compatibility view. Perhaps it’s time to upgrade to IE 9? Firefox 3.6.18 (latest in the 3 series) works just fine.

Still no joy with Google Chrome. Others who want to view “old” sites with Chrome resort to silliness like running IE in a Chrome browser window. Whereas I’ll bow to the brilliance of the hack that makes such a thing possible, I can’t imagine why it should be necessary.

I wonder if this has been a problem for a long time, but has been hidden from me because the browser caches scripts. And, yes, I’m pretty convinced that the problem isn’t with the HTML so much as it is with scripts. If the MSDN site updated their scripts but didn’t change the version number, then pages could continue to work for a very long time.

I first experienced this problem on September 6. For several days prior to that I was doing a lot of JavaScript debugging, and had cleared the browser caches on IE and on Chrome in the process. I strongly suspect that doing so triggered this behavior. There’s no telling how long I was running with old scripts.

Update 09/16: After verifying that I could visit the site from my home computer, also running Chrome, and verifying that I was running the same version of Chrome at both sites, I had to conclude that it wasn’t the browser version. So here at the office this morning, I closed all of my Chrome windows and started up one. Then I went to Options and cleared all browsing data. Everything. Shut down Chrome, restarted it, and now I can see MSDN again.

So, if you run into this problem, Clear your browser cache! I thought I had done that. I might have cleared some of the data from the cache, but not all of it. So I just told it to delete all of the browsing data.

Demystifying Wood Grain

One issue that carvers of all skill levels continually fight with is wood grain. As beginning carvers we’re told to “carve with the grain.” Unfortunately, that advice often comes without an explanation of what “with the grain” really means. Many a beginner has understood that it means to carve parallel with the grain of the wood.

There’s more to it than that, as you’ll discover quite quickly when your knife gets drawn into the wood because you cut in the wrong direction. This can be very frustrating. I’ve ruined more than one carving by cutting in the wrong direction.

Brendant at The Old Stump blog has created a document, Demystifying Wood Grain, in which he uses some glued-up toothpicks to simulate the internal structure of wood so that he can explain and show what it means to carve with the grain. It’s a short but very informative read, which I think will benefit beginning and seasoned carvers alike.