Monthly Archives: April 2007

You want it when?

The web crawler I’m working on, as I’ve mentioned before, is a distributed application. Currently it consists of a URL Server and multiple Crawlers. The basic idea is that the URL Server is a traffic director that tells each Crawler … Continue reading

Posted in Programming, Web Crawling | Comments Off

Credit Card Fraud

Debra called while I was on my way to lunch this afternoon. Somebody purporting to be the Capital One fraud department had left a message on the home answering machine saying that it was imperative that I contact them. They … Continue reading

Posted in Finance | 10 Comments

Build your own notebook flash drive

Further to yesterday’s entry about notebook flash drives, I forgot to mention the Addonics CF drive adapter. For $30, you get an adapter that plugs into your notebook’s IDE connector. Add a 16 GB compact flash card, and you have … Continue reading

Posted in Computers | Comments Off

Odds ‘n Ends

A few items that have been gathering dust here while I bang away on the crawler. Ever wonder what you could do with a terabyte of really fast storage? Check out the Tera-RamSan. I hope you have a big budget, … Continue reading

Posted in Odds ‘n Ends | Comments Off

Computers Update

After a little more than three weeks with the new computer, I’m mostly happy with it. It’s blindingly fast, both in processing and in disk access, and it really is whisper quiet. I got into the office very early the other day, about … Continue reading

Posted in Computers | Comments Off

Bloom Filters in C#

As I’ve pointed out before, writing a Web crawler is conceptually simple: read a page, extract the links, and then go visit those links. Lather, rinse, repeat. But it gets complicated in a hurry. The first thing that comes to … Continue reading

Posted in Programming, Web Crawling | Comments Off
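
The "read a page, extract the links, and then go visit those links" loop from the Bloom Filters in C# excerpt can be sketched roughly as follows. This is only an illustration, not the crawler described in the post: the in-memory `FAKE_WEB` table, the `extract_links` helper, and the `max_pages` limit are all hypothetical stand-ins, and a plain Python set is assumed for remembering which URLs have already been seen.

```python
from collections import deque

# Hypothetical in-memory "web": each URL maps to the links found on that page.
# A real crawler would download and parse pages instead.
FAKE_WEB = {
    "http://a.example/": ["http://b.example/", "http://c.example/"],
    "http://b.example/": ["http://a.example/", "http://d.example/"],
    "http://c.example/": ["http://d.example/"],
    "http://d.example/": [],
}


def extract_links(url, web=FAKE_WEB):
    """Stand-in for 'read a page, extract the links' (assumed helper)."""
    return web.get(url, [])


def crawl(seed, max_pages=100):
    """Visit pages breadth-first from `seed`; return URLs in visit order."""
    queue = deque([seed])   # URLs waiting to be visited
    seen = {seed}           # URLs already queued, so nothing is visited twice
    visited = []
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        visited.append(url)
        for link in extract_links(url):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return visited


if __name__ == "__main__":
    for page in crawl("http://a.example/"):
        print(page)
```

Even in this toy version, the `seen` set grows with every URL discovered, which hints at why the bookkeeping gets complicated at Web scale.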

A new file system primitive?

A problem I was working on recently got me to wishing that I could lop off the front of a file. Kind of like a “truncate at front,” if you will. Truncating a file at the back end is a … Continue reading

Posted in Programming | Comments Off
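
For context on the "truncate at front" idea: without such a primitive, a common workaround is to slide the remaining bytes toward the start of the file and then truncate at the back. The sketch below shows that workaround only. The excerpt is cut off, so this is not necessarily what the post goes on to discuss, and the `remove_front` name and `chunk_size` parameter are hypothetical.

```python
import os


def remove_front(path, count, chunk_size=64 * 1024):
    """Drop the first `count` bytes of the file at `path`, in place.

    Copies the tail of the file forward chunk by chunk, then truncates
    the leftover bytes at the end. Cost is proportional to the bytes kept.
    """
    size = os.path.getsize(path)
    count = min(max(count, 0), size)  # clamp to the valid range
    if count == 0:
        return  # nothing to remove
    with open(path, "r+b") as f:
        read_pos, write_pos = count, 0
        while read_pos < size:
            f.seek(read_pos)
            chunk = f.read(chunk_size)
            if not chunk:
                break
            f.seek(write_pos)
            f.write(chunk)
            read_pos += len(chunk)
            write_pos += len(chunk)
        f.truncate(write_pos)  # cut off the now-duplicated tail
```

The copy touches every byte that is kept, which is the cost a true front-truncation primitive would avoid.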