Replacing a router with IPCop

For reasons that will become clear if you read further, I have three pieces of advice:

  1. Do not expect a home cable/DSL router like the Linksys WRT54G to work at capacity. It’s fine for a little Web surfing and some infrequent large downloads, but it’s going to fail intermittently at high duty cycles.
  2. If you’re thinking about installing an IPCop firewall, be sure to check the Hardware Compatibility List before you go out and buy network cards.
  3. Reset the cable modem.

A home cable/DSL router will work fine for general Web surfing, streaming media, downloading a file or two now and then, and for playing online games, but if you continuously run the thing at high bandwidth for long periods, it’s going to fail intermittently. It’s trivial to saturate a cable modem with my Web crawler. We were running 700 to 800 Kbps for hours on end, and the Linksys router would just stop responding after a while. The internal network would remain up, but the external interface would go dead. That was okay when I was having trouble keeping my crawler running for more than a few hours at a time, but once I got it running reliably it was very annoying to have the router go down.

Somebody recommended that I install an IPCop firewall. So I pulled an old box (1 GHz AMD processor with 1 GB of RAM) out of the closet, downloaded the IPCop software, and tripped down to Fry’s for a few gigabit Ethernet cards since we decided to upgrade the internal network in the process. I’m a reasonably bright guy. How hard can it be? Right?

I forgot that I was dealing with Linux. And not a modern Linux distribution, but rather a very limited custom version whose install is not quite fully baked. The first thing I learned is that IPCop doesn’t support every network card in the world like Windows and the more general Linux distributions do. My own fault, really. I should have checked the hardware compatibility list.

I like Linux. Really. And IPCop is an incredibly well done piece of software. Once you get it installed. It’s kind of disappointing that the installation instructions, detailed as they are, are also somewhat cryptic in places. In particular, the instructions don’t tell you that when the install probes for network cards, it’s probing for a network card–not all of the network cards. And the probe identifies the chip manufacturer rather than the board manufacturer. It’s a bit disconcerting when you have a Linksys card and an Intel card in the machine, and the probe says that it found a DEC Tulip something or other.

I struggled with it for a while (par for anything new I do with Linux) and finally got it working in the test environment–with the IPCop RED interface hooked to the router, and the GREEN interface connected to a switch with another computer–simulating the final installation. It all worked great, so I plugged the cable modem into the IPCop box.

IPCop couldn’t see the RED interface. So I re-read the documentation, checked all the settings two or three times, and then sat back to scratch my head. Between putting out other fires and scouring the Internet for a solution, it was late afternoon before I stumbled onto the answer: reset the cable modem.

Apparently, the cable modem registers the MAC address of the first device it sees, and only that MAC address can directly access the modem without resetting–turning off the power and waiting a few minutes. I’m not sure why that’s the case, but it appears to be. I powered down the modem, waited a few minutes, and then brought everything back up. Success!

It’s now a little after midnight. The IPCop machine has been up for almost 8 hours and the crawler has been banging on it continuously. The firewall hasn’t burped, and we’re getting better throughput now than we did with the flaky router. So far, I’m highly impressed. I’ll have more to say about IPCop after I’ve worked with it for a few days.

MySQL Server has gone away

I got an error message from MySQL last week that said, “The MySQL server has gone away.” I thought it an odd message, but since I was changing settings, configuring a driver, and experimenting with my program, I figured that my fiddling had somehow caused me to execute a query against a closed connection. So I ignored the error and went about my business. Since the error didn’t crop up again, I forgot all about it.

The error came back yesterday and camped on my front door. A quick search of the MySQL Reference Manual revealed the topic “MySQL server has gone away” in an appendix. Here are some of the things that can cause this error message:

  • The server timed out and closed the connection.
  • Attempting to execute a query against a closed connection.
  • You don’t have sufficient privileges to execute the query.
  • The server, the TCP connection, the Windows client, or something else timed out.
  • Sending a query that is too large.
  • Attempting to execute an incorrect query.
  • You have mismatched server and client versions.
  • Attempting to use the same connection from multiple threads.
  • A blocked port on the firewall.
  • A bug in the server.

Helpful, isn’t it? For all the good it does me in locating the problem, the message might just as well have said, “Something bad happened.” I spent an inordinate amount of time checking timeout values and permissions, checking my query for validity, verifying client versions, making sure that my multi-threaded program wasn’t trying to re-use a connection, and finally writing diagnostic code to check the length of my queries against the maximum allowed length. That was the problem, by the way: a very large query.
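For what it’s worth, the diagnostic check can be very small. Here is a minimal sketch in Python, assuming the limit in question is the server’s `max_allowed_packet` setting (which you can look up with `SHOW VARIABLES LIKE 'max_allowed_packet'`). The helper name and the default value are mine, purely for illustration, and the comparison is approximate because the wire protocol adds a little overhead on top of the query text.

```python
# Hypothetical helper; not part of MySQL or any client library.
# Assumed default of 1 MB. The real value comes from the server:
#   SHOW VARIABLES LIKE 'max_allowed_packet'
DEFAULT_MAX_ALLOWED_PACKET = 1024 * 1024


def query_size_report(query: str,
                      max_allowed_packet: int = DEFAULT_MAX_ALLOWED_PACKET) -> dict:
    """Compare the encoded size of a query with the server's packet limit."""
    # The limit is expressed in bytes, so measure the encoded query,
    # not the number of characters.
    size = len(query.encode("utf-8"))
    return {
        "bytes": size,
        "limit": max_allowed_packet,
        "too_large": size > max_allowed_packet,
    }
```

Logging the result just before each query is sent would have pointed at the oversized query long before I finished checking timeouts and permissions.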

This is asinine. Why couldn’t MySQL tell me the exact error message? It looks to me like some lazy programmer decided to save himself time at my (and others’) expense.

On a related note, the MySQL ODBC driver for Windows is a huge disappointment. First, there’s no 64-bit version, forcing me to make my data broker a 32-bit program. I’d be more understanding if this were 2005, but 64-bit versions of Windows have been out for at least two years. Isn’t it about time somebody upgraded that ODBC driver?

The ODBC driver also has a very nasty UTF-8 bug that is being ignored by the developers.

I’ve heard good things about MySQL, but I’m not yet favorably impressed. I can’t argue with the price, though, and for all the headaches it does seem reliable. I haven’t lost any data. Yet?

Slow Buffalo LinkStation

I bought a Buffalo LinkStation last summer and installed it on the office network. David and I used it a bit for a while, but then our needs changed and it mostly sat there idle. I’d back up the odd file to it from time to time, but that was about it. Then one day I noticed that it was unbelievably slow copying files. It would copy the first few megabytes no problem, but then it’d slow to a crawl. It took most of a day to copy a few gigabytes of data. So I stopped using the drive altogether.

I set it up on the home network this morning, hoping that maybe it just didn’t like the office network. No such luck. Still abysmally slow. So I did what I should have done last fall: see what Google has to say about it. And sure enough, the first search hit for “buffalo linkstation slow” was the “Slow Buffalo LinkStation” blog post over at ManCave. Although his suggestion of checking the network speed didn’t pan out, several of the other posters suggested disabling the print function and clearing the print queue. Doing that solved the problem, and now I have a place on the network where Debra and I can store shared files.

On a related note, 250 gigabytes doesn’t seem nearly as vast today as it did when I bought that LinkStation for $250 or more last summer. Yesterday I paid $120 for a 500 gigabyte Maxtor external USB drive. 24 cents per gigabyte. Storage is essentially free.

Fast, Neat, Average

It’s frightening sometimes what the brain retains.

One of the many make-work jobs a fourth class cadet at the Air Force Academy has to endure is filling out the Form O-96 (that’s “Form oh dash nine six”) at the end of every meal in Mitchell Hall. Ostensibly, the form is there so that cadets can rate the food and the meal service, and also so that fourthclassmen can be indoctrinated in the proper way to fill out an official government form. Like most things military- or government-related, it becomes a ritual that serves little real purpose but is continued because that’s the way it’s always been done. I suspect there’s one person in Mitchell Hall whose sole job is to file the forms. Whether they read them and make recommendations based on the contents is an open question.

The other night I wanted to kick my feet up on the desk and work on the laptop. Since the laptop gets uncomfortably warm sitting directly on my legs, I typically pull a book off the shelf and place the laptop on top of it. The book I pulled off the shelf Monday was my 1980 Basic Cadet Training “yearbook” that I hadn’t looked at in years. I started flipping through the pages, seeing old familiar faces and recalling the different parts of training. On the page about Mitchell Hall, there was a large heading: “Fast, Neat, Average.” My brain immediately supplied the rest: “Friendly, Good, Good.”

You see, the Form O-96 has six multiple-choice questions and two comments boxes. “Fast, Neat, Average, Friendly, Good, Good” were the standard boxes to check if there were no comments from the upperclassmen.

I hadn’t thought of the Form O-96 or “Fast, Neat, Average” for at least 20 years, and yet the proper response popped into my head immediately when I saw that line in the book. I wonder what else is lurking in my head, ready to spring out and surprise me at the least opportune moment.

Documentation headaches

You’d think that published documentation would be checked for correctness.

I’m just getting started writing the code that parses files that use Microsoft’s Advanced Systems Format (ASF), and I ran across a rather confusing bit of documentation.

The Header Extension Object contains 46 bytes of header information and then a variable-length byte array of additional data. In the description of this object, there are these two tidbits:

Object Size – Specifies the size, in bytes, of the Header Extension Object. The value of this field shall be set to 46 bytes.

Header Extension Data Size – Specifies the number of bytes stored in the Header Extension Data field. This value may be 0 bytes or 24 bytes and larger. It should also be equal to the Object Size field minus 46 bytes.

This is plainly impossible. If Object Size must be 46, then Header Extension Data Size has to be zero? But then you can’t have any data. Odd, that.

I’ll have to examine a file that contains one of these objects to be sure, but I suspect the Object Size field will contain the size of the entire record and Header Extension Data Size will indeed be (Object Size) – 46.
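Here is a rough sketch, in Python, of the check I have in mind. It assumes the fixed 46 bytes break down the way I read the spec: a 16-byte object GUID, a 64-bit Object Size, a 16-byte reserved GUID, a 16-bit reserved field, and a 32-bit Header Extension Data Size, all little-endian. The function name and the keys in the result are my own inventions; the code only reports which reading of the documentation a given object matches.

```python
import struct

# Fixed-size portion of the Header Extension Object, as I read the ASF spec
# (an assumption until verified against real files):
#   Object ID (GUID, 16 bytes), Object Size (QWORD),
#   Reserved Field 1 (GUID, 16 bytes), Reserved Field 2 (WORD),
#   Header Extension Data Size (DWORD); all little-endian, no padding.
HEADER_EXT_FORMAT = "<16sQ16sHI"
HEADER_EXT_FIXED_SIZE = struct.calcsize(HEADER_EXT_FORMAT)  # 46 bytes


def parse_header_extension(buf: bytes) -> dict:
    """Parse the fixed fields and report which reading of the spec fits."""
    if len(buf) < HEADER_EXT_FIXED_SIZE:
        raise ValueError("buffer too short for a Header Extension Object")
    _object_id, object_size, _reserved1, _reserved2, data_size = \
        struct.unpack_from(HEADER_EXT_FORMAT, buf)
    return {
        "object_size": object_size,
        "data_size": data_size,
        # Literal reading: Object Size is always 46.
        "size_is_literally_46": object_size == HEADER_EXT_FIXED_SIZE,
        # My guess: Object Size covers the fixed fields plus the data.
        "size_covers_whole_object":
            object_size == HEADER_EXT_FIXED_SIZE + data_size,
    }
```

Run against a real file, an object with a non-zero data size would show which of the two statements in the documentation is the wrong one.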

I can understand making a mistake in documentation. I’ve certainly made my share. But I can’t be the first person to come across this error since the ASF documentation was last revised in December of 2004. Why hasn’t Microsoft updated their document?

What kind of music is that?

My long-term work project involves reading music and video files to extract metadata–text information that is embedded in the file. A .MP3 music file, for example, often has a lot of information in it: the song title, artist’s name, publisher, musician credits, and even lyrics. One of the metadata tagging standards includes a field for “genre”, which can take one of 126 different values:

0.Blues
1.Classic Rock
2.Country
3.Dance
4.Disco
5.Funk
6.Grunge
7.Hip-Hop
8.Jazz
9.Metal
10.New Age
11.Oldies
12.Other
13.Pop
14.R&B
15.Rap
16.Reggae
17.Rock
18.Techno
19.Industrial
20.Alternative
21.Ska
22.Death Metal
23.Pranks
24.Soundtrack
25.Euro-Techno
26.Ambient
27.Trip-Hop
28.Vocal
29.Jazz+Funk
30.Fusion
31.Trance
32.Classical
33.Instrumental
34.Acid
35.House
36.Game
37.Sound Clip
38.Gospel
39.Noise
40.AlternRock
41.Bass
42.Soul
43.Punk
44.Space
45.Meditative
46.Instrumental Pop
47.Instrumental Rock
48.Ethnic
49.Gothic
50.Darkwave
51.Techno-Industrial
52.Electronic
53.Pop-Folk
54.Eurodance
55.Dream
56.Southern Rock
57.Comedy
58.Cult
59.Gangsta
60.Top 40
61.Christian Rap
62.Pop/Funk
63.Jungle
64.Native American
65.Cabaret
66.New Wave
67.Psychadelic
68.Rave
69.Showtunes
70.Trailer
71.Lo-Fi
72.Tribal
73.Acid Punk
74.Acid Jazz
75.Polka
76.Retro
77.Musical
78.Rock & Roll
79.Hard Rock
80.Folk
81.Folk-Rock
82.National Folk
83.Swing
84.Fast Fusion
85.Bebob
86.Latin
87.Revival
88.Celtic
89.Bluegrass
90.Avantgarde
91.Gothic Rock
92.Progressive Rock
93.Psychedelic Rock
94.Symphonic Rock
95.Slow Rock
96.Big Band
97.Chorus
98.Easy Listening
99.Acoustic
100.Humour
101.Speech
102.Chanson
103.Opera
104.Chamber Music
105.Sonata
106.Symphony
107.Booty Bass
108.Primus
109.Porn Groove
110.Satire
111.Slow Jam
112.Club
113.Tango
114.Samba
115.Folklore
116.Ballad
117.Power Ballad
118.Rhythmic Soul
119.Freestyle
120.Duet
121.Punk Rock
122.Drum Solo
123.A capella
124.Euro-House
125.Dance Hall
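
For the curious, here is a minimal sketch in Python of how a program might pull that genre out of a file. It assumes the tagging standard in question is ID3v1, which stores a fixed 128-byte block at the very end of the file: the marker "TAG" first and a single genre byte last. The function names are hypothetical, and the lookup table holds only a handful of entries from the list above to keep the example short.

```python
# A few entries from the list above; a real table would hold all 126.
GENRES = {0: "Blues", 17: "Rock", 39: "Noise", 117: "Power Ballad"}

# Assumption: ID3v1, a fixed 128-byte block at the end of the file.
ID3V1_TAG_SIZE = 128


def genre_from_id3v1(tag: bytes) -> str:
    """Return the genre name for a 128-byte ID3v1 tag."""
    if len(tag) != ID3V1_TAG_SIZE or tag[:3] != b"TAG":
        raise ValueError("not an ID3v1 tag")
    genre_id = tag[127]  # the last byte holds the genre index
    return GENRES.get(genre_id, f"Unknown ({genre_id})")


def read_id3v1_tag(path: str) -> bytes:
    """Read the last 128 bytes of a file (fails on files shorter than that)."""
    with open(path, "rb") as f:
        f.seek(-ID3V1_TAG_SIZE, 2)  # 2 means "relative to end of file"
        return f.read(ID3V1_TAG_SIZE)
```

Calling `genre_from_id3v1(read_id3v1_tag("song.mp3"))` on a tagged file would give back one of the names above, or the placeholder for any index the table doesn’t cover.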

Looking at the list, I got to wondering what some of that music is. And I also got to wondering if people really know enough to classify their music correctly. For example, please give me a brief explanation of the differences between Classic Rock, Rock, Alternative Rock, Instrumental Rock, Southern Rock, Rock & Roll, Hard Rock, Folk-Rock, Gothic Rock, Progressive Rock, Psychedelic Rock, Symphonic Rock, Slow Rock, and Punk Rock. Why is “Acid” not “Acid Rock”? What’s the difference between Gothic and Gothic Rock?

And then there’s Industrial, Techno, Euro-Techno, and Techno-Industrial. The mind fairly boggles. Is Psychadelic (hey, they misspelled it, not me!) somehow different than Psychedelic Rock? And what, pray tell, is Trailer? I picture Billy Joe Bob sitting outside his trailer in a grungy t-shirt, holding a can of Keystone Light. What kind of music is he listening to?

Jungle? Is Jungle different than Tribal? I guess Trance is what hypnotists use? That’s somehow different, I suppose, than Meditative, which apparently people use to put themselves into a trance?

How many people can accurately differentiate between Disco, Soul, Funk, Pop/Funk, and Rhythmic Soul? And what the heck is a Power Ballad? Is that somebody serenading his electric generator?

I especially like #39: Noise. We could probably halve the genre list by reclassifying things like Disco, Death Metal, Gothic, Bebob, Avantgarde, … oh, forget it. Just make everything “Noise” and be done with it. No matter what kind of music you have, somebody is going to call it noise.

Interesting that there’s no “Baroque” classification. Is that because nobody listens to Baroquen records?

I doubt that even music publishers can agree on what “genre” many types of music fit into. And some music fits into many of these categories. A song could be Rap and Top 40, for example. It’s pretty obvious that this list of genres was intended for people to use in their own music collections, the idea being that the person would know what he means when he calls something “Lo-Fi” (whatever the hell that is). But when these private collections start escaping onto the Internet where the whole world can see them, confusion ensues.

Excuse me now while I go Rave to a Fusion of Booty Bass and Porn Groove.

Paranoid license agreements

Part of my project involves extracting metadata from media files. .MP3 music files, for example, often contain the song title, album, artists, and other information (including lyrics, sometimes). Not all media formats allow for metadata, but the major ones do, including Microsoft’s Advanced Systems Format (ASF) that is used by the .WMA and .WMV (Windows Media Audio/Video) files.

You can download a copy of the ASF Specification from Microsoft’s Web site. It’s a 100-page Word document that (I think) fully describes the format. From it, you should be able to write a program that will read and write files that use ASF. If you’re writing a Windows program, you can also use the Windows Media SDKs, but that’s not an option if you’re trying to write a program that will run on other operating systems.

Surprisingly (or perhaps not so surprisingly, considering the source), the specification document contains a three-page End User License Agreement to which you implicitly agree by “downloading, copying, or otherwise using the Specification.” Fine, right? It’s only a specification. What kind of silliness could they possibly put in there? But, seeing as how this is going to form a critical part of my project, I figured I’d better read it.

Under the heading DESCRIPTION OF ADDITIONAL LIMITATIONS, I found the following:

You may not provide, publish or otherwise distribute the Specification to any third party. Further, you shall use commercially reasonable efforts to ensure that the use or distribution of your Solutions, including your Implementations as incorporated into your Solutions, shall not in any way disclose or reveal the information contained in the Specification.

If I read that correctly, it’s saying that I cannot reveal the source code of my implementation. So much for open source .WMA/.WMV readers.

If there’s any doubt, consider this further restriction:

For a variety of reasons, including without limitation, because you do not have the right to sublicense the Necessary Claims, your license rights to the Specification are conditioned upon your not creating or distributing your Implementations in any manner that would cause ASF (whether embodied in your Implementation or otherwise) to become subject to any of the terms of an Excluded License. An “Excluded License” is any license that requires as a condition of use, modification and/or distribution of software subject to the Excluded License, that such software or other software combined and/or distributed with such software be (x) disclosed or distributed in source code form; (y) licensed for the purpose of making derivative works; or (z) redistributable at no charge;

Gosh, could they be talking about the GPL?

For reasons I’d rather not get into here, I’m not a huge fan of the GPL and I can understand (although not completely agree with) somebody saying, “You can’t use my stuff in GPL programs.” But the other restriction–preventing me from revealing source code that implements a specification that is available to anybody without restriction–is, at best, silly. Most reasonable people would call it paranoid. With that license agreement, Microsoft has completely eliminated the threat of open source implementations supporting their .WMA and .WMV formats. But they’ve also eliminated a huge potential audience. I’m surprised that providers don’t see this and refuse to provide Windows Media versions of their content.

The restrictions don’t particularly affect me or my project, as we’re not going to be distributing source code, but I can’t imagine why Microsoft included this restriction. Do such restrictions exist for their other specifications?

Norton Antivirus

I’d never used anti-virus software on my personal computer until I bought this notebook two years ago. I’d never had trouble with worms, viruses, trojans, or other malware, but people I trust and respect convinced me that I was just lucky–that things could get through my Linksys firewall and the Windows firewall and infect my machine. So I picked up a copy of Norton AntiVirus at Fry’s and installed it.

I was planning to renew online this year, but I was at Fry’s the other day and noticed something interesting: Norton AntiVirus, 3-user license, on sale for $49.99. And there’s a $50.00 mail-in rebate! So if I buy that and mail in the rebate form, I end up getting the software for about $4.25 (the sales tax). Such a deal!

Symantec did a very good job with the install program, except for one thing. It does a preliminary system scan and then needs to reboot. Here’s the notification:

Who edits this stuff? First it was Apple’s idiotic insistence that users shouldn’t have to answer “Yes” or “No.” That gave us prompts with questions and two buttons: “OK” and “Cancel.” Now we have the opposite problem: “Yes” and “No” options with no question. It is to cry.

I installed Norton on Monday night. Last night I was doing some writing on the machine and it was terribly slow. I opened Task Manager and found that a program, appsvc32.exe, was consistently chewing up from 60 to 100 percent of the processor time. A quick search online reveals that appsvc32.exe is part of Norton AntiVirus, and that this is a known problem. The following seems to solve the problem while Symantec figures out how to fix it properly:

  1. Bring up the Norton Protection Center.
  2. Click on the Norton AntiVirus tab.
  3. Click on the Settings bar to expand the settings menu.
  4. Click on Auto-Protect, and then the Configure button that pops up.
  5. On the left side of the dialog box that pops up, click on the General Settings link.
  6. Clear the “Scan active programs and start-up files” checkbox, as shown here.