Jim’s Random Notes

March 9th, 2010

New removable drives

Update on my removable drive troubles.

I tried drilling holes in the case (after opening it and removing the drive, of course) on one of those Seagate FreeAgent drives.  Getting the thing apart was quite a chore, and I had a fun time making a mess drilling holes in the case.  The unit tested fine afterwards when copying small files to it, but it went unresponsive after about three gigabytes of the large file.  It’s difficult to say what went wrong.  I suspect that the USB-to-SATA electronics, which were marginal to begin with, finally gave up the ghost.  At some point I’ll pull out the 1 TB Seagate drive that’s in there and see if I can use it as a normal SATA drive.

Yesterday I picked up two Antec MX-1 external drive enclosures and fitted them with 500 GB drives.  I got them installed last night, and initial results are positive.  I’ve heard that there have been some fan failures with the Antec enclosures, but a search didn’t reveal an inordinate number.  For the price (about $55 each, with tax), I might pick up a third just to keep on hand in case a fan does fail.

The drive comes with USB and eSATA cables.  I was all ready to go eSATA until I discovered that my server doesn’t appear to have a spare SATA port inside.  I suppose I could go eSATA at the office and USB at the datacenter.  I might still do that, although it’ll have to wait until I can take down that office server.  It serves other important duties here, so I can’t just shut it down without affecting a lot of other things.

In any case, I think (hope) that my removable drive troubles are over, at least for a while.

March 3rd, 2010

More removable drive troubles

I’ve mentioned before that we use USB external drives for transportation of data from our colocation facility to the office.  After struggling to find reliable devices, we finally settled on the Seagate FreeAgent 1TB drives.  They’ve served us quite well for over a year now.  But recently it’s been taking a very long time to copy our data.

It used to take about three and a half hours to copy data (a couple hundred gigabytes) from the server to the removable drive.  Recently it’s been taking on the order of 10 to 12 hours.  At first I thought it was another idiotic problem with caching, similar to the problem I had copying large files between servers, except this copy would eventually complete.  The odd thing was that when I started the copy it would proceed at the expected rate and at some point slow to a crawl.

So I wrote my own program that reads a gigabyte at a time from the local drive and then writes it to the USB device, timing each write operation.  Running locally (at the office), the program reported a steady 24 MB/sec write speed, and copied the entire file at that rate.  Run at the data center copying the same file, the program reported the same 24 MB/sec for the first 20 gigabytes or so.  Then it slowed to about 4 MB/sec.
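The program itself isn't reproduced here, but the idea is simple enough to sketch. What follows is a minimal Python version, not the original code: the names timed_copy and report are made up for illustration, and the fsync call is my assumption about how to keep the operating system's write cache from hiding the real device speed.

```python
import os
import time

CHUNK_SIZE = 1 << 30  # read about a gigabyte at a time, as described above


def timed_copy(src_path, dst_path, chunk_size=CHUNK_SIZE):
    """Copy src to dst chunk by chunk; return (bytes, seconds) for each write."""
    timings = []
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            start = time.perf_counter()
            dst.write(chunk)
            dst.flush()
            os.fsync(dst.fileno())  # assumption: push data to the device before stopping the clock
            timings.append((len(chunk), time.perf_counter() - start))
    return timings


def report(timings):
    """Print the write rate of each chunk in MB/sec."""
    for i, (nbytes, seconds) in enumerate(timings):
        rate = nbytes / seconds / 1e6 if seconds > 0 else float("inf")
        print(f"chunk {i}: {rate:.1f} MB/sec")
```

Calling report(timed_copy(source, destination)) prints one rate per chunk, which is enough to see the point at which a drive drops from its normal speed to a crawl.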

That smacks of a thermal problem.  Either the drive electronics or the server’s USB port was overheating.  I quickly eliminated the server’s USB port as the problem by hooking up a different USB device and checking to see that the server could pass more than 50 gigabytes of data without trouble.

So the problem is with the FreeAgent drive.  If you spend a little time searching online, you’ll see that other people have experienced overheating problems with the FreeAgent drives.  And looking at the design, I can see why:  the only ventilation is at the bottom of the device where the electronics are.

[Image: drive]

The picture on the left, above, shows the drive as we typically would place it in the rack at the data center.  It sits on top of one of our servers.  The spot where it’s sitting is directly above one of the disk drives.  That spot was cool to the touch when I tested it yesterday.  Note, however, that you can’t see any ventilation holes.  Those are on the other side of the enclosure, as shown by the red arrow in the picture to the right.

Since air enters the cabinet from where I was standing taking this picture, and flows towards the back, mounting the drive as shown on the left doesn’t allow for very good airflow.  So yesterday I placed the drive in the cabinet as shown on the right.  Then I ran my test program.  I was able to write about 90 gigabytes before the drive slowed down.  I’m convinced now that it’s a thermal problem.

I don’t quite know where to go from here, though.  I think the first thing I’ll try is lifting the drive higher off the surface it’s sitting on.  That should allow for better airflow, and perhaps will be enough to keep the electronics cool.  (The problem, according to what I’ve found online, appears to be the USB to SATA conversion electronics at the base of the drive enclosure.)  If changing the drive location doesn’t solve the problem, I’ll have to find a different model of removable drive that has better ventilation or better heat tolerance.  Perhaps it’s time to visit Fry’s and see about buying an enclosure that’s designed for use in the warm environment of a server rack.

December 27th, 2009

Infected!

Updated.  See below.

I don’t know how, but I somehow managed to get the Malware Defense “anti-spyware” program on my system at home.  Fortunately for me, it doesn’t do anything malicious like delete files or install botnet software.  It just continually pops up virus warnings and offers opportunities to install.  For a price, of course.  If you pay, they go away.

The removal instructions I came across weren’t complete, as I completed those steps, rebooted the system, and the thing came right back.  I finally tracked down and eliminated the richtx64.exe trojan, which I think is what was re-running Malware Defense.

I’ve been running my computer for years without any kind of active anti-virus or such, and this is the first time I’ve ever been infected.  Now I’m not sure what to do.  I certainly won’t go back to Norton after the troubles I’ve had with them, and I don’t hear good reports about McAfee’s offering, either.  Is there a good anti-virus, anti-malware package that works, is inexpensive, and doesn’t take inordinate amounts of CPU time?

Update 12/28:

It took a while, but with some research and downloading and running a few cleanup utilities, it looks like I was successful in disinfecting the computer.  The thing kept getting re-infected whenever I’d reboot, and it would prevent me from installing or running common anti-malware utilities.  I found a program called rkill that kills common malware processes, and then I could install and run cleanup software.  This morning, a complete scan with Malwarebytes’ Anti-Malware reported zero problems.  I then installed Microsoft Security Essentials from a file that I downloaded from a different (uninfected) computer.  It reports no problems.

Darrin Chandler brings up an interesting point in the comments:  it’s all a matter of weighing the risks.  I’ve gone years without any kind of malware problems.  Even when I had anti-malware applications installed, they never reported that they’d blocked anything.  And those programs are very quick to notify whenever they see anything even vaguely suspicious.  So, as Darrin points out, my risk of being infected is pretty small.  However, the cost of being infected is fairly high.  It cost me most of a day to get rid of it.  And I was fortunate that it doesn’t seem to have deleted any files.  I have no idea if it copied anything from me.  I’m not too worried since I don’t keep financial information on this machine.

I’m hoping that Microsoft Security Essentials works well and doesn’t cause problems by being too chatty or sucking down too many resources.  We’ll see how it goes.

October 28th, 2009

Two useful, one marginal

I recently had the need to delve into the world of JSON (JavaScript Object Notation) to read some data from a particular Web site. For my purposes, the simple JSON reader provided by .NET worked just fine. The way it works is interesting: you call JsonReaderWriterFactory.CreateJsonReader, and it returns an XmlReader instance. That’s right, it converts the JSON to XML behind the scenes. Apparently there are some limitations in how it handles nested structures, but I didn’t encounter them. That’s useful thing #1.

I discovered useful thing #2 when my XmlReader threw an exception trying to parse the JSON I fed it. I originally thought that the problem was with the JSON-to-XML conversion. But then I fed the JSON to JSONLint.  It turns out that the string "It\'s an error" contains an error.  Escaping the apostrophe is an error in JSON.  There are only a handful of characters that can be legally escaped.  It’s nice to know that the site was in error and not my JSON-to-XML converter.  Either way, I still have to gracefully handle the error.
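The same check is easy to reproduce without a Web tool. This is a small Python sketch (not the .NET code I was using) that leans on the standard json module, which enforces the escape rules; the helper name is_valid_json is just for illustration. The escapes JSON allows inside a string are \", \\, \/, \b, \f, \n, \r, \t, and \u followed by four hex digits.

```python
import json


def is_valid_json(text):
    """Return True if text parses as JSON, False if the parser rejects it."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


# A raw string keeps the backslash, so this is the JSON text "It\'s an error"
bad = r'"It\'s an error"'

# The same text with a plain, unescaped apostrophe
good = "\"It's an error\""

print(is_valid_json(bad))   # the escaped apostrophe is rejected
print(is_valid_json(good))  # the plain apostrophe is fine
```

Catching the decode error like this is also one way to handle bad input gracefully: log the offending document and move on rather than letting the exception escape.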

I had hoped to use the Windows command FINDSTR as a substitute for grep.  No such luck.  FINDSTR has two problems that make it marginally useful at best.  First, there’s no switch that corresponds to grep’s --only-matching (-o) option.  If you specify --only-matching, then grep outputs only the text that matches the query expression rather than outputting the entire line that contains the match.  FINDSTR lacks that option, making it useless for many of the things I do.
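For what it's worth, the behavior of --only-matching is easy to approximate in a few lines of Python when grep isn't available. This is only a sketch of the idea, not a grep replacement, and the function name only_matching is mine.

```python
import re


def only_matching(pattern, lines):
    """Yield each matched substring (like grep -o) instead of the whole line."""
    regex = re.compile(pattern)
    for line in lines:
        for match in regex.finditer(line):
            yield match.group(0)


sample = ["see a.xml and c.xml here", "nothing to find on this line"]
for hit in only_matching(r"\w+\.xml", sample):
    print(hit)
```

Each match is printed on its own line, and lines with no match produce no output at all.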

The other problem is very odd.  Both grep and FINDSTR are line-oriented tools.  But FINDSTR’s definition of a line is inconsistent when working with files whose lines end with just a line feed.  For example, if I’m looking for all lines that contain the text “.xml”, I’d write this:

FINDSTR /R "\.xml" file.txt

The /R switch tells FINDSTR to treat the search string as a regular expression.  I could have done a literal search in this instance, but I want to illustrate the error.  FINDSTR correctly finds and outputs all of the lines that contain the string “.xml”.

What I really want, though, is just those lines that end with “.xml”. So the command would be:

FINDSTR /R "\.xml$" file.txt

FINDSTR doesn’t find any lines that end in “.xml” unless I convert the file so that it has CR/LF line ends. grep correctly handles both line end conventions. Since I can’t guarantee the format of the files I work with (I often am working with files that I download with wget), FINDSTR is practically useless if I’m doing regular expression searches.
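To show what I mean by handling both conventions, here is a short Python sketch that finds lines ending in a pattern whether the file uses LF or CR/LF. It is an illustration, not how grep is implemented, and lines_ending_with is a made-up name. Note that str.splitlines also splits on a few other separators, such as form feed, which is fine for the text files I deal with.

```python
import re


def lines_ending_with(pattern, text):
    """Return lines whose end matches pattern, for LF or CR/LF line ends."""
    regex = re.compile(pattern + r"$")
    # splitlines() removes the line terminator, whichever convention was used
    return [line for line in text.splitlines() if regex.search(line)]


mixed = "a.xml\nb.xml.bak\r\nc.xml\r\n"
print(lines_ending_with(r"\.xml", mixed))
```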

My advice:  download GNU Grep for Windows.

October 21st, 2009

Sniffing network traffic

My latest crawler modifications require me to scrape Web pages that host videos so that I can obtain metadata (title, description, date posted, etc.) that we place in our index.  Unfortunately, there’s no standard way for sites to present such information.  ESPN and Vimeo have HTML <meta> tags that provide some info, but I have to go parsing through the body of the document to find the date.  (And yes, I’m aware that Vimeo has an API that will make this a moot point.  I’ll be investigating that soon.)

Other sites are much worse in that they provide no metadata in the HTML.  For example, one site’s video page is very code-heavy.  Requiring that the page be reloaded every time you request a new video would require a lot of network traffic.  Their design instead uses JavaScript to request a particular video’s metadata from a server.  Loading a new video involves downloading just a few kilobytes of data.

I spent some time this afternoon searching through a video page’s HTML and the associated JavaScript, looking for the magic incantation that would get me the data I’m looking for.  The amount of code involved is staggering, and I quickly went cross-eyed trying to decipher it before I hit on the idea of hooking up a sniffer to see if I could identify the HTTP request that gets the data.

It took me all of five minutes to download and install Free Http Sniffer, request a video from the site in question, and locate the magic line in the 230 or so requests that the page makes when it loads.  Problem solved.  Now all I have to do is write code that’ll transform a video page url into a request for the metadata, and I’m set.
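The transformation itself is usually just string surgery. Here is a Python sketch of the general shape, with everything specific invented for the example: the host video.example.com, the /api/metadata path, the id parameter, and the assumption that the page URL ends in a numeric video id are all hypothetical and stand in for whatever the sniffer reveals on a real site.

```python
from urllib.parse import urlparse


def metadata_url(page_url):
    """Build a (hypothetical) metadata request from a video page URL."""
    parts = urlparse(page_url)
    # Assumption: the last path segment is the numeric video id
    video_id = parts.path.rstrip("/").rsplit("/", 1)[-1]
    if not video_id.isdigit():
        raise ValueError(f"no video id found in {page_url!r}")
    # Hypothetical endpoint; a real site will have its own path and parameters
    return f"{parts.scheme}://{parts.netloc}/api/metadata?id={video_id}"


print(metadata_url("http://video.example.com/watch/12345"))
```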

I have no idea why I didn’t think of the sniffer earlier.  I’d used one before for a similar purpose.  I suspect I’ll be making heavy use of it in the near future as I expand the number of sites that we crawl for media.

September 19th, 2009

A small change?

I’ve been programming computers for a long time.  Getting paid to write computer programs, even, which I thought was pretty darned funny when I first started.  People were paying me to do something that I loved.  But I digress.

After 30 years, you’d think that I would have learned that there’s no such thing as a small change that you can push into production code without having to test.  You might get away with it from time to time, but eventually that arrogance is going to cost you.

But, hey, it’s a simple change!  What could go wrong?

When you hear yourself say that, think about what you’re saying.  And then spend the few minutes it will take to test your assumption.  If nothing else, you’ll save yourself the embarrassment of explaining to your business partner that you made the kind of mistake that you’d reprimand an employee for.

Fortunately, all it cost me was a little embarrassment, a few hours’ lost sleep, and an additional hour of down time for the crawler.  I got off easy.

September 8th, 2009

Disk space is (almost) free

Today you can buy a one-terabyte Seagate drive online for $80, shipping included.  That works out to about 8 cents per gigabyte.  In August of 2003, I paid 80 bucks for a 120 GB drive:  about 67 cents per gigabyte.  According to The Inflation Calculator, $80 today is about the same as $70 in 2003.  So, adjusting for inflation, I got more than eight times as much storage for about $10 less money.
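The arithmetic is easy to check. A few lines of Python, using only the prices and capacities quoted above (the function name is just for illustration):

```python
def cents_per_gb(price_dollars, capacity_gb):
    """Price per gigabyte, in cents."""
    return price_dollars * 100 / capacity_gb


print(f"{cents_per_gb(80, 1000):.0f} cents/GB today")    # 8
print(f"{cents_per_gb(80, 120):.0f} cents/GB in 2003")   # 67
print(f"{1000 / 120:.1f} times the capacity")            # 8.3
```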

So how much is a terabyte, really?  If you’re into music, you can get one million minutes (694 days) of music on a terabyte drive, assuming a megabyte a minute (reasonable quality MP3).  VCR quality video is about 10 megabytes a minute, so you could get about 70 days of video.  DVD video is quite a bit more expensive, but you could still store 500 two-hour movies on your terabyte drive.

So what else can you do with a terabyte?  Consider:  human speech is historically recorded at 8,000 samples per second, requiring about 64 kilobits per second.  Current compression techniques can drop that to 8 kbits/sec with almost no perceptible loss in quality.  Figure one kilobyte per second.  A thousand seconds per megabyte.  A billion seconds in a terabyte.  How long is a billion seconds?  Google calculator says 31.69 years.
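Here are those back-of-the-envelope numbers worked out in Python. The rates are the ones given above; the only assumptions are decimal units (a terabyte as 10^12 bytes) and a 365.25-day year.

```python
TERABYTE = 10**12            # bytes, decimal units
MEGABYTE = 10**6
MINUTES_PER_DAY = 60 * 24
SECONDS_PER_YEAR = 365.25 * 24 * 3600

# Music at one megabyte per minute
music_minutes = TERABYTE // MEGABYTE
music_days = music_minutes / MINUTES_PER_DAY

# VCR-quality video at ten megabytes per minute
video_days = TERABYTE / (10 * MEGABYTE) / MINUTES_PER_DAY

# Compressed speech at about one kilobyte per second
speech_seconds = TERABYTE // 1000
speech_years = speech_seconds / SECONDS_PER_YEAR

print(f"music:  {music_minutes:,} minutes, about {music_days:.0f} days")
print(f"video:  about {video_days:.0f} days")
print(f"speech: {speech_seconds:,} seconds, about {speech_years:.2f} years")
```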

Imagine somebody with a voice recorder that’s always on.  Everything he says or hears is recorded and stored.  Furthermore, he has a program that can go through the recordings and create a phoneme transcript of every conversation.  That’s possible with current technology.  It’s even possible (with some errors) to identify individual speakers (i.e. Speaker 1, Speaker 2, etc.)  With a little human input, the program could identify speakers by name.  A little more work, all with technology available today, and a person could have a database of tagged transcripts containing every conversation he’s had.

I don’t know about you, but I’m a little uncomfortable with the idea that everything I say is subject to being recorded without my knowledge and reported at some point in the future.  Worse, given enough samples, a person with evil intent could easily construct a very convincing version of me saying things that I never said.  I don’t think I’m worth all that trouble, but there are plenty of people who are, and who perhaps should be somewhat concerned by the possibility.

August 27th, 2009

Looking for a Ghost Replacement

When describing the problems I was having configuring our new servers, I mentioned that I was going to try using Clonezilla to speed the process.  The idea was to get Windows installed and all the other software configured on one machine, and then just clone the drive.  Seemed like a good thing to do.

So I fired up Clonezilla, fought through the user interface to tell it what I wanted backed up and where, and then pressed the any key (really!  There was a prompt that said, “Press the any key”) to start the copy.  Clonezilla promptly told me that my network card wasn’t supported.  It would have been nice if it had checked that when I first started the program.

Slightly discouraged but not yet willing to give up, I decided to try PING.  Another cryptic user interface, but I won’t complain too much considering the price.  This time my network card was supported and after a couple of hours it had created a copy of my partition.  So I fired up the next machine, ran PING, told it to copy the partition image to the disk.  That went well, too.  Except that after I was done, the machine wouldn’t boot.  The BIOS doesn’t see a bootable image on the disk.

At that point I gave up.  I’d already spent almost a full day futzing with the things.  In that time I could have installed and configured all of the machines.  (Or so I thought.)  In any case, my experiments with free drive cloning software left me disappointed.

There’s a good overview of Ghost alternatives over at pack rat studios, but I haven’t had the opportunity to try any of the others mentioned.  Clonezilla didn’t support my hardware, and PING failed for reasons unknown.  Anybody know of a package that actually works?

By the way, telling a potential user, “if your network card isn’t supported, download it and compile it into the Clonezilla package” is not likely to be met with smiles and thanks.  Users (even technically competent users like me who are capable of downloading and building) are more likely to say, “no thanks,” and move on to something else.

August 25th, 2009

A few Linux nuggets

In general, it’s a bad idea to start a file name with a dash (-). For example, a file named --help is going to give you all kinds of trouble. Say you want to rename the file. mv --help help.txt is going to show you the help for the mv command. You’ll have to give it a path name: mv ./--help help.txt.
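The same problem bites scripts that pass file names to other programs. Here is a tiny Python sketch of the ./ trick; the helper name safe_path_arg is made up. Many tools, GNU mv included, also accept a bare -- to mark the end of options, so mv -- --help help.txt works as well.

```python
def safe_path_arg(name):
    """Prefix ./ so a command-line tool won't treat the name as an option."""
    return "./" + name if name.startswith("-") else name


# The argument list you would hand to something like subprocess.run
argv = ["mv", safe_path_arg("--help"), "help.txt"]
print(argv)
```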

Say you’re behind a router and you want to know your external IP address. If you have a Web browser, you can go to www.whatismyipaddress.com or one of the scads of similar sites returned by a Google search of “what is my ip address”. But from the Linux command line? The easiest one I found (i.e. the program was already installed) was using wget. The following will show you your external IP address:

wget -O - -q icanhazip.com

By the way, going to http://icanhazip.com from your browser will also tell you your IP address.
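If you'd rather do it from a script, here is a rough Python equivalent using only the standard library. It's a sketch that assumes the service still returns nothing but the address as plain text; the function names are mine.

```python
import ipaddress
from urllib.request import urlopen


def parse_ip(body):
    """Validate the response body and return it as an IP address object."""
    return ipaddress.ip_address(body.strip())


def external_ip(url="http://icanhazip.com", timeout=10):
    """Fetch the page and parse the address it returns."""
    with urlopen(url, timeout=timeout) as response:
        return parse_ip(response.read().decode("ascii"))
```

Calling print(external_ip()) shows the address. Parsing the response with ipaddress means that an error page or other unexpected content raises a ValueError instead of being mistaken for an address.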

If you’re configuring a local DNS cache, you probably don’t want to include your ISP’s DNS servers in the forwarders section of named.conf.options. If you do that, then all DNS requests will be forwarded to your ISP’s DNS server. What you really want is to query the root name servers. Just leave the forwarders blank. You’ll get better performance and you won’t annoy your ISP. Here’s a properly configured forwarders section:

forwarders {
// to query the root name servers,
// don't put any IP addresses in here.
};

Running BIND on the latest Ubuntu release, the root hints file is at /etc/bind/db.root. The root servers change from time to time, so it’s a good idea to keep this file updated. You can get the latest file from ftp://ftp.internic.net/domain/. Three files appear to contain the same information: db.cache, named.cache, and named.root. You can download any one of them, copy it to your /etc/bind/ directory as db.root (after making a copy of the existing file), and then tell BIND to reload its database: sudo rndc reload.

August 17th, 2009

Software Upgrades

Firefox just notified me of the available 3.5.2 update.  I figured what the heck and told it to apply the update.  After the update was applied, Firefox restarted and then I got a notification that one of my addons, the .NET Framework Assistant, is incompatible with the new version of Firefox and has been disabled.  Truthfully, I don’t know what that addon does, but if it was something I used regularly I’d be pretty ticked off that Firefox decided to disable it.  The update software should have checked for incompatible addons and notified me before applying the update.

I spent entirely too much time last week configuring our new machines, installing Windows, configuring updates, and getting the machines installed and running at the colocation facility.  After obtaining a USB diskette drive so that I could install BIOS updates, I ran into a problem where the BIOS update program wouldn’t work.  It took a while, but I finally tracked the problem down to the fact that we got OEM machines.  That is, Dell makes them exactly like the PowerEdge 1950, but they don’t have the Dell brand on the outside.  For reasons unknown to me, you can’t install the PowerEdge BIOS on the OEM machines.  You have to find the OEM BIOS on Dell’s site.

The OEM BIOS, by the way, comes in Windows and Linux versions, on diskette, and as an ISO image.  Why doesn’t Dell supply an ISO for the regular PowerEdge BIOS?

If you’re installing Windows Server 2008 or Windows Vista from the original distribution media (that is, the 1.0 distribution), do the following.

  1. Install from the original source media.
  2. Download and apply the Service Pack 2 update.  This is a big download:  over 500 megabytes for the x64 version of Windows Server 2008.  And it takes an hour or more to install.  Be sure to reboot after the update is applied.
  3. Go to Windows Update and apply all other updates, rebooting as recommended.
  4. Finish configuring your system.

If you do anything else, like applying interim updates before installing the Service Pack 2 update, or try installing roles or Windows components before applying all of the updates, you will very likely have trouble.  Trust me on this one.  It vexed me for almost two days.

I suspect that some sequences of updates end up causing incompatibilities.  I can’t prove that, since I didn’t keep track of the order in which I applied updates with the machines that went wrong.  When you think about it, it’s pretty amazing that Windows Update works as well as it does.  Whatever the problem was, I found that I can avoid it by following the procedure above.

Note to software vendors:  update notifications continue to pop up in front of whatever I’m working on at the moment.  It’s bad enough that you found a security problem in your application that requires me to update.  But to interrupt what I’m doing by bringing your silly update notification to the front is horribly bad manners and you risk me wondering why I put up with your crap at all.  Make those notifications less intrusive, dang it!

August 5th, 2009

A diskette? What’s that?

We just bought some off-lease Dell servers locally and I’m tasked with getting them set up and installed at the data center.  It’s not my favorite part of my work.  I’m at heart a programmer, and fiddling with hardware always manages to frustrate me.  Today’s encounter is particularly maddening.

We want to outfit these new servers with 32 GB of RAM each.  Since the machines only have eight RAM slots, we need 4 GB DIMMs.  I’ve mentioned before that quad-rank RAM is much cheaper than dual-rank RAM, so we go for the quad-rank parts whenever we can.  And our experience with these servers is that we can.

So I loaded one machine with 32 GB of RAM, turned it on, and it reported “No Memory.”  It turns out that these machines will support quad-rank RAM only if you have a later BIOS.  The BIOS on the machines we recently obtained is more than two years old.  But, hey, I’m okay with fiddling around a bit in order to save some money.

Now, Dell is great about making updates available on their support site, and within minutes I had downloaded the BIOS update on my workstation.  But installing the update turns out to be something of a problem.  You see, the BIOS update distribution creates a bootable FreeDOS diskette that contains the new BIOS image and the program to install it.

A diskette?  This is 2009!  Nobody even buys a server with a diskette anymore.  Hell, the PowerEdge servers we bought don’t even have a place for a diskette drive!  How the hell am I supposed to install this BIOS update?  Would it be so hard for Dell to spend a little time making a bootable FreeDOS CD image that I can download?

There is another way to install the BIOS update, by the way.  Dell has Windows and Linux executable programs that will update the BIOS.  Of course, those require that your machine is running a version of Linux or Windows that Dell supports.  I find it irrational in the extreme that I have to install Windows just to update the BIOS on these machines.  If I’m really lucky, I won’t run into issues running Windows Server 2008 on a machine with an older BIOS.

I did briefly explore the idea of creating my own bootable FreeDOS CD with the required files on it.  There’s a program called FDOEMCD (FreeDOS OEM CD-ROM disc builder assistant) that supposedly will do that.  However, part of the build process is a 16-bit DOS program, which won’t run on my 64-bit Windows box.  I suppose I could put together a 32-bit XP system or a Virtual PC image, but doing that will take as much time as installing Windows.  Still, I’d sure like to explore that option one of these days when I don’t have anything more pressing to do like write rants.

And, no, I haven’t forgotten that I need to install Windows on these machines anyway in order to get everything running.  It’s just that having to install Windows first before doing the BIOS upgrade makes things a bit more inconvenient.

By the way, since I wasn’t looking forward to installing Windows five times, I’m taking a look at Clonezilla.  The idea is to install Windows once and then clone the drive image.  I’ll let you know how it goes.

July 23rd, 2009

Downloading with wget

GNU wget is a free program for retrieving files from the Web.  I freely admit that I’m just a casual user and don’t know all the things that it can do.  But it does one thing that I use quite often:  download a list of files.  It’s incredibly convenient.  All you have to do is create a text file that contains the URLs of the files you want to download, one file per line.  Then fire up wget.  Assuming the file that contains the URLs is called DownloadList.txt, then this command line will download all the files, one at a time:

wget -i DownloadList.txt
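For comparison, here is roughly what that one command does, sketched in Python with the standard library. It is nowhere near a wget replacement (no retries, no resume, no progress display), and the function names are made up for the example.

```python
import os
from urllib.parse import urlparse
from urllib.request import urlretrieve


def read_url_list(path):
    """Return the non-blank lines of the list file, one URL per line."""
    with open(path, encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]


def local_name(url):
    """Pick a local file name from the URL path, as wget does by default."""
    return os.path.basename(urlparse(url).path) or "index.html"


def download_all(list_path, dest_dir="."):
    """Download every URL in the list, one at a time."""
    for url in read_url_list(list_path):
        urlretrieve(url, os.path.join(dest_dir, local_name(url)))
```

download_all("DownloadList.txt") fetches each file in turn into the current directory.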

wget is available on just about any Linux distribution.  There’s also a Windows version.

Documentation is at http://www.gnu.org/software/wget/manual/.

June 22nd, 2009

Windows Explorer Wonkiness

In Windows Explorer, double-clicking on a folder name in the list pane opens that folder so you can view its files.  This is nothing new.  Over the years I have become accustomed to double-clicking and having that folder’s files appear in the list pane.  Life is good.

Well, life was good.  For reasons unknown, many of my servers now are opening a new window whenever I double-click on a folder.  This is exceedingly odd.  I have not changed any settings, and nobody else here logs in to those servers except to check on status.  Even so, I can’t imagine that they would modify Explorer’s behavior.

When it comes to that, I don’t see a setting anywhere that says whether it should open a new window or open in the same window.  The View settings in Tools->Folder Options don’t have a setting for this.  Oddly, if I right-click and then select “Explore” (the default), it opens in the same window, which is what it’s supposed to do when I double-click.  Selecting “Open” opens a new window.

Later:  I just installed the latest Windows Server 2008 Service Pack on one of the servers in question.  Problem gone.  Weird.

Even later: Reader Roy Harvey notes that the setting is on the General tab of Tools -> Folder Options, labeled “Open each folder in the same window.”  On my servers that still exhibit the problem, that radio button is checked.  There’s a bug in Explorer somewhere that’s fixed by installing the latest service pack.

June 17th, 2009

Facebook photo problem

So after five days working with Facebook, I’m mostly impressed.  I have a few minor nits with the user interface, but it may be that they want the UI to be somewhat mysterious.  I think they want you to explore, and if things are just a little bit wonky, you’ll be more apt to wander around blindly and stumble into things that you wouldn’t have found otherwise.

The other day I was seriously impressed with the ease of building and maintaining a photo album.  But today I’m having the weirdest problem adding a picture to an existing album.  This picture won’t upload:

[Image: bamboo]

I realize that it’s not the greatest thing ever carved, but somehow I don’t think that’s the reason Facebook is rejecting it.  Their Java-based file uploader happily reports, “File upload successful!”  But the picture never appears in my album.  The “simple file uploader” fails with this message:

File bamboo.jpg: This error occurred because either the photo was a size we don’t support or there was a problem with the image file.

Granted, the size is a little bit odd (192 x 582), but there are other odd-sized files in that album.  I’ve tried resizing the picture, changing it to a .png file, renaming it, etc.  All to no avail.  Facebook simply will not accept this photo.

On a related note, Facebook is not working well with my new Firefox 3.0.11 update.  I’ve had one hang, one unexplained disappearance (Firefox just exited), and some pretty bizarre behavior.  I wonder what’s going on.

Update 2009/06/29

I played with it a bit more, resizing the image, saving it with different photo editor programs, and even changing the color as one commenter suggested in order to get past a possibly over-Freudian image filter.  Nothing worked.  I finally ended up using the entire picture that includes some background (wires and other junk on my desk) in addition to the carving.  That worked.  I don’t know why.

June 15th, 2009

Facebook

I resisted the whole social networking thing for a long time, mostly due to preconceived notions.  In the past, I was a member of many different social networks:  bulletin boards, CompuServe forums, etc.  The explosion of users that came with the rapid Internet expansion lowered barriers to entry and reduced the value of those forums.  The signal to noise ratio became so low as to make them useless.

Anyway, seeing as how we’re building social features into our product, it seems like I should get acquainted.  Since the other guys at the office are members of Facebook, as are a number of my friends, I figured I’d sign up.

So what do I think, after just a few days?  Facebook is definitely a Good Thing.  I enjoy being able to post quick updates to let people know what I’m up to, and I like seeing what’s happening in their worlds, too.  And, I’ve connected with a few people who I hadn’t heard from in 25 years or more.  I can see where Facebook or something like it could become an essential tool for keeping up with friends and family.

May 28th, 2009

Solid state storage

I still have a hard time referring to the new crop of mass storage devices as “flash drives.” The “flash” part is correct, seeing as they’re built with flash memory, but the “drive” part is just … wrong. There aren’t any moving parts. It’s like referring to “dialing” a telephone. Or the telephone “ringing.” You don’t hear that good ol’ Ma Bell … bell … anymore.

In any case, solid state storage has come a very long way in the two years since I last talked about it.  You no longer have to build your own device from parts cobbled together.  Today you can get flash “drives” in the 2.5″ form factor with capacities up to 512 gigabytes.  That’s right, half a terabyte of solid state storage.  Granted, the 512 GB units are ridiculously expensive, but the 128 GB units are pretty reasonable.  We just got one in the office for about $300, delivered.  That’s expensive compared to conventional storage ($2.35 per gigabyte compared to 10 cents per gigabyte), but it’s still an incredible deal.  It’s less than you would have paid for a 128 gigabyte hard drive five years ago.

The new crop of solid state storage devices really is worth taking a look at.  The one we got (G.Skill Falcon) claims throughput of 230 MB/sec on read and 190 MB/sec on write.  In initial tests, we were able to sustain very close to the 230 MB/sec read rate, and our sustained write rate was close to 150 MB/sec.  That’s about three times as fast as we can read a conventional hard drive, and four times as fast as we can write.  We won’t be replacing all of our hard drives with these units, but we certainly can use the speed in a couple of critical I/O-bound applications.

Earlier generations of these solid state storage devices had some interesting limitations.  The first generation units were almost universally slower than or, at best, just a little faster than conventional hard drives.  Many of them also used more power than a spinning hard drive.  And some were just unreliable.  Things have improved quite a bit.  Hard drive manufacturers have gotten power consumption down to the 6 or 7 watts range, but that’s still 50% more than the 4 watts or so that the SSDs are taking.

The cost per gigabyte is huge and even a 200% performance increase doesn’t justify that price for the normal user.  But imagine you’re a developer with a laptop computer that has the typical slow laptop hard drive.  A lot of my development tools are I/O bound on my laptop.  Just try starting up Visual Studio some time.  Tripling the I/O throughput could very well greatly improve the development experience on that machine.  That would be $300 well spent.

There are other advantages of SSDs besides the performance boost, but again they won’t justify the cost increase for the average user.  The reduced power consumption mentioned above is less of a benefit than you might think because the hard drive takes relatively little power when compared to the CPU, RAM, and display.  Still, any little bit helps by reducing generated heat and increasing battery life.  Shock resistance and temperature tolerance are much higher on the SSDs, and since there are no moving parts the thing is absolutely silent.  It seems that the lack of moving parts would make the thing more reliable, too, but it’s hard to say.  I don’t know if I can believe the 500,000 hour (57 year) MTBF that hard drive manufacturers claim, much less the 1.5 million hours claimed for the SSDs.
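The hours-to-years conversion behind those MTBF figures is simple enough to sketch.  This assumes 8,760 hours per year (no leap years), and the function name is made up for the example.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760; leap years ignored


def mtbf_years(mtbf_hours):
    """Convert a claimed MTBF in hours to years of continuous operation."""
    return mtbf_hours / HOURS_PER_YEAR


hard_drive_years = mtbf_years(500_000)     # the hard drive manufacturers' claim
ssd_years = mtbf_years(1_500_000)          # the SSD claim

print(f"Hard drive: about {hard_drive_years:.0f} years")
print(f"SSD: about {ssd_years:.0f} years")
```

By the same arithmetic, the SSD figure comes to roughly 171 years, which is why I have trouble taking either number at face value.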

One of my coworkers pointed out that the SSD or something similar is essential to private pilots who are flying unpressurized aircraft over 10,000 feet.  Modern avionics packages often include computers that display moving maps and download real time weather data.  That data has to be stored somewhere, and a conventional hard drive becomes unreliable at high altitude because there isn’t enough air to float the head over the platter.  Considering what avionics packages cost, an additional $300 for an SSD wouldn’t even be noticed.

I wouldn’t recommend the SSDs for normal users, simply because of the high cost per gigabyte.  But if you need a relatively inexpensive way to increase your I/O throughput, if your computer has to run in areas that are outside a conventional hard drive’s operating environment, or if you just want to have the latest geeky toy, then by all means pick one of these up.

February 12th, 2009

Creating a slide show shouldn’t be this hard

What I want to do seems simple enough.  Given a collection of image files, create a slide show video suitable for sending via email or posting on YouTube.  There are countless GUI programs that will do this.  But I want to do it from the command line under program control.

Finding a suitable Windows program to do this is turning out to be very difficult.  I first tried a Windows build of ffmpeg, which is apparently the program for this sort of thing under Linux.  The three Windows builds I tried are badly broken, either crashing or rejecting valid command lines.  I thought I’d try mencoder (part of the MPlayer project), but the Windows MPlayer build that I downloaded doesn’t appear to include mencoder.

Somebody else suggested x264, but my experience with that has been fruitless as well.  Unless you consider frustration fruitful.

I’m stumped.  All I want is a program that I can point to a directory and say, “Make a slide show from all those images.”  This isn’t rocket science, and yet every program I’ve tried has failed.

If you have any suggestions for a command line program that will create a slide show from multiple .jpg files and save it to a common video format (.avi, .mp4, .flv, .swf, etc.), I’d be real happy to hear about it.  But, please, don’t recommend Linux, BSD, or some other operating system.  This must be a Windows solution.
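For what it’s worth, here is a rough sketch of the kind of thing I have in mind, written in Python.  It assumes a working, reasonably current ffmpeg build on the PATH (which is exactly what I couldn’t find), and it uses ffmpeg’s concat demuxer with a list file.  The function name, the list file name, and the three-second default are all my own inventions for the example.  It only builds the command; it doesn’t run it.

```python
from pathlib import Path


def build_slideshow_command(image_dir, output_file, seconds_per_image=3):
    """Write a concat list for every .jpg in image_dir and return the
    ffmpeg argument list that would render it to output_file."""
    image_dir = Path(image_dir)
    images = sorted(p for p in image_dir.iterdir() if p.suffix.lower() == ".jpg")
    if not images:
        raise ValueError(f"no .jpg files found in {image_dir}")

    # Each entry names a file and how long to show it.
    # Note: file names containing a single quote would need escaping.
    lines = []
    for image in images:
        lines.append(f"file '{image.resolve().as_posix()}'")
        lines.append(f"duration {seconds_per_image}")
    # The concat demuxer tends to drop the final duration unless the
    # last image is listed a second time.
    lines.append(f"file '{images[-1].resolve().as_posix()}'")

    list_file = image_dir / "slideshow.txt"  # hypothetical list file name
    list_file.write_text("\n".join(lines) + "\n")

    return [
        "ffmpeg", "-y",
        "-f", "concat", "-safe", "0", "-i", str(list_file),
        "-vf", "fps=25", "-pix_fmt", "yuv420p",
        str(output_file),
    ]
```

To actually render the video, the returned list would be handed to something like subprocess.run(command, check=True).  I haven’t been able to confirm that any particular Windows build handles this correctly.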

December 20th, 2008

Adventures in mass storage

We’ve been using a number of different computers as file servers here at the office, but we’re to the point now that we really need some kind of centralized data storage.  It’s one thing to have a single machine storing a few hundred gigabytes of data.  It’s something else entirely to scatter multiple terabytes across four or five machines, and then struggle to remember what’s where.

Last week we picked up two network attached storage (NAS) boxes:  a Thecus N5200, and a Thecus N7700.  The 5200 supports 5 drives and will be used primarily for offsite backup.  The 7700 supports 7 drives and will be our primary (only, hopefully) file server.

Setting these things up turned out to be quite an experience.  Not because of any problem with the Thecus boxes.  No, those things are wonderful, with very good documentation and a nice browser-based administration interface.  We had problems with the drives we bought to put in our fancy new RAID arrays.

Seagate recently released their Barracuda 7200.11, 1.5 terabyte drive.  We managed to get a great deal on the drives (about $110 each), and we picked up enough to populate the NAS boxes, plus a few high-powered machines here.

It turns out that early versions of the drive’s firmware have a bug that causes the drive to freeze and time out for minutes at a time, which in turn causes RAID controllers to think that the drive has failed.  The results aren’t pretty.  Fortunately, there’s a firmware upgrade available.  I downloaded mine from the NewEgg page for the drive.  Look on the right side about halfway down.  (This wasn’t a surprise.  We knew about the problem and about the firmware upgrade before we bought the drives.)

Applying the firmware upgrade turned into quite an experience.  You see, in order to apply the upgrade you need a DOS prompt.  Not a Windows command line prompt.  Once you manage to get a machine set up and booting FreeDOS from diskette or CD (you can’t boot from the hard drive because the firmware upgrade program wants to see only one hard drive), you can run the firmware upgrade.  It takes less than two minutes to boot the machine and apply the update.  Then the drive is ready to go, right?

Silly me.  I upgraded the firmware on five drives, put them in the N5200, and started the thing up.  Surprisingly, the N5200 reported that the drives were 500 GB, not the 1.5 TB that I thought I had.  But the label on the drive says 1.5 TB.  Whatever could be going on?

It turns out that one of the many things you can do in the drive’s firmware is set the size.  Want to turn your terabyte drive into a 32 gigabyte drive?  No problem!  A set of utilities from Seagate called SeaTools (download the ISO and burn to CD, then boot from the CD) includes diagnostics and an interface for setting the drive’s capacity.  One option is “Set capacity to max native”.  For me, SeaTools reports that setting the drive’s capacity failed, and adds “be sure that drive has been power cycled.”  When I turn the machine off and back on, the drive reports 1500.301 gigabytes.  There’s my 1.5 terabyte drive.

After upgrading the firmware on all drives and using SeaTools to set their capacities, I finally managed to get the RAID arrays set up.  The N5200 is running RAID-5, giving us about 5.4 terabytes for our offsite backup.  The N7700 is running RAID-6, giving us about 6.5 terabytes of live data in a single place.  That should hold us for a while.
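The capacity numbers are easy to sanity check.  This sketch assumes the textbook rule that RAID-5 gives up one drive’s worth of space to parity and RAID-6 gives up two, and that the NAS reports sizes in binary units (2^40 bytes) while the drive label uses decimal terabytes.  The function names are mine.

```python
def usable_terabytes(drive_count, drive_tb, parity_drives):
    """Usable capacity in the decimal terabytes printed on the drive label."""
    return (drive_count - parity_drives) * drive_tb


def decimal_tb_to_binary(terabytes):
    """Convert 10**12-byte terabytes to 2**40-byte units."""
    return terabytes * 10**12 / 2**40


raid5_tb = usable_terabytes(5, 1.5, parity_drives=1)   # N5200, RAID-5
raid6_tb = usable_terabytes(7, 1.5, parity_drives=2)   # N7700, RAID-6

print(f"N5200: {raid5_tb} TB decimal, {decimal_tb_to_binary(raid5_tb):.2f} binary")
print(f"N7700: {raid6_tb} TB decimal, {decimal_tb_to_binary(raid6_tb):.2f} binary")
```

The RAID-5 result lines up closely with what the N5200 reports.  The RAID-6 result comes out a bit higher than what the N7700 shows; I assume the difference is file system and RAID overhead, but I haven’t verified that.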

A couple of notes on the Thecus boxes:

  • Initial setup of the Thecus is a little inconvenient.  The default IP address is 192.168.1.100, so I had to cobble together a network from an old switch and hook my laptop to it.  Once I changed the IP address to fit on our subnet (10.77.76.xxx), setup went quickly.  There might be a way to change the IP address from the front panel.  If so, that would probably be easier than throwing a network together.
  • The N5200 took almost 24 hours to format and create the RAID-5 array with five 1.5 terabyte drives.  The N7700 took about 8 hours to format and create the RAID-6 array with seven 1.5 terabyte drives.  I suppose the difference is just the result of better hardware and firmware in the N7700.
  • There must be some magic incantation to configuring the date and time settings.  If I set the date and time, and tell it not to update with an NTP server, everything works just fine.  But if I enable NTP update (manual or automatic), then the time is totally screwed up.  One box was 11.5 hours slow, and the other was a few hours fast.  (As an aside, I have another piece of equipment that insists on reporting the time in UTC, even though I’ve set the time and told it that I’m in the US Central time zone.  I’m beginning to believe that *nix-based servers don’t like me.)
  • The Thecus boxes do way more than just serve files.  We probably won’t use all those features, but others might.  I especially like the support for USB printers, and the built-in FTP server.  With DHCP enabled and machines connected to a switch off the LAN port, one of these things is a single-box subnet.  I don’t know what kind of traffic will pass from the WAN to the LAN ports on these things, but if it’s fully blocked it’d make an effective home router to connect to a cable modem.
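The reason for the throwaway network in the first note is that the factory address isn’t on our subnet.  A quick sketch with Python’s ipaddress module shows the check.  I’m assuming a /24 mask for the 10.77.76.xxx network, and the 10.77.76.50 address is just a made-up example.

```python
import ipaddress

# Assumed: the office network is a /24, i.e. 10.77.76.0 through 10.77.76.255.
office_subnet = ipaddress.ip_network("10.77.76.0/24")

factory_default = ipaddress.ip_address("192.168.1.100")   # Thecus default
reassigned = ipaddress.ip_address("10.77.76.50")          # hypothetical new address

print(factory_default in office_subnet)   # False: unreachable from our LAN as shipped
print(reassigned in office_subnet)        # True
```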

Anyway, we’re in the middle of copying data and retiring or re-tasking some of our old file servers.  This is going to take some time.  A gigabit network is quick until you start copying multiple terabytes. . .
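To put a number on that, here is a back-of-the-envelope estimate.  Gigabit Ethernet tops out at 125 megabytes per second before any protocol overhead; the 60 MB/sec “realistic” rate below is purely an assumption for illustration, not something I measured.

```python
def copy_hours(terabytes, megabytes_per_second):
    """Hours needed to copy the given amount of data at a sustained rate."""
    total_megabytes = terabytes * 1_000_000   # decimal units throughout
    return total_megabytes / megabytes_per_second / 3600


best_case = copy_hours(1, 125)   # theoretical gigabit line rate
assumed = copy_hours(1, 60)      # assumed real-world rate (hypothetical)

print(f"1 TB at line rate: {best_case:.1f} hours")
print(f"1 TB at 60 MB/sec: {assumed:.1f} hours")
```

So even in the best case, every terabyte costs a couple of hours on the wire.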

December 16th, 2008

Memory Upgrades

It’s been an interesting few weeks here.  We’ve been collecting data much faster than we anticipated, and we’ve had to upgrade hardware.  One thing we’ve had to do is bring several of our servers from 16 gigabytes of RAM to 32 gigabytes.

Memory is surprisingly inexpensive.  You can buy four gigabytes of RAM (2 DIMMs of 2 gigabytes each) for $55.  That’s fine for maxing out your 32-bit machine, or you can bring a typical 64-bit machine to eight gigabytes with those parts for only $110.

Most server RAM is more expensive, about double what desktop memory costs.  In addition, most servers only have eight memory slots, making it difficult or hideously expensive to go beyond 16 gigabytes.  The reason has to do with the way that memory controllers access the memory.  Controllers (and the BIOS, it seems) have to know the layout of chips on the DIMM, and most machines are set up to use single-rank or dual-rank RAM.  A 2-gigabyte DIMM that uses single-rank RAM will have eight (or nine, if ECC) 2-gigabit chips on it.  A dual-rank DIMM will have 16 (18) 1-gigabit chips.  Dual-rank will typically be less expensive because the lower density chips cost less than half as much as the higher density chips.

The inexpensive 2-gigabyte DIMMs are typically dual-rank, meaning that the components on them are 1-gigabit chips.  If you want a 4-gigabyte DIMM, then you have to step up to 2-gigabit chips.  And those are very expensive.  The other option is to buy quad-rank memory, which uses the 1-gigabit chips.  Quad-rank 4-gigabyte DIMMs for servers are currently going for $70 or $80.  Figure $20 per gigabyte.
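The rank arithmetic can be sketched in a few lines.  This assumes eight data chips per rank (the ninth ECC chip adds no usable capacity), which matches the chip counts above.  The function name is mine.

```python
def dimm_gigabytes(ranks, chip_gigabits, chips_per_rank=8):
    """Usable DIMM capacity in gigabytes, ignoring any ECC chips."""
    total_gigabits = ranks * chips_per_rank * chip_gigabits
    return total_gigabits / 8   # 8 bits per byte


print(dimm_gigabytes(ranks=1, chip_gigabits=2))   # single-rank, 2-gigabit chips
print(dimm_gigabytes(ranks=2, chip_gigabits=1))   # dual-rank, 1-gigabit chips
print(dimm_gigabytes(ranks=2, chip_gigabits=2))   # dual-rank, 2-gigabit chips
print(dimm_gigabytes(ranks=4, chip_gigabits=1))   # quad-rank, 1-gigabit chips
```

The first two come out to 2 gigabytes and the last two to 4 gigabytes, which is why quad-rank lets you reach 4 gigabytes per DIMM with the cheap chips.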

The only catch is that most older computers’ memory controllers don’t support the quad-rank DIMMs.  I do know that Dell’s PowerEdge 1950 server with the latest BIOS supports quad-rank.  The Dell 490 and 690 machines do not.

If you’re in the market for a new computer that you expect to load up with RAM, you should definitely make sure that it supports quad-rank memory.  If you’re adding memory to an older machine, you might save a lot of money by doing some research to see if you can upgrade the BIOS to support quad-rank.

December 10th, 2008

Burning CDs on Windows Server 2008

Updated 2008/12/13, see below

A while back I mentioned that I was unable to burn a CD on my Windows Server 2008 box.  At that point I didn’t have time to figure out what was going on.  Today I needed to burn a CD, and had some time to fiddle with it.

On my Windows XP development box, I used ISO Recorder to burn ISO images to CD.  There’s not much to the program:  just right-click on a .iso file and select “Burn to CD” from the popup menu.  It’s simple and it works.  I wish all software worked so well.  Unfortunately, it doesn’t appear to work on my Server 2008 box.  The program doesn’t recognize my CD/DVD burner.

A bit of searching located ImgBurn, a free CD/DVD reader and recorder that works on all versions of Windows, including 64-bit versions.  This is more of a full-featured application than ISO Recorder, but still quite easy to use.  It took me no time at all to start the program and tell it to burn an ISO image to CD.

I don’t know why ISO Recorder won’t recognize the CD burner on my box.  The program says it works for Vista, so I imagine it should also work on Server 2008.  But it doesn’t and I don’t have the time or inclination to figure out why.  I’ve found ImgBurn, and I’m happy.

Update 2008/12/13:

I tried to burn another CD today, and ImgBurn failed to recognize the recorder.  It turns out that I have to run the program with elevated privileges (i.e. “Run as Administrator”).  I didn’t have to do that the first time, because I started ImgBurn from the installation program, which was already running as Administrator.

Also of note:  ImgBurn will not work if Windows Media Player is running.  At least, it won’t on my machine.  Media Player apparently locks or otherwise allocates the optical drive by default.  Perhaps there’s a way to turn that “feature” off, I don’t know.

I suspect that ISO Recorder will work, too, if I try it in Administrator mode.  The beauty of ISO Recorder is that all I have to do is right-click on a .iso file, and it will be written to the CD.  But I don’t know how to make that program run with Administrator privileges.
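If I ever wrap the burn step in a script, it would make sense to check for elevation up front rather than discover the problem when the recorder doesn’t show up.  Here is a minimal sketch in Python.  On Windows it calls the shell32 IsUserAnAdmin function through ctypes; the fallback branch for other systems is only there so the function runs anywhere.  The function name is mine.

```python
import ctypes
import os
import sys


def is_elevated():
    """Return True if the current process has administrator (or root) rights."""
    if sys.platform == "win32":
        # IsUserAnAdmin returns nonzero when the process token is elevated.
        return bool(ctypes.windll.shell32.IsUserAnAdmin())
    # Non-Windows fallback: root has an effective user ID of 0.
    return os.geteuid() == 0


if not is_elevated():
    print("Not elevated; the burner may not be visible to this process.")
```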