Murphy’s Law Eats My Weekend

1:22 am digital

This server has, for the past 8 months or so, been running off Fedora Core 3. When Fedora 4 shipped in June I added “upgrade server” to my to-do list. And there it languished.

Last month, while I was visiting a friend, the server locked up pretty hard and the drive just sat there spinning. Woo called to tell me, and I had her hard reset the machine. I became worried about the status of the drive, but apparently not worried enough to actually do anything about it. Bad idea.

Saturday night as I sat in our home office room, the drive started grinding. Not the “death noise” but consistent read/write activity. I could not ssh in. So I ran to the other end of the house, grabbed the spare monitor, and had a look. Lots of errors concerning SELinux vs. ntp. Was it software, or the drive? Who knows!? It’s a crapshoot! Whee!

I reset the machine and waited for it to come back up, which it did. I ran some simple tasks, which all worked. I then asked rsync to sync my mp3 library (which is quite large) to the server. It has done this before, so it only needed to transfer a few songs. But rsync builds an ls-style file list of everything first, so it is quite an intensive read process regardless of how much data is actually out of sync.
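
For the curious, here is a rough Python sketch of why that file list hurts a sick drive. It is not rsync's actual code, just an illustration under the assumption that both sides are compared by size and modification time (rsync's default quick check); the function names are made up for the example. The point is that every file gets a metadata read even when nothing needs copying.

```python
import os

def build_file_list(root):
    """Walk root and stat every file, roughly like an ls -l style listing."""
    entries = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.stat(path)  # one metadata read per file, changed or not
            entries.append((os.path.relpath(path, root), st.st_size, int(st.st_mtime)))
    return sorted(entries)

def needs_transfer(src_list, dst_list):
    """Names missing on the destination, or whose size or mtime differ."""
    dst = {name: (size, mtime) for name, size, mtime in dst_list}
    return [name for name, size, mtime in src_list if dst.get(name) != (size, mtime)]
```

With thousands of songs, the walk touches the disk thousands of times before a single byte of music moves, which is plenty to tip over a drive that is already on its way out.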

Boom. Death. The machine … just … froze.

OK, so it wasn’t SELinux vs. ntp. It was the drive. I shut the machine down and powered it back up to ensure that the drive hadn’t died, but was just dying. Sure enough (and thank Jebus) the drive was only on its deathbed, not in the grave. Having travelled this road before, I knew enough to shut it down, throw a placeholder page on an alternate Apache installation, and not touch that drive except to take data off it when a replacement was ready.

I have a spare 250GB parallel ATA drive in my arsenal. I resolved to install CentOS on it. First speedbump. My CentOS/i386 Install Disk 1 was damaged last month, and I had never burned a new copy. And in a fit of insanity I had tossed the ISOs I had downloaded. Duh. So Saturday night as I slept I curl’ed that image. Again.

Second speedbump. Awake yesterday morning, burn the image, and find it is corrupt. Spent another 2 hours downloading it again (again). This image worked, and I installed CentOS onto the 250GB drive.

Third speedbump. Go to boot CentOS for the first time and only get a GRUB prompt. Try defining root and am presented with the lovely GRUB Error 18 message. The drive is too large for the BIOS in this PIII-550 to recognize, and CentOS has placed /boot outside the range of the BIOS’ view. CentOS can install fine, as Linux does not use the BIOS to get drive information … except during boot. Le sigh.

So now I have 4 options.

  • Re-install CentOS making a dedicated /boot partition within the BIOS’ line of sight.
  • Pass funky kernel parameters at boot to overcome the problem.
  • Buy a drive smaller than 128GB. Or several, if I want more storage, and use LVM.
  • Get a controller card and bypass the BIOS completely.
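
Where does a number like 128GB come from? A back-of-the-envelope sketch in Python, assuming the BIOS is limited to 28-bit LBA addressing with 512-byte sectors. That is the usual source of this figure, but whether this particular BIOS stops exactly there is my assumption, and the constant and function names are just for illustration.

```python
SECTOR_BYTES = 512   # traditional ATA sector size
LBA_BITS = 28        # assumption: the BIOS only handles 28-bit LBA

def max_addressable_bytes(lba_bits=LBA_BITS, sector_bytes=SECTOR_BYTES):
    """Largest byte count reachable with the given address width."""
    return (2 ** lba_bits) * sector_bytes

def visible_to_bios(offset_bytes):
    """True if a byte offset on the disk falls inside the addressable range."""
    return offset_bytes < max_addressable_bytes()

limit = max_addressable_bytes()
print(limit)             # 137438953472 bytes
print(limit // 2 ** 30)  # 128 (binary gigabytes)
print(visible_to_bios(200 * 10 ** 9))  # False: a /boot this far in is out of sight
```

Anything the boot loader needs (read: /boot) has to sit below that line, which is why a small dedicated /boot partition at the front of the disk works.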

Now really the only two sane options are the first and last. Passing kernel params is a major kludge. Buying (perhaps multiple) 120GB drives is cost-inefficient. So I sat contemplating the best approach. I soon realized that the 250GB drive I have is getting old, and has been used quite a bit. In fact, this is the drive that the Mac caused to poop itself a few months ago. I was setting myself up to be right back in this same position in weeks or months, knowing Murphy’s Law. And knowing my luck thus far on this issue …

So off I went to Fry’s. When looking at controller cards I realized I could get a Serial ATA card for the same price as a Parallel ATA card. And SATA is a LOT faster. Not only that, but factoring in a rebate I could get a Seagate 300GB SATA drive for US$119. At least one decision was simple. I came home with a SIIG SATA controller card and that Seagate.

Now, the SATA drive in my desktop is only 250GB. No way I’m getting a 300GB drive and not using it in my desktop. So after a dry run with CentOS and the 300GB drive to ensure the controller card works, I spent a couple hours migrating my desktop from the 250GB to the 300GB. Let me tell you, rsync’ing between 2 local 7200rpm SATA drives is a thing of beauty. Incredibly fast. Jizz-inducing fast. Yum-MY!

So, at around 11pm Sunday night I was where I thought I was going to be at 11pm Saturday night. 24 hours spent spinning my wheels. Great. I finally got CentOS installed, transferred data from the old 120GB drive and began configuration. The server is now about 90% done, and web services have (obviously) been restored. And we’re now traveling on a 250GB 7200rpm Western Digital SATA drive connected to a processor-independent controller card. Whoopee!

My apologies to those who waited for the server to come back online, and my thanks to those folks who are hosted here. Hopefully this will be the last such glitch for some time. And enjoy the speed improvement. I’m off to do that last 10% of configuration.
