For some reason, this article written in late 2008 remained unpublished. It's therefore kind of outdated.
When I took over my former work notebook (an hp nc 8000) from my (now former) company, one of the first things I did was to swap its old 120 GB disk for a new 250 GB disk. 250 GB is the biggest disk one can get in the 2.5-inch form factor with a PATA interface, and there is only one such disk on the market, made by WD. So I didn't have much of a choice and ordered one in mid-August 2008. It worked fine until it died this Friday, a mere three months after I bought it. This wrecked much of my Friday and the entire weekend, since I spent those days wrangling data without my main work tool.
This disk death was the second one this week, after a 40 GB disk (purchased in 2001) died in my other notebook on Monday. I've really had it with hardware for the time being.
The first sign of trouble was the notebook not waking up properly from suspend-to-disk on Friday morning; it had entered that state normally on Thursday night. I shrugged, did a hard reset and booted the device. This worked fine, and I started working until strange noises emerged from the notebook, which prompted me to quickly do a backup, which thankfully completed more or less successfully. During this backup, the 500 GB disk I use for backups complained about file system errors, which earned me an extra two hours of waiting for the fsck of the 500 GB disk, fortunately with a successful result.
I then called up Alternate, the vendor where I bought the disk a quarter of a year before, and asked them what to do with the not-yet-dead disk. Unfortunately, they didn't prove helpful and gave me the choice of either sending the disk directly to WD, which would take at least a week, or sending it in to them for replacement, which would take at least a month. They rejected my offer to order and pay for a new disk right away and be credited its price once the bad disk had been replaced. Sucks. Good service looks different.
I then proceeded to order a new disk from a different vendor, AV-Electronix, who have a really cool order monitoring tool. Thirty minutes after I placed my order, I was informed that the disk had been picked for shipping, and it was indeed delivered less than 24 hours after I placed the order. Impressive. This is the first time I have seen a real-time order monitoring tool actually work.
After the backup was done, I tried to rule out the notebook itself as a source of error (as the backup disk had been acting up as well) and gave it a single memtest86 run, which fortunately found no errors.
Then I started WD's warranty procedures, which proved to be a major headache. Like most disk vendors, they offer a tool (theirs is called Data Lifeguard Diagnostics) to determine a drive's fitness. For some reason, I missed that they offer a bootable DOS .iso image with their Diagnostics tool and ended up installing Windows XP on a new spare hard disk. After I put the old disk into the MultiBay adaptor, the Windows-based Diagnostics quickly came up with the verdict "fail", and I ordered an advance replacement. WD's web form offers only 30 characters for the failure reason, so they only got a "DLG extended test fail". An RMA number was quickly issued, and we'll see when the replacement disk (which will boldly go into the Dreambox, replacing the 160 GB unit there) arrives.
The next day, the ordered disk arrived, and I spent the rest of the day running badblocks -vvw on the new disk. After eight hours, I calculated that the badblocks scan wouldn't finish before Monday noon, and thankfully found out that at some point during the scan my grml had fallen back to PIO mode and disabled DMA. After I re-enabled DMA, the speed increased greatly, and the badblocks scan finished at Sunday noon: no bad blocks found, as expected for a brand new disk.
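For the curious, the kind of back-of-the-envelope estimate I did there can be sketched in a few lines of Python. This is only a rough sketch: the function name and the two throughput figures are assumptions picked for illustration, not numbers I measured. What it relies on is that badblocks in write mode (-w) writes four test patterns and reads each one back, which makes eight full passes over the disk.

```python
def badblocks_eta_hours(capacity_bytes, throughput_bytes_per_s, passes=8):
    """Estimate the hours a destructive badblocks -w run needs.

    passes=8 reflects the four write passes plus four read-back
    passes of the default -w pattern set.
    """
    if throughput_bytes_per_s <= 0:
        raise ValueError("throughput must be positive")
    total_bytes = capacity_bytes * passes
    return total_bytes / throughput_bytes_per_s / 3600


if __name__ == "__main__":
    capacity = 250 * 10**9  # 250 GB disk, decimal gigabytes
    # Both throughput figures are hypothetical, for illustration only.
    for label, mb_per_s in (("PIO (assumed)", 4), ("DMA (assumed)", 30)):
        hours = badblocks_eta_hours(capacity, mb_per_s * 10**6)
        print(f"{label}: about {hours:.1f} hours")
```

Feed it the throughput you actually observe after the first hour or so, and an unexpectedly large number is a good hint to check whether DMA is still enabled.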
While the badblocks scan was running, I found the bootable .iso of the DLG tool and ran it again on the old disk ("fail") and, once the badblocks scan had finished, on the new disk. I had to leave for Tango Argentino Class while the test was running on the old disk.
The rest was easy, thanks to LVM: Have new disk in its final place, old disk in the multibay, boot grml, partition new disk, create PV, expand VG, pvmove, reduce VG, remove old disk, send in old disk for replacement.
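The LVM part of that list (create PV, expand VG, pvmove, reduce VG) can be written down as a small dry-run planner. Treat it as a sketch: the volume group name and device paths are hypothetical placeholders, not the ones from my notebook, and the script only prints the commands instead of executing them, so nothing happens to any disk.

```python
import shlex


def lvm_migration_commands(vg, old_pv, new_pv):
    """Return the LVM commands that move a volume group to a new disk.

    The new disk is assumed to be partitioned already; old_pv and
    new_pv are the partitions used as physical volumes.
    """
    return [
        ["pvcreate", new_pv],            # create PV on the new disk
        ["vgextend", vg, new_pv],        # expand VG onto the new PV
        ["pvmove", old_pv, new_pv],      # move all extents off the old PV
        ["vgreduce", vg, old_pv],        # drop the old PV from the VG
    ]


if __name__ == "__main__":
    # Hypothetical names: adjust to the real VG and partitions.
    plan = lvm_migration_commands("vg0", "/dev/sdb2", "/dev/sda2")
    for command in plan:
        print(shlex.join(command))  # dry run: print only, never execute
```

Printing the plan first and running the commands by hand keeps a typo in a device name from turning into data loss, which is the last thing one wants right after a disk failure.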