partition table gone, data still present

I just wanted to make a USB stick bootable and wondered why mkdiskimage -4 /dev/sda 0 32 64 complained about the disk having too many cylinders. After a few moments, it occurred to me that since the switch to libata, the system hard disk has been sda and the stick was sdb or sdc. One Ctrl-C later, fdisk confirmed both suspicions: that I had accidentally started mkdiskimaging my main system hard disk, and that the partition table was already gone.
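A small guard could have caught the mix-up before anything was written. The following is a minimal sketch, not something I had in place at the time: it assumes a Linux sysfs where /sys/block/NAME/removable reads 1 for sticks (USB hard disks usually report 0), and the function name is_removable and the SYSFS_ROOT override are made up for illustration.

```shell
#!/bin/sh
# is_removable DEVICE: succeed only if the kernel flags DEVICE as removable.
# SYSFS_ROOT is a hypothetical override for testing; it defaults to /sys.
is_removable() {
    name=${1##*/}                                   # /dev/sdb -> sdb
    flag="${SYSFS_ROOT:-/sys}/block/$name/removable"
    # Fail if the flag file is missing, otherwise require the value 1.
    [ -r "$flag" ] && [ "$(cat "$flag")" = "1" ]
}

# Usage sketch (left commented out because it writes to the device):
# is_removable /dev/sdb && mkdiskimage -4 /dev/sdb 0 32 64
```

Chaining the destructive command behind the check means a wrong device name makes the whole line do nothing, rather than overwriting the system disk.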

A few hours later, the notebook is back in business without too much data loss. Lucky me.

My notebook has an "alibi" installation of Windows XP in its first primary partition, which is 4 GB in size. Since I aborted the mkdiskimage process early enough, I hoped that only the Windows installation was hosed. The Linux system lives entirely in LVM (with encrypted LVs), and LVM hadn't noticed yet that its PV was gone, presumably because the kernel was still working with the partition table it had read at boot. I reckoned that I still had a chance to rescue my data if I didn't reboot.

So I hastily commandeered a USB hard disk (thanks, $CUSTOMER, for quickly supplying one), fdisked, pvcreated and vgextended it and started the pvmove. While this was happening, I could theoretically have continued working, which I didn't because I was afraid of losing data. About half an hour into the copy process, the USB disk chose to deregister itself and the pvmove aborted. Restarting the disk made it reappear as sde (it had been sdd before), and miraculously, a new pvmove call just worked. However, a good fraction of the commands I ran in the shell now produced a lot of input/output errors on the screen. I guessed that this was the first sign of my data finally dying, but I let the pvmove continue.

After the pvmove had finished (which was rather fast, since moving an encrypted LV means that the encrypted data is copied as-is and not decrypted and re-encrypted), I had to recreate a /dev/sda2 partition with type 8e before the LVM tools allowed me to vgreduce the VG.
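For reference, the evacuation described above boils down to roughly the following sequence. This is only a sketch that prints the commands instead of running them; the volume group name vg0 and the target partition /dev/sdd1 are assumptions, and the function name evacuation_plan is made up for illustration.

```shell
#!/bin/sh
# Print (not run) the LVM commands for moving a volume group off one PV.
# Arguments: volume group, old PV, new PV. All example values are hypothetical.
evacuation_plan() {
    vg=$1
    old_pv=$2
    new_pv=$3
    echo "pvcreate $new_pv"          # initialise the new partition as a PV
    echo "vgextend $vg $new_pv"      # add the new PV to the volume group
    echo "pvmove $old_pv $new_pv"    # migrate all extents off the old PV
    echo "vgreduce $vg $old_pv"      # drop the now-empty old PV from the VG
}

evacuation_plan vg0 /dev/sda2 /dev/sdd1
```

As far as I understand the pvmove man page, calling pvmove without arguments continues a move that was interrupted, which may be why the second attempt picked up without complaint.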

I then zeroed out the first 10 GB of the internal disk and spent a few hours reinstalling and patching XP from scratch before I finally went through the pvcreate, vgextend, pvmove routine in the other direction. Then the big moment arrived: fsck of all LVs.
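Zeroing the start of a disk is a one-liner with dd. The sketch below wraps it in a function so it can be tried on a scratch file first; the name zero_head is made up, and it assumes that "10 GB" means 10 GiB, i.e. 10240 blocks of 1 MiB.

```shell
#!/bin/sh
# zero_head TARGET MIB: overwrite the first MIB mebibytes of TARGET with zeros.
# conv=notrunc keeps the rest of a regular file intact; on a block device
# there is nothing to truncate, so it makes no difference there.
zero_head() {
    dd if=/dev/zero of="$1" bs=1048576 count="$2" conv=notrunc 2>/dev/null
}

# On the real disk this would be (left commented out, it is destructive):
# zero_head /dev/sda 10240
```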

I guess that the reason for the input/output errors was that /usr was completely hosed (not a single file outside lost+found). But /home and all the other file systems holding data for which going back to the (week-old) backup would have been quite painful were OK. I pulled /, /usr and /var from the backup to be consistent again, updated to current sid (avoiding KDE 4 for the time being) and could start working again.

The only real loss was a few hours of work stored in /usr/local, which died along with /usr, so the result of my stupidity was not very catastrophic, but still a good lesson. And I have learned that LVM is a robust and resilient beast. Good work. Now I'd like to know what happened to my /usr. I always had the impression that pvmove should not break data even when the target disk suddenly goes away...

Comments

glandium on :

You could have tried gpart first, though I don't know if it can find LUKS partitions.

kju on :

In the text he says that the logical volumes are encrypted, so the partition itself (the physical volume) is probably not. And gpart can detect LVM physical volumes. So gpart should have worked.

Marc 'Zugschlus' Haber on :

Probably. I just wasn't in the mood for experiments though. I could have done a full backup before trying gpart, but that would probably have taken much more time due to the crypto operations necessary.
