Various nuggets of useful technical information.

Thursday, June 19, 2008

Fixing drives that disappear from RAID arrays on Linux

If you run cat /proc/mdstat and notice that some of the component drives are missing:

[root@gghbkup ~]# cat /proc/mdstat
Personalities : [raid1] [raid6] [raid5] [raid4] [raid0]
md0 : active raid0 hdb1[2] hde1[0] hdf1[1]
396409728 blocks 64k chunks

md1 : active raid1 sdb1[1] sda1[0]
104320 blocks [2/2] [UU]

md2 : active raid1 sda3[2] sdb3[1]
17406336 blocks [2/1] [_U]
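With more than a couple of arrays it is easy to miss the one with a missing member, so here is a minimal sketch that picks them out. It assumes the status map keeps the format shown above ([UU] for healthy, an underscore for each missing slot). The function name degraded_arrays is made up for this example.

```shell
#!/bin/sh
# List md arrays whose status map shows a missing member, e.g. "[2/1] [_U]".
# Reads mdstat-formatted text on stdin and prints one array name per line.
degraded_arrays() {
    awk '
        # An array header looks like "md2 : active raid1 sda3[2] sdb3[1]".
        /^md[0-9]+ :/ { name = $1 }
        # The status map is [UU], [_U], ...; an underscore marks a missing slot.
        /\[[U_]*_[U_]*\]/ { if (name != "") print name; name = "" }
    '
}
```

Call it as degraded_arrays < /proc/mdstat. For the output above it prints md2.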

In this case /dev/sda3 has dropped out of /dev/md2: the [2/1] [_U] status means only one of the array's two members is active. Here's what can be done:

First try re-adding it:
mdadm /dev/md2 --add /dev/sda3
If this does not work because of a "device not ready" or "device busy" error, reboot the system and see if the drive comes back up.
Then try fdisk /dev/sda to see if the drive is responding (use 'p' to print the partition table).
(Chances are it will still respond, as evidenced by /dev/sda1 still being active in /dev/md1.)
If fdisk does not work, the drive is probably dead and should be replaced.
Otherwise, run fsck /dev/sda3.
It will check and fix any errors (or simply bring the partition back up).
Then you can re-add it as shown above.
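The steps above can be sketched as a small script. This is only an illustration under a few assumptions: the reboot is left as a manual step, fdisk -l stands in for the interactive 'p' command, and the helper names run and readd_member are made up here. It defaults to a dry run that only prints the commands.

```shell
#!/bin/sh
# Sketch of the recovery steps for one missing RAID member.
# DRY_RUN=1 (the default here) prints the commands instead of running them.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# readd_member ARRAY DISK PART
# e.g. readd_member /dev/md2 /dev/sda /dev/sda3
readd_member() {
    array=$1 disk=$2 part=$3

    # Step 1: try the simple re-add first.
    if run mdadm "$array" --add "$part"; then
        return 0
    fi

    # Step 2: see if the drive still answers by listing its partition table.
    if ! run fdisk -l "$disk" >/dev/null; then
        echo "$disk does not respond; it probably needs replacing" >&2
        return 2
    fi

    # Step 3: check the partition, then try the re-add again.
    # fsck exits non-zero even when it fixed things (1 = errors corrected).
    run fsck "$part" || echo "fsck exited non-zero on $part" >&2
    run mdadm "$array" --add "$part"
}
```

Run readd_member /dev/md2 /dev/sda /dev/sda3 first to see what it would do, then again as root with DRY_RUN=0 once the commands look right.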

And then cat /proc/mdstat will show:

[root@gghbkup ~]# cat /proc/mdstat
Personalities : [raid1] [raid6] [raid5] [raid4] [raid0]
md0 : active raid0 hdb1[2] hde1[0] hdf1[1]
396409728 blocks 64k chunks

md1 : active raid1 sdb1[1] sda1[0]
104320 blocks [2/2] [UU]

md2 : active raid1 sda3[2] sdb3[1]
17406336 blocks [2/1] [_U]
[=======>.............] recovery = 38.1% (6636096/17406336) finish=21.0min speed=8532K/sec

unused devices: <none>
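To keep an eye on the rebuild without reading the whole file each time, here is a small sketch that pulls out just the progress figure. It assumes the progress line keeps the "recovery = NN.N%" layout shown above (or "resync = ..." for a resync). The function name recovery_progress is made up for this example.

```shell
#!/bin/sh
# Print "<array> <percent>" for every array that is currently rebuilding.
# Reads mdstat-formatted text on stdin.
recovery_progress() {
    awk '
        # Remember which array the following lines belong to.
        /^md[0-9]+ :/ { name = $1 }
        # Progress line, e.g. "[===>...] recovery = 38.1% (...) finish=..."
        /(recovery|resync) *=/ {
            for (i = 1; i <= NF; i++)
                if ($i ~ /%$/) print name, $i
        }
    '
}
```

Call it as recovery_progress < /proc/mdstat. For the output above it prints md2 38.1%.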

Once the recovery finishes and md2 shows [2/2] [UU] again, you're done.

