Monday, September 23, 2019

software raid - mdadm on ubuntu 10.04 - raid5 of 4 disks, one disk missing after reboot



I'm having a problem with the raid array in a server (Ubuntu 10.04).



I've got a raid5 array of 4 disks - sd[cdef], created like this:




# partition disks (repeated for sdd, sde and sdf)
parted /dev/sdc mklabel gpt
parted /dev/sdc mkpart primary ext2 1 2000GB
parted /dev/sdc set 1 raid on
# create array
mdadm --create -v --level=raid5 --raid-devices=4 /dev/md2 /dev/sdc1 /dev/sdd1 /dev/sde1 /dev/sdf1
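Note that the steps above never record the array in /etc/mdadm/mdadm.conf. As I understand it, on Ubuntu the array generally needs an ARRAY line there (and a refreshed initramfs) for boot-time assembly to pick it up properly; a minimal sketch of that step:

# record the array so boot-time assembly knows about it
mdadm --detail --scan >> /etc/mdadm/mdadm.conf
# rebuild the initramfs so early boot sees the updated config
update-initramfs -u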


This has been running fine for a couple of months.




I just applied system updates and rebooted, and the raid5 - /dev/md2 - didn't come back on boot. When I re-assembled it with mdadm --assemble --scan, it seems to have come up with only 3 of the member drives - sdf1 is missing. Here's what I can find:
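For the record, assembling it explicitly by UUID (taken from the --detail output below) should be equivalent:

mdadm --assemble /dev/md2 --uuid=1bb282b6:fe549071:3bf6c10c:6278edbc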



(Side-note: md0 & md1 are raid-1 built on a couple of drives, for / and swap respectively.)



root@dwight:~# mdadm --query --detail /dev/md2
/dev/md2:
        Version : 00.90
  Creation Time : Sun Feb 20 23:52:28 2011
     Raid Level : raid5
     Array Size : 5860540224 (5589.05 GiB 6001.19 GB)
  Used Dev Size : 1953513408 (1863.02 GiB 2000.40 GB)
   Raid Devices : 4
  Total Devices : 3
Preferred Minor : 2
    Persistence : Superblock is persistent

    Update Time : Fri Apr 8 22:10:38 2011
          State : clean, degraded
 Active Devices : 3
Working Devices : 3
 Failed Devices : 0
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 64K

           UUID : 1bb282b6:fe549071:3bf6c10c:6278edbc (local to host dwight)
         Events : 0.140

    Number   Major   Minor   RaidDevice State
       0       8       33        0      active sync   /dev/sdc1
       1       8       49        1      active sync   /dev/sdd1
       2       8       65        2      active sync   /dev/sde1
       3       0        0        3      removed


(Yes, the server's called Dwight; I'm a fan of The Office :) )



So it thinks one drive (partition really) is missing, /dev/sdf1.




root@dwight:~# mdadm --detail --scan
ARRAY /dev/md0 level=raid1 num-devices=2 metadata=00.90 UUID=c7dbadaa:7762dbf7:beb6b904:6d3aed07
ARRAY /dev/md1 level=raid1 num-devices=2 metadata=00.90 UUID=1784e912:d84242db:3bf6c10c:6278edbc
mdadm: md device /dev/md/d2 does not appear to be active.
ARRAY /dev/md2 level=raid5 num-devices=4 metadata=00.90 UUID=1bb282b6:fe549071:3bf6c10c:6278edbc


Wait, what, /dev/md/d2? What's /dev/md/d2? I didn't create that.



root@dwight:~# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md2 : active raid5 sdc1[0] sde1[2] sdd1[1]
      5860540224 blocks level 5, 64k chunk, algorithm 2 [4/3] [UUU_]

md_d2 : inactive sdf1[3](S)
      1953513408 blocks

md1 : active raid1 sdb2[1] sda2[0]
      18657728 blocks [2/2] [UU]

md0 : active raid1 sdb1[1] sda1[0]
      469725120 blocks [2/2] [UU]

unused devices: <none>


Ditto. md_d2? sd[cde]1 are in md2 properly, but sdf1 is missing, and has been swept into an inactive array of its own, flagged as a spare (the (S)).
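Best guess after some digging: md_d2 is a stray "partitionable" md device that got auto-assembled from sdf1 alone during boot, and as long as it exists it holds the partition open. I gather this tends to happen on Ubuntu when the array isn't listed in /etc/mdadm/mdadm.conf, so the boot scripts improvise a device name; that's easy enough to check:

grep ARRAY /etc/mdadm/mdadm.conf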



root@dwight:~# mdadm -v --examine /dev/sdf1
/dev/sdf1:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : 1bb282b6:fe549071:3bf6c10c:6278edbc (local to host dwight)
  Creation Time : Sun Feb 20 23:52:28 2011
     Raid Level : raid5
  Used Dev Size : 1953513408 (1863.02 GiB 2000.40 GB)
     Array Size : 5860540224 (5589.05 GiB 6001.19 GB)
   Raid Devices : 4
  Total Devices : 4
Preferred Minor : 2

    Update Time : Fri Apr 8 21:40:42 2011
          State : clean
 Active Devices : 4
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 0
       Checksum : 71136469 - correct
         Events : 114

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     3       8       81        3      active sync   /dev/sdf1

   0     0       8       33        0      active sync   /dev/sdc1
   1     1       8       49        1      active sync   /dev/sdd1
   2     2       8       65        2      active sync   /dev/sde1
   3     3       8       81        3      active sync   /dev/sdf1



...so sdf1 thinks it's part of the md2 device, is that right?



When I run that on /dev/sdc1, I get:



root@dwight:~# mdadm -v --examine /dev/sdc1
/dev/sdc1:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : 1bb282b6:fe549071:3bf6c10c:6278edbc (local to host dwight)
  Creation Time : Sun Feb 20 23:52:28 2011
     Raid Level : raid5
  Used Dev Size : 1953513408 (1863.02 GiB 2000.40 GB)
     Array Size : 5860540224 (5589.05 GiB 6001.19 GB)
   Raid Devices : 4
  Total Devices : 3
Preferred Minor : 2

    Update Time : Fri Apr 8 22:50:03 2011
          State : clean
 Active Devices : 3
Working Devices : 3
 Failed Devices : 1
  Spare Devices : 0
       Checksum : 71137458 - correct
         Events : 144

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     0       8       33        0      active sync   /dev/sdc1

   0     0       8       33        0      active sync   /dev/sdc1
   1     1       8       49        1      active sync   /dev/sdd1
   2     2       8       65        2      active sync   /dev/sde1
   3     3       0        0        3      faulty removed
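Comparing the two superblocks: sdf1's copy is stale (Events 114, last updated 21:40:42), while sdc1's has moved on (Events 144, updated 22:50:03, with device 3 marked faulty/removed). A quick way to line up all four members, assuming the device names haven't changed:

mdadm --examine /dev/sd[cdef]1 | grep -E '^/dev|Update Time|Events'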



And when I try to add sdf1 back into the /dev/md2 array, I get a busy error:



root@dwight:~# mdadm --add /dev/md2 /dev/sdf1
mdadm: Cannot open /dev/sdf1: Device or resource busy


Help! How can I add sdf1 back into the md2 array?



Thanks,





Answer



The stray md_d2 array is what's holding /dev/sdf1 busy. Stop it with mdadm -S /dev/md_d2, then try adding sdf1 again.
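A minimal sketch of the full sequence, assuming the device names above (the re-add kicks off a rebuild onto sdf1):

# stop the stray inactive array so it releases sdf1
mdadm --stop /dev/md_d2
# add the partition back into the real array
mdadm --add /dev/md2 /dev/sdf1
# watch the resync progress
cat /proc/mdstat

To keep md_d2 from reappearing at the next boot, it should also help to record the array in /etc/mdadm/mdadm.conf (see the note near the top) and run update-initramfs -u.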

