Thread: OT - 2 of 4 drives in a Raid10 array failed - Any chance of recovery?

OT - 2 of 4 drives in a Raid10 array failed - Any chance of recovery?

From
"Ow Mun Heng"
Date:
Sorry guys, I know this is very off-topic for this list, but Google hasn't
been much help. This is the RAID array my PG data resides on.

I have a 4-disk RAID10 array running on Linux MD RAID:
sda / sdb / sdc / sdd

One fine day, two of the drives suddenly decided to die on me (sda and
sdd).

I've tried multiple methods to see if I can get them back online.

1) replace sda w/ fresh drive and resync - Failed
2) replace sdd w/ fresh drive and resync - Failed
3) replace sda w/ fresh drive but keeping existing sdd and resync - Failed
4) replace sdd w/ fresh drive but keeping existing sda and resync - Failed


RAID10 is supposed to be able to withstand up to two drive failures,
provided the failures are on different sides of the mirror.

Right now, I'm not sure which drive belongs to which mirror. How do I
determine that? Does it depend on the order of devices shown in /proc/mdstat?

Thanks


Re: OT - 2 of 4 drives in a Raid10 array failed - Any chance of recovery?

From
Scott Marlowe
Date:
On Tue, Oct 20, 2009 at 1:11 AM, Ow Mun Heng <ow.mun.heng@wdc.com> wrote:
> Sorry guys, I know this is very off-track for this list, but google hasn't
> been of much help. This is my raid array on which my PG data resides.
>
> I have a 4 disk Raid10 array running on linux MD raid.
> sda / sdb / sdc / sdd
>
> One fine day, 2 of the drives just suddenly decide to die on me. (sda and
> sdd)
>
> I've tried multiple methods to try to determine if I can get them back
> online.
>
> 1) replace sda w/ fresh drive and resync - Failed
> 2) replace sdd w/ fresh drive and resync - Failed
> 3) replace sda w/ fresh drive but keeping existing sdd and resync - Failed
> 4) replace sdd w/ fresh drive but keeping existing sda and resync - Failed
>
>
> Raid10 is supposed to be able to withstand up to 2 drive failures if the
> failures are from different sides of the mirror.
>
> Right now, I'm not sure which drive belongs to which. How do I determine
> that? Does it depend on the output of /proc/mdstat and in that order?

Is this software raid in linux?  What does

cat /proc/mdstat

say?
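For reference, a healthy four-disk nested RAID10 typically reports something like the following in /proc/mdstat (device names and block counts here are illustrative, not from this thread); a failed member shows up with an (F) flag and its pair drops to [2/1] [U_]:

```
Personalities : [raid0] [raid1]
md2 : active raid0 md0[0] md1[1]
      976767872 blocks 64k chunks

md1 : active raid1 sdb1[0] sdc1[1]
      488383936 blocks [2/2] [UU]

md0 : active raid1 sda1[0] sdd1[1]
      488383936 blocks [2/2] [UU]

unused devices: <none>
```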

Re: OT - 2 of 4 drives in a Raid10 array failed - Any chance of recovery?

From
Craig Ringer
Date:
On 20/10/2009 4:41 PM, Scott Marlowe wrote:

>> I have a 4 disk Raid10 array running on linux MD raid.
>> sda / sdb / sdc / sdd
>>
>> One fine day, 2 of the drives just suddenly decide to die on me. (sda and
>> sdd)
>>
>> I've tried multiple methods to try to determine if I can get them back
>> online

You made an exact image of each drive onto new, spare drives with `dd'
or a similar disk imaging tool before trying ANYTHING, right?

Otherwise, you may well have made things worse,  particularly since
you've tried to resync the array. Even if the data was recoverable
before, it might not be now.
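For anyone in the same spot, here is a minimal imaging sketch using GNU ddrescue; the spare-drive devices (/dev/sde, /dev/sdf) and mapfile path are placeholders, not details from this thread:

```shell
# Copy the suspect drive to a spare of equal or larger size before
# attempting any recovery. ddrescue retries around bad sectors and
# keeps a mapfile so an interrupted copy can resume where it left off.
ddrescue -f -n /dev/sda /dev/sde /root/sda.map   # first pass: skip bad areas
ddrescue -f -r3 /dev/sda /dev/sde /root/sda.map  # second pass: retry bad areas

# Plain dd works too, but handles read errors less gracefully:
dd if=/dev/sdd of=/dev/sdf bs=64K conv=noerror,sync
```

All later recovery experiments can then be run against the copies instead of the originals.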



How, exactly, have the drives failed? Are they totally dead, so that the
BIOS / disk controller don't even see them? Can the partition tables be
read? Does 'file -s /dev/sda' report any output? What's the output of:

smartctl -d ata -a /dev/sda

(repeat for sdd)

?



If the problem is just a few bad sectors, you can usually just
force-re-add the drives into the array and then copy the array contents
to another drive either at a low level (with dd_rescue) or at a file
system level.
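A sketch of that force-re-add path, assuming the nested layout described later in this thread (md0 = failed RAID1 pair, md2 = outer RAID0 holding the filesystem); all device names are examples, so verify yours with mdadm --examine before running anything like this:

```shell
# Stop the broken pair, then force-assemble it from whatever members
# still respond; --force tells mdadm to accept a member with a stale
# event count rather than refusing to start the array.
mdadm --stop /dev/md0
mdadm --assemble --force /dev/md0 /dev/sda1 /dev/sdd1

# If it comes up (even degraded), mount read-only and copy out
# at the filesystem level...
mount -o ro /dev/md2 /mnt/rescue
rsync -a /mnt/rescue/ /mnt/backup/

# ...or clone the whole block device at a low level:
ddrescue -f /dev/md2 /dev/sde /root/md2.map
```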

If the problem is one or more totally fried drives, where the drive is
totally inaccessible or most of the data is hopelessly corrupt /
unreadable, then you're in a lot more trouble. RAID 10 effectively
stripes the data across the mirrored pairs, so if you lose a whole
mirrored pair you've lost half the stripes. It's not that different from
running paper through a shredder, discarding half the shreds, and lining
it all back up.


On a side note: I'm personally increasingly annoyed with the tendency of
RAID controllers (and s/w raid implementations) to treat disks with
unrepairable bad sectors as dead and fail them out of the array. That's
OK if you have a hot spare and no other drive fails during rebuild, but
it's just not good enough if failing that drive would result in the
array going into failed state. Rather than failing a drive and as a
result rendering the whole array unreadable in such situations, it
should mark the drive defective, set the array to read-only, and start
screaming for help. Way too much data gets murdered by RAID
implementations removing mildly faulty drives from already-degraded
arrays instead of just going read-only.

--
Craig Ringer

Re: OT - 2 of 4 drives in a Raid10 array failed - Any chance of recovery?

From
Greg Smith
Date:
On Tue, 20 Oct 2009, Ow Mun Heng wrote:

> Raid10 is supposed to be able to withstand up to 2 drive failures if the
> failures are from different sides of the mirror.  Right now, I'm not
> sure which drive belongs to which. How do I determine that? Does it
> depend on the output of /proc/mdstat and in that order?

You build a 4-disk RAID10 array on Linux by first building two RAID1
pairs, then striping both of the resulting /dev/mdX devices together via
RAID0.  You'll actually have 3 /dev/mdX devices around as a result.  I
suspect you're trying to execute mdadm operations on the outer RAID0, when
what you actually should be doing is fixing the bottom-level RAID1
volumes.  Unfortunately I'm not too optimistic about your case though,
because if you had a repairable situation you technically shouldn't have
lost the array in the first place--it should still be running, just in
degraded mode on both underlying RAID1 halves.

There's a good example of how to set one of these up
http://www.sanitarium.net/golug/Linux_Software_RAID.html ; note how the
RAID10 involves /dev/md{0,1,2,3} for the 6-disk volume.
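For a four-disk version of that layout, the construction looks roughly like this (illustrative commands, not taken from the linked page; which drives end up paired together depends on how the array was actually created):

```shell
# Two RAID1 mirror pairs...
mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1
mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sdc1 /dev/sdd1

# ...striped together with RAID0. The filesystem lives on /dev/md2,
# so mdadm repair operations belong on md0/md1, not md2.
mdadm --create /dev/md2 --level=0 --raid-devices=2 /dev/md0 /dev/md1
```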

Here's what will probably show you the parts you're trying to figure out:

mdadm --detail /dev/md0
mdadm --detail /dev/md1
mdadm --detail /dev/md2

That should give you an idea what md devices are hanging around and what's
inside of them.

One thing you don't see there is what devices were originally around if
they've already failed.  I highly recommend saving a copy of the mdadm
detail (and "smartctl -i" for each underlying drive) on any production
server, to make it easier to answer questions like "what's the serial
number of the drive that failed in /dev/md0?".
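A sketch of such a snapshot script (the array and drive names are examples; adjust the lists for the machine):

```shell
#!/bin/sh
# Record array topology and drive identities so "which physical drive
# was in /dev/md0?" can be answered after a failure.
OUT=/tmp/raid-inventory-$(date +%Y%m%d)
mkdir -p "$OUT"
for md in /dev/md0 /dev/md1 /dev/md2; do
    # --detail lists member devices, their state, and the array UUID;
    # "|| true" keeps the loop going on machines missing an array.
    mdadm --detail "$md" > "$OUT/$(basename "$md").detail" 2>/dev/null || true
done
for disk in /dev/sda /dev/sdb /dev/sdc /dev/sdd; do
    # smartctl -i prints the drive model and serial number.
    smartctl -i "$disk" > "$OUT/$(basename "$disk").smart" 2>/dev/null || true
done
```

Running it from cron and keeping a few dated copies is cheap insurance.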

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

Re: OT - 2 of 4 drives in a Raid10 array failed - Any chance of recovery?

From
Scott Marlowe
Date:
On Wed, Oct 21, 2009 at 12:10 AM, Greg Smith <gsmith@gregsmith.com> wrote:
> On Tue, 20 Oct 2009, Ow Mun Heng wrote:
>
>> Raid10 is supposed to be able to withstand up to 2 drive failures if the
>> failures are from different sides of the mirror.  Right now, I'm not sure
>> which drive belongs to which. How do I determine that? Does it depend on the
>> output of /proc/mdstat and in that order?
>
> You build a 4-disk RAID10 array on Linux by first building two RAID1 pairs,
> then striping both of the resulting /dev/mdX devices together via RAID0.

Actually, later versions of Linux have a direct RAID-10 level built in.
I haven't used it.  Not sure how it would look in /proc/mdstat either.

>  You'll actually have 3 /dev/mdX devices around as a result.  I suspect
> you're trying to execute mdadm operations on the outer RAID0, when what you
> actually should be doing is fixing the bottom-level RAID1 volumes.
>  Unfortunately I'm not too optimistic about your case though, because if you
> had a repairable situation you technically shouldn't have lost the array in
> the first place--it should still be running, just in degraded mode on both
> underlying RAID1 halves.

Exactly.  Sounds like both drives in a pair failed.

Re: OT - 2 of 4 drives in a Raid10 array failed - Any chance of recovery?

From
Greg Smith
Date:
On Tue, 20 Oct 2009, Craig Ringer wrote:

> You made an exact image of each drive onto new, spare drives with `dd'
> or a similar disk imaging tool before trying ANYTHING, right? Otherwise,
> you may well have made things worse, particularly since you've tried to
> resync the array. Even if the data was recoverable before, it might not
> be now.

This is actually pretty hard to screw up with Linux software RAID.  It's
not easy to corrupt a working volume by trying to add a bogus one or
typing simple commands wrong.  You'd have to botch the drive addition
process altogether and screw with something else to take out a good drive.

> If the problem is just a few bad sectors, you can usually just
> force-re-add the drives into the array and then copy the array contents
> to another drive either at a low level (with dd_rescue) or at a file
> system level.

This approach has saved me more than once.  On the flip side, I have also
more than once accidentally wiped out my only good copy of the data when
making a mistake during an attempt at stressed out heroics like this.
You certainly don't want to wander down this more complicated path if
there's a simple fix available within the context of the standard tools
for array repairs.

> On a side note: I'm personally increasingly annoyed with the tendency of
> RAID controllers (and s/w raid implementations) to treat disks with
> unrepairable bad sectors as dead and fail them out of the array.

Given how fast drives tend to go completely dead once the first error
shows up, this is a reasonable policy in general.

> Rather than failing a drive and as a result rendering the whole array
> unreadable in such situations, it should mark the drive defective, set
> the array to read-only, and start screaming for help.

The idea is great, but you have to ask just exactly how the hardware and
software involved is supposed to enforce making the array read-only.  I
don't think the ATA and similar command sets have that concept implemented
in a way you can actually do this at the level it would need to happen at
for hardware RAID to implement this idea.  Linux software RAID could keep
you from mounting the array read/write in this situation, but the way
errors percolate up from the disk devices to the array ones in Linux has
too many layers in it (especially if LVM is stuck in the middle there too)
for that to be simple to implement either.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

Re: OT - 2 of 4 drives in a Raid10 array failed - Any chance of recovery?

From
Greg Smith
Date:
On Wed, 21 Oct 2009, Scott Marlowe wrote:

> Actually, later models of linux have a direct RAID-10 level built in.
> I haven't used it.  Not sure how it would look in /proc/mdstat either.

I think I actively block memory of that because the UI on it is so cryptic
and it's been historically much more buggy than the simpler RAID0/RAID1
implementations.  But you're right that it's completely possible Ow used
it.  That would also explain the difficulty figuring out what's going on.

There's a good example of what the result looks like with failed drives in
one of the many bug reports related to that feature at
https://bugs.launchpad.net/ubuntu/intrepid/+source/linux/+bug/285156 and I
liked the discussion of some of the details here at
http://robbat2.livejournal.com/231207.html

The other hint I forgot to mention is that you should try:

mdadm --examine /dev/XXX

For each of the drives that still work, to help figure out where they fit
into the larger array.  That and --detail are what I find myself using
instead of /proc/mdstat, which provides an awful interface IMHO.
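Something like this loop (partition names assumed, not confirmed by the thread) prints each surviving member's array UUID and role in one pass:

```shell
# Dump the md superblock summary for each candidate partition.
# Members of the same array share an Array UUID, and the role/number
# fields show where each device sits within it. Output field names
# vary between 0.90 and 1.x metadata, hence the broad grep pattern;
# "|| true" keeps the loop going past unreadable drives.
for part in /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1; do
    echo "== $part =="
    mdadm --examine "$part" 2>/dev/null \
        | grep -E 'UUID|Level|Role|this' || true
done
```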

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

Re: OT - 2 of 4 drives in a Raid10 array failed - Any chance of recovery?

From
"Ow Mun Heng"
Date:

-----Original Message-----
From: Greg Smith [mailto:gsmith@gregsmith.com]
On Wed, 21 Oct 2009, Scott Marlowe wrote:

>> Actually, later models of linux have a direct RAID-10 level built in.
>> I haven't used it.  Not sure how it would look in /proc/mdstat either.

>I think I actively block memory of that because the UI on it is so cryptic
>and it's been historically much more buggy than the simpler RAID0/RAID1
>implementations.  But you're right that it's completely possible Ow used
>it.  Would explain not being able to figure out what's going on too.

You're right, newer Linux kernels all support RAID10 directly by default
and don't do the funky RAID1-first-then-RAID0 combination.
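For reference, the native version is created in a single step (illustrative command; n2 is the default "near" layout with two copies of each block):

```shell
# One md device, no nested RAID1/RAID0 devices underneath.
mdadm --create /dev/md0 --level=10 --layout=n2 --raid-devices=4 \
      /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1

# /proc/mdstat then shows a single "raid10" line instead of three arrays.
cat /proc/mdstat
```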

>There's a good example of what the result looks like with failed drives in
>one of the many bug reports related to that feature at
>https://bugs.launchpad.net/ubuntu/intrepid/+source/linux/+bug/285156 and I
>liked the discussion of some of the details here at
>http://robbat2.livejournal.com/231207.html

I actually stumbled onto that (the second link) and tried some of the
methods, but it's somewhat outdated, I think.

> The other hint I forgot to mention is that you should try:
>
> mdadm --examine /dev/XXX
>
> For each of the drives that still works, to help figure out where they fit
> into the larger array.  That and --detail are what I find myself using
> instead of /proc/mdstat , which provides an awful interface IMHO.

That's one of the problems: I'm not exactly sure.

sda1 = 1
sdb1 = 2
sdc1 = 3
sdd1 = 4

If they follow that sequence, and I've lost sda1 and sdd1, then I should
theoretically be able to recover them, but I'm not having much luck.

FYI, I've left the box as it is for now and have yet to hook it back up,
so I can't really post the output of /proc/mdstat and --examine.

But I will once I boot it up.


