Sorry guys, I know this is very off-track for this list, but google hasn't
been of much help. This is my raid array on which my PG data resides.

I have a 4 disk Raid10 array running on linux MD raid.
sda / sdb / sdc / sdd

One fine day, 2 of the drives just suddenly decide to die on me. (sda and
sdd)

I've tried multiple methods to try to determine if I can get them back
online.

1) replace sda w/ fresh drive and resync - Failed
2) replace sdd w/ fresh drive and resync - Failed
3) replace sda w/ fresh drive but keeping existing sdd and resync - Failed
4) replace sdd w/ fresh drive but keeping existing sda and resync - Failed


Raid10 is supposed to be able to withstand up to 2 drive failures if the
failures are from different sides of the mirror.

Right now, I'm not sure which drive belongs to which mirror. How do I determine
that? Does it go by the order of devices in the /proc/mdstat output?

Thanks


  • Scott Marlowe at Oct 20, 2009 at 8:41 am

    On Tue, Oct 20, 2009 at 1:11 AM, Ow Mun Heng wrote:
    Sorry guys, I know this is very off-track for this list, but google hasn't
    been of much help. This is my raid array on which my PG data resides.

    I have a 4 disk Raid10 array running on linux MD raid.
    sda / sdb / sdc / sdd

    One fine day, 2 of the drives just suddenly decide to die on me. (sda and
    sdd)

    I've tried multiple methods to try to determine if I can get them back
    online.

    1) replace sda w/ fresh drive and resync - Failed
    2) replace sdd w/ fresh drive and resync - Failed
    3) replace sda w/ fresh drive but keeping existing sdd and resync - Failed
    4) replace sdd w/ fresh drive but keeping existing sda and resync - Failed


    Raid10 is supposed to be able to withstand up to 2 drive failures if the
    failures are from different sides of the mirror.

    Right now, I'm not sure which drive belongs to which mirror. How do I determine
    that? Does it go by the order of devices in the /proc/mdstat output?
    Is this software raid in linux? What does

    cat /proc/mdstat

    say?
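
    (For reference, a degraded md RAID10 typically shows up in /proc/mdstat along
    the lines of the sketch below; the device names, sizes and chunk size are
    invented, not taken from Ow's box.)

    cat /proc/mdstat
    # Personalities : [raid10]
    # md0 : active raid10 sdb1[1] sdc1[2]
    #       976772992 blocks 64K chunks 2 near-copies [4/2] [_UU_]
    #
    # The [4/2] count and the [_UU_] pattern show which of the four slots are
    # still up (U) and which members have failed or gone missing (_).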
  • Craig Ringer at Oct 20, 2009 at 11:03 am

    On 20/10/2009 4:41 PM, Scott Marlowe wrote:

    I have a 4 disk Raid10 array running on linux MD raid.
    sda / sdb / sdc / sdd

    One fine day, 2 of the drives just suddenly decide to die on me. (sda and
    sdd)

    I've tried multiple methods to try to determine if I can get them back
    online.
    You made an exact image of each drive onto new, spare drives with `dd'
    or a similar disk imaging tool before trying ANYTHING, right?

    Otherwise, you may well have made things worse, particularly since
    you've tried to resync the array. Even if the data was recoverable
    before, it might not be now.
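
    (A minimal sketch of that imaging step with GNU ddrescue, assuming two spare
    drives at /dev/sde and /dev/sdf at least as large as the failed ones; the
    device names and log paths are hypothetical.)

    # Image each suspect drive onto a spare before experimenting further.
    # ddrescue keeps reading past bad sectors and records them in a map file;
    # -n skips the slow retry/scraping passes on the first run, -f is needed
    # because the output is a block device.
    ddrescue -f -n /dev/sda /dev/sde /root/sda.map
    ddrescue -f -n /dev/sdd /dev/sdf /root/sdd.map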



    How, exactly, have the drives failed? Are they totally dead, so that the
    BIOS / disk controller don't even see them? Can the partition tables be
    read? Does 'file -s /dev/sda' report any output? What's the output of:

    smartctl -d ata -a /dev/sda

    (repeat for sdd)

    ?



    If the problem is just a few bad sectors, you can usually just
    force-re-add the drives into the array and then copy the array contents
    to another drive either at a low level (with dd_rescue) or at a file
    system level.
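
    (Roughly what that can look like with mdadm, assuming the classic two-pair
    layout where /dev/md0 is the degraded RAID1 half holding the flaky drive; the
    exact member names have to come from --examine / --detail output.)

    # Stop the half-assembled array, then force-assemble it from the members
    # that still carry superblocks, accepting a slightly stale event counter.
    mdadm --stop /dev/md0
    mdadm --assemble --force /dev/md0 /dev/sda1 /dev/sdb1
    # A member that was kicked out entirely can sometimes be re-added afterwards:
    mdadm /dev/md0 --re-add /dev/sda1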

    If the problem is one or more totally fried drives, where the drive is
    totally inaccessible or most of the data is hopelessly corrupt /
    unreadable, then you're in a lot more trouble. RAID 10 effectively
    stripes the data across the mirrored pairs, so if you lose a whole
    mirrored pair you've lost half the stripes. It's not that different from
    running paper through a shredder, discarding half the shreds, and lining
    it all back up.


    On a side note: I'm personally increasingly annoyed with the tendency of
    RAID controllers (and s/w raid implementations) to treat disks with
    unrepairable bad sectors as dead and fail them out of the array. That's
    OK if you have a hot spare and no other drive fails during rebuild, but
    it's just not good enough if failing that drive would result in the
    array going into failed state. Rather than failing a drive and as a
    result rendering the whole array unreadable in such situations, it
    should mark the drive defective, set the array to read-only, and start
    screaming for help. Way too much data gets murdered by RAID
    implementations removing mildly faulty drives from already-degraded
    arrays instead of just going read-only.

    --
    Craig Ringer
  • Greg Smith at Oct 21, 2009 at 6:30 am

    On Tue, 20 Oct 2009, Craig Ringer wrote:

    You made an exact image of each drive onto new, spare drives with `dd'
    or a similar disk imaging tool before trying ANYTHING, right? Otherwise,
    you may well have made things worse, particularly since you've tried to
    resync the array. Even if the data was recoverable before, it might not
    be now.
    This is actually pretty hard to screw up with Linux software RAID. It's
    not easy to corrupt a working volume by trying to add a bogus one or
    typing simple commands wrong. You'd have to botch the drive addition
    process altogether and screw with something else to take out a good drive.
    If the problem is just a few bad sectors, you can usually just
    force-re-add the drives into the array and then copy the array contents
    to another drive either at a low level (with dd_rescue) or at a file
    system level.
    This approach has saved me more than once. On the flip side, I have also
    more than once accidentally wiped out my only good copy of the data when
    making a mistake during an attempt at stressed out heroics like this.
    You certainly don't want to wander down this more complicated path if
    there's a simple fix available within the context of the standard tools
    for array repairs.
    On a side note: I'm personally increasingly annoyed with the tendency of
    RAID controllers (and s/w raid implementations) to treat disks with
    unrepairable bad sectors as dead and fail them out of the array.
    Given how fast drives tend to go completely dead once the first error
    shows up, this is a reasonable policy in general.
    Rather than failing a drive and as a result rendering the whole array
    unreadable in such situations, it should mark the drive defective, set
    the array to read-only, and start screaming for help.
    The idea is great, but you have to ask just exactly how the hardware and
    software involved is supposed to enforce making the array read-only. I
    don't think the ATA and similar command sets implement that concept in a way
    that would let hardware RAID actually do this at the level where it would need
    to happen. Linux software RAID could keep
    you from mounting the array read/write in this situation, but the way
    errors percolate up from the disk devices to the array ones in Linux has
    too many layers in it (especially if LVM is stuck in the middle there too)
    for that to be simple to implement either.

    --
    * Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
  • Greg Smith at Oct 21, 2009 at 6:10 am

    On Tue, 20 Oct 2009, Ow Mun Heng wrote:

    Raid10 is supposed to be able to withstand up to 2 drive failures if the
    failures are from different sides of the mirror. Right now, I'm not sure which
    drive belongs to which mirror. How do I determine that? Does it go by the order
    of devices in the /proc/mdstat output?
    You build a 4-disk RAID10 array on Linux by first building two RAID1
    pairs, then striping both of the resulting /dev/mdX devices together via
    RAID0. You'll actually have 3 /dev/mdX devices around as a result. I
    suspect you're trying to execute mdadm operations on the outer RAID0, when
    what you actually should be doing is fixing the bottom-level RAID1
    volumes. Unfortunately I'm not too optimistic about your case though,
    because if you had a repairable situation you technically shouldn't have
    lost the array in the first place--it should still be running, just in
    degraded mode on both underlying RAID1 halves.

    There's a good example of how to set one of these up at
    http://www.sanitarium.net/golug/Linux_Software_RAID.html ; note how the
    RAID10 involves /dev/md{0,1,2,3} for the 6-disk volume.
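
    (A minimal sketch of that nested layout, with hypothetical partitions; the
    point is just that three md devices exist, and repairs are done on the two
    inner RAID1 arrays rather than the outer RAID0.)

    # Two RAID1 pairs...
    mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1
    mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sdc1 /dev/sdd1
    # ...striped together into the device that actually gets the filesystem.
    mdadm --create /dev/md2 --level=0 --raid-devices=2 /dev/md0 /dev/md1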

    Here's what will probably show you the parts you're trying to figure out:

    mdadm --detail /dev/md0
    mdadm --detail /dev/md1
    mdadm --detail /dev/md2

    That should give you an idea what md devices are hanging around and what's
    inside of them.

    One thing you don't see there is what devices were originally around if
    they've already failed. I highly recommend saving a copy of the mdadm
    detail (and "smartctl -i" for each underlying drive) on any production
    server, to make it easier to answer questions like "what's the serial
    number of the drive that failed in /dev/md0?".
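
    (One way to capture that inventory, sketched with assumed device names; adjust
    the md and sd device lists to match the real server.)

    # Save the array layout plus each drive's model and serial number so a dead
    # drive can still be matched to a physical slot later.
    for md in /dev/md0 /dev/md1 /dev/md2; do mdadm --detail "$md"; done \
        > /root/raid-layout.txt
    for disk in /dev/sd[abcd]; do smartctl -i "$disk"; done \
        > /root/drive-serials.txt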

    --
    * Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
  • Scott Marlowe at Oct 21, 2009 at 6:25 am

    On Wed, Oct 21, 2009 at 12:10 AM, Greg Smith wrote:
    On Tue, 20 Oct 2009, Ow Mun Heng wrote:

    Raid10 is supposed to be able to withstand up to 2 drive failures if the
    failures are from different sides of the mirror. Right now, I'm not sure which
    drive belongs to which mirror. How do I determine that? Does it go by the order
    of devices in the /proc/mdstat output?
    You build a 4-disk RAID10 array on Linux by first building two RAID1 pairs,
    then striping both of the resulting /dev/mdX devices together via RAID0.
    Actually, later models of linux have a direct RAID-10 level built in.
    I haven't used it. Not sure how it would look in /proc/mdstat either.
    You'll actually have 3 /dev/mdX devices around as a result.  I suspect
    you're trying to execute mdadm operations on the outer RAID0, when what you
    actually should be doing is fixing the bottom-level RAID1 volumes.
    Unfortunately I'm not too optimistic about your case though, because if you
    had a repairable situation you technically shouldn't have lost the array in
    the first place--it should still be running, just in degraded mode on both
    underlying RAID1 halves.
    Exactly. Sounds like both drives in a pair failed.
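
    (For reference, the single-device RAID10 Scott mentions is created in one
    step, roughly as sketched below; there is then only one /dev/mdX, and
    /proc/mdstat reports it as "raid10" directly. Device names are assumptions.)

    # Native md RAID10 across four drives, default near-copies (n2) layout.
    mdadm --create /dev/md0 --level=10 --raid-devices=4 \
          /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1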
  • Greg Smith at Oct 21, 2009 at 6:59 am

    On Wed, 21 Oct 2009, Scott Marlowe wrote:

    Actually, later models of linux have a direct RAID-10 level built in.
    I haven't used it. Not sure how it would look in /proc/mdstat either.
    I think I actively block memory of that because the UI on it is so cryptic
    and it's been historically much more buggy than the simpler RAID0/RAID1
    implementations. But you're right that it's completely possible Ow used
    it. Would explain not being able to figure out what's going on too.

    There's a good example of what the result looks like with failed drives in
    one of the many bug reports related to that feature at
    https://bugs.launchpad.net/ubuntu/intrepid/+source/linux/+bug/285156 and I
    liked the discussion of some of the details here at
    http://robbat2.livejournal.com/231207.html

    The other hint I forgot to mention is that you should try:

    mdadm --examine /dev/XXX

    For each of the drives that still works, to help figure out where they fit
    into the larger array. That and --detail are what I find myself using
    instead of /proc/mdstat , which provides an awful interface IMHO.
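
    (A hedged sketch of using --examine to map survivors back to their slots,
    assuming sdb1 and sdc1 are the drives that still respond; the grep pattern
    just pulls out the lines that identify the array and the member's role.)

    for part in /dev/sdb1 /dev/sdc1; do
        echo "== $part =="
        mdadm --examine "$part" | grep -E 'Level|Raid Devices|UUID|this|Device Role'
    done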

    --
    * Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
  • Ow Mun Heng at Oct 21, 2009 at 9:16 am
    -----Original Message-----
    From: Greg Smith
    On Wed, 21 Oct 2009, Scott Marlowe wrote:

    Actually, later models of linux have a direct RAID-10 level built in.
    I haven't used it. Not sure how it would look in /proc/mdstat either.
    I think I actively block memory of that because the UI on it is so cryptic
    and it's been historically much more buggy than the simpler RAID0/RAID1
    implementations. But you're right that it's completely possible Ow used
    it. Would explain not being able to figure out what's going on too.
    You're right, newer Linux supports RAID10 directly by default and doesn't do
    the funky RAID1-first-then-RAID0 combination.
    There's a good example of what the result looks like with failed drives in
    one of the many bug reports related to that feature at
    https://bugs.launchpad.net/ubuntu/intrepid/+source/linux/+bug/285156 and I
    liked the discussion of some of the details here at
    http://robbat2.livejournal.com/231207.html
    I actually stumbled onto that (the 2nd link) and tried some of the methods,
    but it's actually kind of outdated, I think.
    The other hint I forgot to mention is that you should try:
    mdadm --examine /dev/XXX
    For each of the drives that still works, to help figure out where they fit
    into the larger array. That and --detail are what I find myself using
    instead of /proc/mdstat , which provides an awful interface IMHO.
    That's one of the problems; I'm not exactly sure.

    sda1 = 1
    sdb1 = 2
    sdc1 = 3
    sdd1 = 4

    If they follow that sequence and I'm losing sda1 and sdd1, I should
    theoretically be able to recover them, but I'm not having much luck.

    FYI, I've left the box as it is for now and have yet to connect it back up,
    hence I can't really post the output of /proc/mdstat or --examine.

    But I will once I boot it up.
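
    (For what it's worth: with the default near=2 layout of a 4-drive native md
    RAID10, role 0 mirrors role 1 and role 2 mirrors role 3, so losing roles 0
    and 3 should indeed be survivable, while losing 0 and 1 is not. The "Device
    Role" / "this" line from --examine on each surviving drive is what confirms
    which roles sdb1 and sdc1 actually hold, e.g.:)

    # Run against the surviving members once the box is back up (assumed names).
    mdadm --examine /dev/sdb1 | grep -iE 'device role|this'
    mdadm --examine /dev/sdc1 | grep -iE 'device role|this'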



