
replacing a failed Sun LVM mirror

The problem with mirroring your disks is that one side of the mirror will invariably fail two weeks later. This has happened to me several times, first under NetBSD (with its excellent RAIDframe technology, a worthy competitor, functionally, to Sun Volume Manager) and now with the Sun LVM mirror that I set up several weeks ago and documented in this very blog.

I called Sun support, and they shipped me a new disk. Here’s how I went about replacing the failed device, without incurring any downtime (yay, Sun hot-swappable parts)!

From my previous entry, you’ll remember that I have five mirrors: d10, d20, d30, d40 and d50. Each of these has two submirrors: d11 and d12, d21 and d22, and so on. Because the second disk failed, I needed to detach and clear submirrors d12, d22, d32, d42 and d52. This was accomplished with the following series of commands (only the first and last pairs are shown here; the ones in between follow the same pattern):

# metadetach -f d10 d12
d10: submirror d12 is detached
# metaclear -f d12
d12: Concat/Stripe is cleared
# metadetach -f d50 d52
d50: submirror d52 is detached
# metaclear -f d52
d52: Concat/Stripe is cleared
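
Since the same two commands repeat for each mirror, the whole series can be generated with a loop. What follows is a minimal dry-run sketch, not what I actually typed: the function name print_detach_cmds is made up for illustration, and it assumes the naming scheme above (mirror dN0, failed submirror dN2). It only prints the commands, so you can review them before running anything.

```shell
#!/bin/sh
# Dry-run sketch: print the detach/clear commands for the second submirror
# of each of the five mirrors. Nothing is executed here.
# Assumes the naming scheme from this post: mirror dN0, failed submirror dN2.
print_detach_cmds() {
    for n in 1 2 3 4 5; do
        echo "metadetach -f d${n}0 d${n}2"
        echo "metaclear -f d${n}2"
    done
}

print_detach_cmds
```

If the printed list looks right, it can be piped to sh on the affected host.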

Before actually removing the drive, I still needed to delete the state database replicas that resided on the failed drive:

# metadb -d c1t1d0s4

Now I could use the Solaris device management commands to offline the drive and remove it: cfgadm -al to get the Ap_id of the device, and then, in my case, cfgadm -c unconfigure c1::dsk/c1t1d0 to unconfigure the disk. (On the Sun Fire V210, the blue service LED next to the disk will light, indicating that it’s safe to remove.)
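
The Ap_id in my case follows a controller::dsk/disk pattern. As a rough sketch, the helper below builds an Ap_id of that shape from a disk name such as c1t1d0. The function name ap_id_for is hypothetical, and the pattern is an assumption drawn from the single example above, so always confirm the real Ap_id against the output of cfgadm -al before unconfiguring anything.

```shell
#!/bin/sh
# Sketch: derive an Ap_id of the form cN::dsk/cNtNdN from a disk name.
# Assumption: the Ap_id follows the same pattern as c1::dsk/c1t1d0 above;
# verify against `cfgadm -al` on your own system.
ap_id_for() {
    disk=$1
    ctrl=${disk%%t*}    # strip everything from the first "t": c1t1d0 -> c1
    echo "${ctrl}::dsk/${disk}"
}

# Print (not run) the unconfigure command for the failed disk in this post.
echo "cfgadm -c unconfigure $(ap_id_for c1t1d0)"
```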

Once the disk has been replaced, we can cfgadm -c configure c1::dsk/c1t1d0 to mark the new disk online and then proceed to rebuild the LVM mirror. First, we need to copy the Volume Table of Contents (VTOC) from the working device to the replacement device:

# prtvtoc /dev/rdsk/c1t0d0s2 | fmthard -s - /dev/rdsk/c1t1d0s2

Then rebuild the state database replicas on the replacement device:

# metadb -a -c 2 c1t1d0s4

Finally, create the replacement submirrors and attach them to the mirrors:

# metainit d12 1 1 c1t1d0s0
# metattach d10 d12
(repeat for every remaining submirror, using its own slice on the new disk)
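
The metainit/metattach pair above can also be generated per submirror. This is a dry-run sketch under stated assumptions: the function name print_attach_cmds and the mirror:submirror:slice argument format are invented for illustration, and only the d10:d12:c1t1d0s0 mapping comes from this post. The slices for the other submirrors depend on your own layout, so supply them yourself.

```shell
#!/bin/sh
# Dry-run sketch: print metainit/metattach commands for each replacement
# submirror. Each argument is a "mirror:submirror:slice" triple supplied by
# the caller. Nothing is executed here.
print_attach_cmds() {
    for triple in "$@"; do
        mirror=${triple%%:*}    # text before the first colon
        rest=${triple#*:}       # text after the first colon
        sub=${rest%%:*}         # submirror name
        slice=${rest#*:}        # disk slice backing the submirror
        echo "metainit $sub 1 1 $slice"
        echo "metattach $mirror $sub"
    done
}

# Only this triple is taken from the post; add the others from your own layout.
print_attach_cmds d10:d12:c1t1d0s0
```

Passing the layout as arguments keeps the slice mapping explicit rather than guessed from the submirror name.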

Now, running metastat should show all the mirrors in a Resyncing state. You’re done!
