This article is archived here for future reference. It was originally published on the holyhandgrenade.org website, where I found it in the middle of the night while searching for solutions after a storage crash. That site appears to have been hacked since then and its content began to disappear, so I saved a copy here in case it can rescue someone else who has spent several late-night hours tracking down the core of the same problem. All credit goes to the original author, Jeff.
First, some keyword spam so this turns up to people who need it: this should apply to all IBM Midrange Series Storage SANs including the DS3200, DS3300, DS3400, DS3950, DS4000, DS4100, DS4200, DS4300, DS4400, DS4500, DS4700, DS4800, DS5020, DS5100, and DS5300. (Whew.)
SANs are important, mission-critical pieces of storage hardware, and as we all know, it’s important to manage change in the environment. However, sometimes mistakes happen — sooner or later, someone is going to delete the wrong LUN. IBM doesn’t really make clear how to recover this without technical support involved, and I can understand why — it’s an important thing to get right.
However, especially late at night when IBM’s Remote Support Center runs on a skeleton crew and can take a few hours to turn around a ticket, we can’t always rely on a timely response from IBM support in order to recover the disk. Since this is a largely undocumented procedure, I’m going to put it out there in the hopes that it helps someone else.
When doing advanced work, I tend to work from the command line using the SMcli utility. However, you can also run scripts in the graphical Storage Manager application. The functionality is oddly hidden in the root window of the DS Storage Manager 10 client, on the screen where you choose your SAN to manage. To access it, right-click your SAN and click “Execute Script.” The script editor window will open. (It would make a lot more sense to put this functionality into the Advanced menu of one of the managed SANs.)
The command-line reference guide for IBM Midrange Storage (LSI) SANs mentions the recover logicalDrive command:
recover logicalDrive (drive=(enclosureID,drawerID,slotID) | Drives=(enclosureID1,drawerID1,slotID1 … enclosureIDn,drawerIDn,slotIDn) | array=ArrayName) [newArray=arrayName] userLabel="logicalDriveName" capacity=logicalDriveCapacity offset=offsetValue raidLevel=(0 | 1 | 3 | 5 | 6) segmentSize=segmentSizeValue [owner=(a | b) cacheReadPrefetch=(TRUE | FALSE)]
However, it doesn’t tell you where to get the LUN sizes, segment sizes, offsets and other numbers that you need for a successful recovery. Luckily, there are a couple of places where you can find them.
If you’ve collected support data recently, you can look inside the support bundle .zip and locate a file called recoveryProfile.csv. If you don’t have a support bundle handy, you might still be in luck — the DS Storage Manager application keeps a copy in its program directory, and you can usually find it at C:\Program Files\IBM_DS\client\data\recovery, ending in _Recovery_Profile.csv and named for the SAN you’re managing. Look at all the lines beginning with Volume, and locate the one that contains the LU name that you’re looking for. It should look like this:
As far as I can tell, the fields are:
- Object type (volume, volume group, etc.)
- Volume NAA ID
- Volume name
- Owning array NAA ID
- Block size (typically 512; this might be 4096 on SSD or high-capacity disks with 4k blocks, but I have none of these to test with)
- LUN size in bytes
- Starting offset; on this LUN the unit appears to be (bytes / 2048) but I can’t figure out why
- Segment size in bytes
- Two integers/booleans I haven’t identified
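The field list above can be turned into a small lookup helper. This is only a sketch under several assumptions: that the profile is plain comma-separated text, that the fields appear in exactly the order guessed above, and that the volume name is an exact match in the third column. The field names in FIELDS and the function name find_volume are made up for illustration and are not part of any IBM tool. Check the output against your own profile before trusting it.

```python
import csv
from typing import Dict, Iterable, Optional

# Hypothetical labels for the columns, in the order guessed in the list above.
FIELDS = [
    "object_type", "volume_naa_id", "volume_name", "array_naa_id",
    "block_size", "size_bytes", "offset", "segment_size_bytes",
    "unknown_1", "unknown_2",
]


def find_volume(lines: Iterable[str], lu_name: str) -> Optional[Dict[str, str]]:
    """Return the first row that starts with 'Volume' and names lu_name.

    Values are returned as raw strings; nothing is converted or rescaled.
    """
    for row in csv.reader(lines):
        # Assumption: the object type is the first column and the
        # volume name is the third column.
        if len(row) < 8 or not row[0].strip().lower().startswith("volume"):
            continue
        if row[2].strip() == lu_name:
            return dict(zip(FIELDS, (cell.strip() for cell in row)))
    return None
```

Passing an open file object as lines works because csv.reader accepts any iterable of strings. The function returns None when no matching row is found.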
You can take this information and feed it right back into that recover logicalDrive command from the guide:
SMcli -n My_SAN -p My_Password -c 'recover logicalDrive array=My_Array userLabel="My_LU" capacity=805306368000 offset=393216000 raidLevel=5 segmentSize=64;'
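Note that the segment size needs to be converted from bytes into kilobytes: the recovery profile records it in bytes, so a 65536-byte segment becomes segmentSize=64 in the command.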
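To avoid typing mistakes at 3 a.m., the script command can be assembled programmatically. The sketch below only builds the string that goes inside SMcli's -c argument and does not talk to the SAN. The function name build_recover_command and its parameter names are hypothetical, and it covers only the array= form with the parameters used in the example above.

```python
def build_recover_command(array: str, user_label: str, capacity_bytes: int,
                          offset: int, raid_level: int,
                          segment_size_bytes: int) -> str:
    """Build a 'recover logicalDrive' script command string.

    The segment size is taken in bytes (as found in the recovery profile)
    and converted to kilobytes for the segmentSize parameter.
    """
    if segment_size_bytes <= 0 or segment_size_bytes % 1024 != 0:
        raise ValueError("segment size must be a positive multiple of 1024 bytes")
    segment_kb = segment_size_bytes // 1024
    return (
        f"recover logicalDrive array={array} "
        f'userLabel="{user_label}" '
        f"capacity={capacity_bytes} offset={offset} "
        f"raidLevel={raid_level} segmentSize={segment_kb};"
    )


if __name__ == "__main__":
    # Values taken from the example command in this article.
    print(build_recover_command("My_Array", "My_LU", 805306368000,
                                393216000, 5, 65536))
```

With the values from the example, this prints the same script text that appears inside the quotes of the SMcli command above. Review the result by eye before running it against a production SAN.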
One thing I haven’t figured out is how to preserve the old NAA ID on the LUNs, if this is at all possible. This generally isn’t important, but notably can cause problems with signaturing in VMware.
Expect a follow-up post on restoring an entire physical array.