Well, that was a thrilling exercise…

Date January 13, 2010


Every once in a while, we’ve got to exercise those “data recovery” muscles. It’s not typically fun, it can wind up in cold sweats, but if you’ve planned and worked hard, it usually resolves itself alright.

A while back, I was told that I had to do a historic search through our email for a couple of strings, and that it was due in the middle of this month. Of course, I procrastinated. I mean heck, this was a couple of months ago. A couple of months ago, the assignment was due next year. Why rush? Right?

Fast forward to last week, when the assignment was no longer due next year. In fact, I was informed that it was due next week. Awesome. So I made immediate plans to extract the data from the antiquated mail server that was originally installed circa 2005. We keep it around for sentimental reasons, and also because it still houses a ton of mail that we haven’t imported into Global Relay.

I was designing the most efficient plan of attack in the back of my mind while I attended the LOPSA-NJ meeting when I got an email. Next week? Not so much. The entirety of the emails was required the next morning, in full. Looking at the top of my phone, I saw that it was 9:30pm. I estimated that after the 45-minute drive home, plus the 15-minute trip to work, I could probably bang out the emails and be home before midnight. As it turns out, I was a little off. I recall stumbling into bed at 4am on Friday morning. Vaguely.

Friday went smoothly, probably. I don’t remember, exactly, but I showed up and apparently made it home, and had a relaxing weekend which involved more sleep than the entire week before. Which was good, because on Monday I found out that it wasn’t enough to extract the emails. I had to provide every client report mentioned, referenced, or otherwise alluded to in the entire span of messages. Awesome.

Now, the data that I had dragged out of the mail server was vintage 2005-2006. And from clients we no longer had. I don’t know about where you’re from, but 5 year old data from legacy clients doesn’t sit around on my SAN eating up expensive drive space. It was on a tape. Unfortunately, it was on a 5 year old tape. Depending on the exact month, it could have been on a tape 2 generations old or 3 generations old.

As luck would have it, it was on a 2 generation old tape, a VXA-2, to be precise. The tape itself was a VXA-3, but we hadn’t upgraded the tape drive in the changer yet, which is fortunate, because the tape drive in the changer was shot. Shot as in “dd if=/dev/zero of=/dev/st1 bs=1k count=100” pulled an I/O error after 3 minutes. Awesome.

As luck would have it, when we originally ordered the tape changer, we ordered an equivalent standalone drive for our other office, which would serve as a backup site, if need be. I pulled the drive off the shelf, unplugged the tape library from the now-unpowered server, wired it up to the standalone drive, powered everything on, and crossed every phalange I owned.

As it turns out, the standalone drive worked, owing in large part, I’m sure, to the fact that it had probably read 2 or 3 tapes in its long, uneventful life. At this time, it was around 4pm. I located a tape set from an appropriate date range that probably still included the client files, popped in the first tape, and used ‘tar’ to get a directory listing, praying that it didn’t give me an I/O error. And my prayers were rewarded by a reassuring list of hundreds of files. I killed the output, initiated a screen session, and started extracting to a volume that had a couple of tapes’ worth of space free.
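
A minimal sketch of that list-then-extract step with tar might look like the following. The post doesn’t record the exact commands, so treat this as an illustration only: the function names are hypothetical, and the device path and restore directory in the usage comments are assumptions.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative, not from the original post.

# List an archive's contents without extracting anything.
# A clean listing is a cheap check that the media is still readable.
list_archive() {
    tar -tvf "$1"
}

# Extract an archive into a destination directory, creating it if needed.
extract_archive() {
    mkdir -p "$2" && tar -xf "$1" -C "$2"
}

# Example usage against a tape drive (device path and target are assumptions):
#   list_archive /dev/st0 | head      # peek at the first few entries
#   screen -S restore                 # keep the job alive after logging out
#   extract_archive /dev/st0 /mnt/restore
```

For a backup set that spans several tapes, GNU tar’s -M (--multi-volume) option prompts for the next volume when the current one runs out, which is worth running inside screen so the prompt is still there in the morning.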

This morning, I came back in to check on it. I wasn’t asked to insert the next tape until nearly 10am. I can’t imagine why we moved beyond VXA. By noon, the client data that I needed had been extracted completely. I verified it for sanity, and sent an exuberant instant message that I had all of the data, and everything was fine.

This isn’t the first time I’ve had to pull old data off of a tape, but it is a rarity in my company, something that I’m thankful for. Typically, my biggest problem is locating the right tape set from the right date. My old tapes are in somewhat of a disarray, and I wasn’t always so clear as I am now about what goes on a tape. I was shuddering as I went back through and saw things like “Sunday, 15th. File Sync”. Well, that’s helpful. I’d go back in time and smack myself, but I have no idea when I made that mistake.

In any event, I recovered the data that I needed. There IS a punchline to this story though. A few minutes after I sent the IM, I got a visit at my desk. I was thanked heartily for my efforts, but as it turned out, the original request for the data had been made erroneously. We didn’t need to provide any emails or client data related to the original search. There may be some other search terms that I might need to look for though…

Oh, and lest there be any doubt: while I’ve got the drive hooked up and working, it’s time to migrate some old data onto next-gen tapes. Progress!



7 Responses to “Well, that was a thrilling exercise…”

  1. Warll said:

    Do tapes still have the storage and economic advantage? All I’ve ever heard are horror stories about failing tapes. Wouldn’t it make more sense to rotate through redundant consumer hard drives every few years? Has a hard drive ever failed while sitting on a shelf?

  2. jtimberman said:

    I have a soft spot for VXA, mainly because I did a *lot* of testing of the VXA-1 drive by Ecrix for Linux when I worked for the BRU guys. I liked the drive so much, when it was time to buy a tape drive for my own home server backups, I went VXA-1.

    It’s still in the server and works, though I use other means for backup (mainly DVD, for that which isn’t handled via Git repository).

  3. Chris said:

    Most places I come across have a policy of rolling forward tapes, or staying with tapes that have some sort of extended support, like the Ultrium LTO tapes.

    Tapes are good for long term archival, as long as they are stored properly. I’ve seen many tapes that have degraded over time, or have grown some curious stuff on them.

    Warll, hard drives are fairly popular for backups now, but tapes still rule the archival world.

  4. AJ L. said:

    For what I do here I use DVDs for archived backups, but nothing I have here is more than a year old so….

  5. Jim said:

    So far we’ve had good luck with tape (knocks on wood). Due to the nature of our work, everything we’ve worked on for the past ten years is archived in some form on tape. And so far, the tape has held out well for us. When last I checked, we had somewhere around 40TB of data backed up on AIT1, 2 and 4 tapes. One of our current (and most often put off) projects is moving all that old AIT1 and 2 to newer tapes.

  6. Matt Simmons said:

    Warll: There’s nothing wrong with storing data on hard drives. The only long term problem would be, 10 years later, finding a computer with the appropriate drive interface. We’re still transitioning between PATA and SATA interfaces, but it’s getting harder to find a mobo that still has PATA. It’s probably going to disappear completely soonish, and then our only recourse will be adding a PCI card with a PATA bus on it. The same could be said of tapes and drives, or drive interfaces and SCSI bus, too.

    In the end, we’ve got a tape infrastructure, so we use tape. I do plan on switching my dailies from tape to virtual tapes Real Soon Now. That way, I’m not slowly destroying the moving parts in a tape drive, like I have been.

  7. Anthony said:

    More interesting to me is all the unnecessary work done. While it sounds like you procrastinated ‘unintentionally’, I’ve found over the years that it is usually a wise idea to wait a few days before working on something like this. At least 90% of the time the request changes in the first 3 or 4 days to something completely different. Usually because they come to you asking for things before they know what they really want.

    I’ve also found that a good percentage of time, even after finding out what is really wanted, you go through the work only to find they don’t need it anymore for whatever reason.

    A marketing company I used to work for would register domain names for clients on the off chance that they might sell that idea to the client. I don’t know how many thousands of domains they went through before even pitching the concept to the clients. Half the time the domains they registered didn’t even get pitched. Granted it isn’t all that much money in the greater scheme of things but still…
