Timing is Key

Date January 28, 2011

I don't know if you've heard, but my area in New Jersey has gotten a lot of snow lately. Right now, I'm looking at the back parking lot, and there's snow around two feet deep everywhere that isn't plowed or covered by 12-foot plow droppings. Earlier this week, we were at work as the most recent snowfall was starting, and I was trying to encourage my junior admin to leave the office before it hit. "I don't know, it looks alright," he said, and I replied,

"If you wait until it's obvious, it's too late."

It sort of hit me. That last sentence has become my de facto motto when it comes to a lot of things. I think the first time it occurred to me was when I started researching IPv6. The depletion of IPv4 is no surprise, and hasn't been for quite a while, but it seems like most people are holding off on even researching IPv6 until it becomes obvious that they need it. Again, by that time, if you're in any kind of competitive company vying for market position, it'll be too late. It won't be obvious that it's necessary until you see people financially punished for not taking those steps.

Another thing I've noticed within my own company is that so much of the concentration is on day-to-day operations that it's rare for someone to step back and think, "Is this really a sustainable practice?" It's getting better since we created a QA role, but still, the first thing I do after putting out a fire is figure out why something wasn't made flame-retardant a long time ago.

I'm going through practices that we have on paper but aren't performing (like enforced data lifetimes), plus I'm implementing things that will help the environment scale further, such as sharding my data sets. Only by continuing to steer these things in the direction I want them to go can I prevent future disasters. If I wait until it's obvious that there's a problem, then it's too late.

I know that I'm not the only one. My friend Robin Harris published a column back in 2007 titled "Why RAID 5 stops working in 2009", which explained that the rate of unrecoverable read errors (UREs) wasn't decreasing at the same rate that disk capacity was increasing. Soon, he argued, arrays would commonly be large enough that hitting a URE during the rebuild after a drive failure would be nearly guaranteed. He was right; right now, no one would recommend RAID-5 for an array of any significant size.

Last year, he published a follow-up article, "Why RAID-6 stops working in 2019". He's right again, and it may well happen sooner than that.
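To put a rough number on the RAID 5 argument, here's a back-of-the-envelope sketch. Everything in it is an assumption on my part rather than a figure from this post: a URE rate of 1 in 10^14 bits (a commonly quoted spec for consumer SATA drives), errors that are independent of each other, and a rebuild that has to read every surviving drive in full. The function name and the example array are made up for illustration.

```python
import math


def rebuild_ure_probability(surviving_drives: int,
                            drive_tb: float,
                            ure_per_bit: float = 1e-14) -> float:
    """Chance of at least one URE while reading every surviving drive in full.

    Assumes independent bit errors at a constant rate, which is a
    simplification of how real drives fail.
    """
    # Decimal terabytes -> bytes -> bits that the rebuild has to read.
    bits_to_read = surviving_drives * drive_tb * 1e12 * 8
    # 1 - (1 - p)^n, computed with log1p/expm1 so the tiny p isn't lost
    # to floating-point rounding.
    return -math.expm1(bits_to_read * math.log1p(-ure_per_bit))


# Hypothetical array: 7 x 2 TB drives in RAID 5, one drive has failed,
# so the rebuild reads the 6 survivors end to end.
p = rebuild_ure_probability(surviving_drives=6, drive_tb=2.0)
print(f"Chance of a URE during rebuild: {p:.0%}")
```

With those assumptions the answer comes out a little over 60%, and it only climbs as drives get bigger while the error rate stays put. Real drives may do better or worse than their spec sheet, so treat this as the shape of the problem, not a prediction.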

Our jobs are hard. They take up a lot of time, but the day-to-day requirements of user and machine maintenance are only part of the story. We need to keep an eye out for oncoming trucks, especially when no one else is doing that job for our company.

What kinds of things do you see looming? What have you noticed but others have missed? Share your thoughts in the comments, because you might open someone's eyes to a problem they never knew was sneaking up on them.

6 Responses to “Timing is Key”

  1. Tweets that mention Timing is Key | Standalone Sysadmin -- Topsy.com said:

    [...] This post was mentioned on Twitter by Matt Simmons, Jeff Hengesbach, Trever Miller, Andreas Olsson, Thought Provoking and others. Thought Provoking said: Timing is Key - If you wait until it's obvious, it's too late: Comments http://digfoc.us/e3OSbr [...]

  2. Fred Woodbridge said:

    This immediately put me in mind of the OODA loop. Do you know about it? It essentially says the same thing about the obviousness of a thing before action/reaction.

  3. Scott said:

    I've pretty much given up on anything but RAID 10 at this point. And in a large array, I still have 2 hotspares.

  4. Matt Simmons said:

    I've seen OODA, but I haven't implemented it as such. I can see the similarities, though.

    I've got 2 hot spares as well. It's the only way to feel alright about a drive failure.

  5. Study: Obvious and Hidden of OODA by Ho-Sheng Hsiao - Quora said:

    [...] http://www.standalone-sysadmin.c... “If you wait until it’s obvious, it’s too late” It sort of hit me. That last sentence has [...]

  6. IANA Finally Out of IPv4 Addresses | Standalone Sysadmin said:

    [...] it obvious yet? « Previous Entry Timing is Key Send this post [...]
