Re: System Administration needs more PhDs

Tom Limoncelli has a post up today entitled “System administration needs more PhDs”.

He makes some great observations and brings up a lot of interesting questions. The one that I think the others flow from is “Why are good practices so rarely adopted?”

My opinion, gained through observation, is that sysadmins arise from one of two places. Either they start out in relative isolation, or they come from an environment with multiple systems administrators.

The former develop their own ways of doing things through trial and error and/or research. This leads to endless ways of accomplishing the same or similar tasks. The utter heterogeneity of possible platform combinations lends itself to having each admin reinvent the wheel.

The latter typically have an established infrastructure in place, a well-defined set of hardware, a much more rigid structure of procedures, and usually a bona fide methodology for change management.

The standalone sysadmin almost never resembles the well-trained sysadmin because best practices all seem to be vendor-driven, reliant on a subset of devices and situations, and hidden as well as possible behind support and contract agreements.

Those are hurdles the lone sysadmin faces AFTER he has discovered the “optimal solution”, whatever that is. You mention puppet. Should you use cfengine or puppet? Unless you know about puppet, you’ll use cfengine, unless you haven’t heard of that either, in which case you’ll roll your own. In my experience, you’ll find $betterSolution right as you’re implementing $bestSolutionYouKnowAbout.

I don’t know whether there are more sysadmins working alone than working in teams, but there are a _lot_ of sysadmins out there by themselves.

By themselves, sysadmins rely on their own cleverness, but together you get a synergy of ideas. The whole becomes smarter than the sum of the individuals, but most sysadmins never get to experience that. That’s one of the reasons I started my blog: to shed light on what other people are doing, how they operate in their organizations, and so on.

Your books are a great resource for sysadmins, but the lone sysadmins of the world need to start communicating among themselves, and with the “institutional” admins out there. The same solution won’t always work, but the sharing will go a way toward a meritocracy of

Dell’s DRAC card sucks

I’ve worked with my 1855 blade enclosure for a while now, and I feel pretty confident in saying the following:

A) Dell’s DRAC is a very useful device which facilitates remote administration

B) at least it would, if it didn’t suck so much

The blade enclosure comes with a DRAC module that is inserted into a slot in the back. It’s paired with an Avocent KVM module connected to the blade units. You access the KVM through the DRAC web site, which is the real problem.

Every Dell technician I’ve complained to has said the same thing: “Yes, the DRAC is slow. Very slow, and underpowered.” It’s not just slow; it vacillates between borderline usable and completely unusable. On a good day, expect 3 minutes for the page to load. On a bad day, don’t expect to load the whole page.

It also seems to occasionally lose track of the KVM. It’s happened a few times so far, and there doesn’t seem to be any reason for it. Either the DRAC will see the KVM and not be able to administer it (like what is happening right now), or the DRAC won’t see the KVM at all.

It is very frustrating, and Dell’s techs seem apologetic, but there doesn’t appear to be a fix for it. It just sucks.

Next time I’m at the colocation facility, I’m just hooking the console port into the KVM that is connected to the non-blade servers. At least that way I’m not reliant on Dell’s sorry excuse for a controller to access the video on my servers.