March 12, 2010
My company has decided that I need to learn more about administration of the Postgres database…which is to say that I should learn something about it. My knowledge is really pretty scant at the moment.
To that end, they’re sending my boss, my junior admin, and me to PostgreSQL East, a conference held in Philadelphia from March 25-28th. We’re doing the conference thing, plus doing training on Sunday.
Anyone out there attending, too?
Posted in General
2 Comments »
March 10, 2010
Before I start, I just want you to know that I’m not whining, I just thought I’d give this as an example of some of the things that people who run small infrastructures are left out of…
Today I’m sitting in the office in NJ, doing work as normal. What I’d prefer to be doing is going to the IT Roadmap Conference & Expo in NYC. According to the website, it’s “designed for IT professionals who want to cover multiple industry topics in one day”. That sounds like something I’d be interested in!
Essentially, it’s a sales pitch, or a series of sales pitches. I don’t know if I’m in the market for what they’re selling, but I’d like to go find out what is being offered. All the same, I like to keep my eyes on the horizon, because things have a habit of coming up quick on us in IT, and if we don’t familiarize ourselves with the likely technology of the next few years, then we’ll be caught with our pants down. So I wanted to see what people were selling.
The conference is free. All you have to do is fill out the application for registration. Unfortunately, I don’t qualify:
Dear Matt,
Thank you for your interest in Network World Live’s IT Roadmap Conference & Expo in New York.
Unfortunately, after reviewing the information that you submitted, we determined that at this time, we are not able to confirm your seat on a complimentary basis.
As we noted on the registration form, this event is geared towards network and IT professionals in end-user type companies who actively purchase products and services – or – who will be doing so in the near future. We have a limited number of complimentary seats reserved for attendees who meet this criteria.
…
snip
…
Walk-ins or ineligible applicants arriving at the conference facility will NOT be admitted on the day of the event.
Thank you,
IT Roadmap Team
Network World Events & Executive Forums
(emphasis theirs)
Well, I do actively purchase technologies and products, but not at the scale that they’re looking for, I suppose. I don’t have 50 data centers, or “20,000 or more” servers, so I don’t get to go to their party and look at the toys.
It’s unfortunate for them and me, but somehow I think I’ll live. I just wanted to give you a tangible example of…well…I won’t go so far as to say discrimination, but maybe exclusion, that we small admins deal with from vendors.
Posted in General
11 Comments »
March 8, 2010
Or as (ir)regular as they normally are. I really hope that you enjoyed the flashback week, and got something useful from it. I’m going to try to do it again next year on the first full week of March.
Now it’s just back to the daily grind for me. I’ve been rehashing some Nagios configuration and I’ve unearthed an ancient relic! How fun! Configuration archaeology is a hobby of mine, and to find a gem that hasn’t (as far as I can tell) been mentioned on the official site since 2002? That’s GREAT! I’ve still got to go through the source code to make sure that it doesn’t do anything interesting, but it’s out of my config now.
As it turns out, my recent attention to Nagios is multifaceted. I’m cleaning up the config and tightening up the alert rules, but also, I’m going to be giving a 45 minute talk at the Professional IT Community Conference in May. If you’re in the northeast US, you should definitely make it! And you should hurry and register while the early bird special is going!
Posted in General
3 Comments »
March 5, 2010
You are probably a human. At least, the statistical odds are in your favor. As a human, you experience stress, and how you react to it plays a large part in determining how happy you are. System administrators deal with stress particularly poorly, in general. We assume the role of hero and that’s that. Do what it takes, bask in whatever glory accompanies the successful completion of our task.
There is no downtime in that equation. Immediately following those emergencies, most of us drink depressants to bring ourselves down. On normal days, we require morning stimulants to bring ourselves up. I highly suspect that some of us are so-called “adrenaline junkies” from the relative high that we get when there’s an immediate problem that no one can solve but ourselves.
This is unhealthy.
What we really need is to be able to step back, look at the pattern in our lives, and say, “I don’t want to live with this stress.”
When it first hit me that stress is probably the biggest single microproblem for admins, I wrote the following. I hope you find it relevant.
Jack Hughes, over at the Tech Teapot, mentions a very appropriate subject for too many systems administrators:
burnout.
As sysadmins, we’re nearly always the go-to person for whatever happens. After a while, we start to get used to it, and lots of times, we can develop a hero complex, carrying the weight of the world on our shoulders, at least in our minds. This isn’t healthy for a lot of reasons, the most important of which is your health.
Here’s an example of what taking your job too seriously can do to you:
Part One
Part Two
Not to ruin the ending, but the most disgusting part is that, while the guy was taking medical leave, his company fired him. To be completely honest, he’s much better off without a company like that, and if your company would do the same thing, then so are you.
To quote Peter Gibbons, “We don’t have a lot of time on this earth. We weren’t meant to spend it this way. Human beings were not meant to sit in little cubicles staring at computer screens all day…”
Even one of the most preeminent Systems Administrators around, Tom Limoncelli, advocates leaving the pressure at work when you head home. For those of us on call 24/7/365, that can be a little hard, but it’s important to try.
Posted in General
12 Comments »
March 3, 2010
It’s the end of a long day. You lean back in your chair, sigh, and you’re glad it’s time to go home. Someone asks you what you did all day. You just sort of shake your head and say “fought fires”.
Fire fighting, as a sysadmin, means you don’t make any progress. You only work very hard to stay where you are. Working against entropy is difficult, and it can take a lot out of you. Some days are harder than others.
One day in early June, not long after I started this blog, I experienced a major setback. Also, a major power outage. Our entire backup facility lost power, and what’s worse, the generator refused to kick on. Our secondary site was down hard for days, until the power was restored to the downtown area of the village we were located in.
During the problem, though, we were able to turn a major issue into a net gain. Read on for the rest of the story…
It’s funny, sometimes, how we tolerate suboptimal or downright malproductive arrangements in our infrastructures, just because it’s inconvenient or inopportune to do it the “right way”. It seems like “the right way” either never comes, due to projects getting phased out, or it gets fixed during a cataclysmic upheaval, when it has become an immediate concern.
The case in point is my mail server. We have an A and a B MX record. Originally, the B MX just stored mail until A came back up, and then it would get delivered. Everyone checks mail on A, so it can’t really be down during the day. About 6 months ago, the office that B was at relocated, and B was never set up again. This left us with just A. To make matters worse, A was old enough that it was physically located in our backup site, which used to be our primary site. This was suboptimal. Of course there was talk about moving it to the primary site, but when could a maintenance window be created? And we’d risk the entire period of non-connectivity while it was being moved. No, management said, let’s just leave it where it is.
Great strategy. It actually worked fine though, until this weekend.
I came in on Saturday, ready to do some major work on the blade systems I’m building for our new site. I sat down at my desk, ready to dive into work. Since I was alone, Raiders of the Lost Ark was playing on the laptop. I had just logged into the first server when the lights went off, and the telltale screech and whine from the server room told me that we’d lost main power.
In Granville, OH, that’s not a strange thing. We’ve got backup AC and a backup generator, so I wasn’t worried. It does have to be manually started, so I jogged into the server room and turned on the CFL floor lamp. At least I tried to. I looked at the generator control panel and it confirmed my fears. No generator power.
I tried for several minutes to start it, but nothing gave me the impression that anything would change, so I called my boss to let him know the situation, and that I was going to start shutting down machines. Since the only critical thing was mail, I suggested that he change DNS to point to an as-yet unassigned IP at the colocation, and that I could set up a Postfix process there to queue the mail. He said that it would work, but he suggested an alternative approach.
Why not relocate the physical mail server to the colocation? A lightbulb went off. Of course, not only could I take care of that long standing problem, but because there was no power at all in the datacenter, the normal policy of no-downtime-for-repairs-and-upgrades was out the window.
The next morning, I left work to go home at 5am. The previous 15 hours had been spent completely overhauling the backup datacenter. With the mail relocated to the primary facility, once the power came on in the backup, I had free rein to cull everything unnecessary that had been accumulating.
There is now a pile of power, ethernet, and copper/fiber cables covering a square yard or so, around 6 inches deep. I took out something like 96 ports’ worth of switches, multiple servers, KVMs, fiber switches, and general cruft. The servers are also arranged so that no half-depth servers are hiding between full-depth ones. That was always a pet peeve of mine.
I thought about it while I was doing this, and if fighting normal issues is considered firefighting, then what I went through should have been considered forest fire fighting. And just like a forest fire, good can come from it. It takes the massive heat of a forest fire to crack open some pine cones. It also takes massive infrastructure downtime to make significant changes.
Posted in General
3 Comments »
March 2, 2010
This is a short bit that I wrote when I was considering overhauling the internal naming scheme at my company. We used to use an odd mishmash of names, and we used to have multiple invented internal DNS names that referred to the physical location. And I don’t mean things like “location.example.com” (that might make sense!). I mean it would be as if General Motors had “boston.gm” and “tijuana.gm” and “tokyo.gm”. Nonsensical in a lot of ways (particularly now that the TLDs can be bought for a song (well, an expensive song)).
Anyway, I was curious how other people did it, so I asked. As it turns out, this post originally aired in July of 2008. I would guess that I had a couple of hundred readers. That’s a good range of experience to draw from, but I wanted a broader view, so I submitted it to Slashdot. And it got on the front page.
Thanks to Slashdot, this entry originally received 43 comments, which is right around 30 more than the next most popular story at that point. I’ve had a lot of people tell me that they found me because of that front page article. I didn’t submit it to drive people to the blog; I really did want to hear what people were doing with their own networks. Driving people to the blog was a completely satisfactory side effect, though.
Before you leave this page, make sure to check out the original and read the comments. There are a lot of funny (and interesting) ideas!
Enjoy!
Bob Plankers, over at The Lone Sysadmin wrote a couple days ago about getting busted while reading the wiki page on X-Men. He tried to cover it up by claiming to be researching future host names. Quick thinking, Bob. Good job!
It does bring up a good point, though. Internal naming schemes are something that everyone has an opinion on, and a load of suggestions.
At various places, I’ve used Greek/Roman gods, Simpsons characters, beer companies, wine labels, and fish.
At my current company, we used the beer and wine names. We absorbed another company that used fish. It worked fine for a while, but we grew in terms of servers and locations until it got unwieldy to remember A) all the names, and B) what each name did. You’d also start to get very similar names after a while. We’ve now got 4 physical locations, soon to be 5, and something like 50-60 servers (not counting network devices). No one would be able to keep them all straight (including the admin).
To improve the situation, we’re in the process of changing to location-based hostnames with a flat internal domain structure. For example, the secondary application server in Ohio is oh-app2, with the fake internal domain name trailing. The alpha site’s primary fileserver is a-fs1.
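One nice thing about a scheme like this is that it can be checked mechanically. Here is a minimal Python sketch, assuming a site-role-number pattern. Only the site codes oh and a and the role codes app and fs come from the two examples above; the lookup tables, the regular expression, and the function names are made up for illustration and are not what we actually run.

```python
import re
from typing import NamedTuple

# Assumption: only "oh" (Ohio) and "a" (alpha site) appear in the post.
SITES = {"oh": "Ohio", "a": "alpha site"}
# Assumption: only "app" and "fs" appear in the post.
ROLES = {"app": "application server", "fs": "fileserver"}

# <site>-<role><number>, e.g. oh-app2 or a-fs1 (hypothetical pattern).
HOSTNAME_RE = re.compile(r"(?P<site>[a-z]+)-(?P<role>[a-z]+)(?P<index>[0-9]+)")


class Host(NamedTuple):
    site: str
    role: str
    index: int


def parse_hostname(name: str) -> Host:
    """Split a short hostname into site, role, and number, or raise ValueError."""
    match = HOSTNAME_RE.fullmatch(name.lower())
    if match is None:
        raise ValueError(f"{name!r} does not follow the site-role-number scheme")
    site, role = match["site"], match["role"]
    if site not in SITES:
        raise ValueError(f"unknown site code {site!r} in {name!r}")
    if role not in ROLES:
        raise ValueError(f"unknown role code {role!r} in {name!r}")
    return Host(site, role, int(match["index"]))


def describe(name: str) -> str:
    """Turn a hostname into a human-readable description."""
    host = parse_hostname(name)
    return f"{ROLES[host.role]} #{host.index} at {SITES[host.site]}"
```

Calling describe("oh-app2") tells you where the machine is and what it does, while an old-style name such as "wolverine" is rejected with a ValueError. That is the whole point of the change: the name itself carries the information.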
It’s nowhere near as fun as “wolverine.internal.com” but it certainly does tell you where you’re connecting to and what the machine does. What makes it interesting is when you go changing things like CVS repositories on people’s machines, mail servers, etc. The policy we’ve taken is to alias the old information to the new, and slowly phase out the old method.
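The aliasing step can be as simple as one CNAME record per retired name. Below is a rough Python sketch that prints zone-file style lines from an old-to-new mapping. The mapping, the internal.example domain, and the function name are all hypothetical; the real old names and the real internal domain are not shown here.

```python
# Hypothetical mapping of retired "fun" names to new location-based names.
ALIASES = {
    "wolverine": "oh-app2",
    "cyclops": "a-fs1",
}

# Placeholder for the fake internal domain; not the real one.
INTERNAL_DOMAIN = "internal.example"


def cname_records(aliases: dict, domain: str) -> list:
    """Return one zone-file CNAME line per old name, sorted by old name."""
    lines = []
    for old, new in sorted(aliases.items()):
        if new in aliases:
            # Keep it simple: an alias should point at a real host,
            # not at another alias.
            raise ValueError(f"{old} points at {new}, which is itself an alias")
        lines.append(f"{old}.{domain}. IN CNAME {new}.{domain}.")
    return lines


if __name__ == "__main__":
    for line in cname_records(ALIASES, INTERNAL_DOMAIN):
        print(line)
```

Keeping the mapping in one place also gives you a checklist for the phase-out: when nothing looks up an old name any more, its line can be deleted.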
What do you use as internal naming systems? What do you think would make an excellent scheme? Make sure to check the list to make sure it hasn’t been done before!
Posted in General
8 Comments »