General update and a long weekend ahead

After a week of wrestling with CDW and EMC, two weeks of fighting the storage array, and coming up with an ad hoc environment in something like 2 hours, I’ve had a rough go of this whole backup-site-activation thing.

The latest wrinkle has been that although EMC shipped us the storage processor, the burned CD with an updated FlareOS was corrupt. You would think, “oh, just download it from the website”. At least, that’s what I thought. But no. Apparently I’m not special enough to get into that section or something, so I called support, explained the situation, and they told me in their most understanding voice that I had to talk to my sales contact. /sigh

So I called my CDW rep, explained the situation, and he said that he’d get right on it! Excellent. That was, I believe, Tuesday, a bit after the 2nd blog entry. Well, yesterday at 4:30pm he told me that he finally talked to the right people at EMC, and that they’d ship the CD out to me so that I would have it this morning. My reaction might be described as “cautiously optimistic”.

Could it be that I finally get to install the 2nd storage processor? Maybe! If I do, it’s going to make for a long, long weekend. The EMC docs say that the installation itself takes 6 hours with the software install and reinitialization of the processors. If that last sentence sounds ominous to you, too, it really just means that the software on the controllers gets erased and reinstalled, not the data on the SAN. At least that’s what they tell me. I’m going to be extremely unhappy if that turns out not to be the case.

Tune in next week for the next exciting installment of “How can Matt be screwed by his own ignorance”!

The god of storage hates me, I know it

It seems like storage and I never get along. There’s always some difficulty somewhere. It’s always that I don’t have enough, or I don’t have enough where I need it, and there’s always the occasional sorry-we-sold-you-a-single-controller followed by I’ll-overnight-you-another-one which appears to be concluded by sorry-it-won’t-be-there-until-next-week. /sigh

So yes, looking back at my blog’s RSS feed, it was Wednesday of last week that I discovered the problem was the lack of a 2nd storage controller, and it was that same day that we ordered another controller. We asked for it to be overnighted. Apparently overnight is 6 days later, because it should come today. I mean, theoretically, it might not, but hey, I’m an optimist. Really.

Assuming that it does come today, I’m driving to Philadelphia to install it into the chassis. If it doesn’t come, I’m driving to Philadelphia to install another server into the rack, because we promised operations that they’d have a working environment by Wednesday, and then I’m going again whenever the part comes.

In almost-offtopic news, I am quickly becoming a proponent of the “skip a rack unit between equipment” school of rack management. You see, there are people like me who shove all of the equipment together so that they can maintain a chunk of extra free space in the rack in case something big comes along. Then there are people who say that airflow and heat dissipation are no good when the servers are like that, so they leave one rack unit between their equipment.

I’ve got blades, so skipping an RU wouldn’t do much for my heat dissipation, but my 2nd processor kit is coming with a 1U pair of battery backups for the storage array, and I REALLY wish that I hadn’t put the array on the bottom of the rack and left the nearest free space about 15 units above it. I’m going to have to do some rearranging, and I’m not sure what I can move yet.

Security is a process and not plug&play

I got a SANS pamphlet in the mail today, which makes me feel guilty. Not really guilty, as in “I should go but I’m not” (even though I should, and I’m not), but because in terms of IT security, I’ve sort of been in the “Oh, I’m sure that’ll be fine while I’m doing all of this other stuff” mode. It’s not a good mode to be in, but I don’t see any way to give IT security the attention it deserves when all (and I mean all) of my free time is spent building new infrastructure and stopping the existing infrastructure from falling apart. And if you don’t believe me,

$ ps aux | grep Eterm | wc -l

That’s not counting the VMs that are installing right now, or the VM diagram I’m using to keep track of which physical machine will be getting what virtual machine.

I cringe whenever I think about this phrase, but I don’t have enough time to worry about security. The automatic response to that (even from/to myself) is “do you have enough time to clean up a break-in?”. I’m not monitoring logs like I want, and I don’t even have enough time to set up a log monitoring system to do it for me. I’m hoping that in a few weeks things will relax and I can start putting emphasis where it should be, but it isn’t there right now. I really need more staff to give proper attention to security, various Oracle, Postgres, and MySQL databases, site buildouts, asset management, user support, and backups, but I don’t have it, so I find myself juggling all of those various tasks, and my stress level is directly related to how many balls are in the air at one time.

Looking through the SANS booklet, I see all kinds of classes that I’d love to take (the Network PenTest / Ethical Hacking class, for one) but I can’t even foresee enough free time to take the class, let alone utilize it.

Have any of you ever been to a SANS conference and received training? Was it worth it? How did you get to use it back at your job? Cheer me up and regale me with stories of success from conference training ;-)