IANA Finally Out of IPv4 Addresses

Some people thought it would never happen, but APNIC has received the two blocks that trigger IPv4's endgame. With the assignment of these two /8s, the five remaining blocks will be distributed, one to each of the five regional registries. From the article:

Please be aware, this will be the final allocation made by IANA under the current framework and will trigger the final distribution of five /8 blocks, one to each RIR under the agreed “Global policy for the allocation of the remaining IPv4 address space”.

After these final allocations, each RIR will continue to make allocations according to their own established policies.

APNIC, the Asia-Pacific regional registry, plans to dole these out over the next five years. They're also winding up with three /8 blocks (the two just assigned, plus the one they get under the allocation agreement). The other registries won't have as many addresses to give, so I'm certain theirs won't last as long.

Is it obvious yet?

Timing is Key

I don’t know if you’ve heard, but my area in New Jersey has gotten a lot of snow lately. Right now, I’m looking at the back parking lot, and there’s snow around two feet deep everywhere that isn’t plowed or covered by 12-foot plow droppings. Earlier this week, we were at work as the most recent snowfall started, and I was trying to encourage my junior admin to leave the office before it hit. “I don’t know, it looks alright,” he said, and I replied,

If you wait until it’s obvious, it’s too late

It sort of hit me. That last sentence has become my de facto motto when it comes to a lot of things. I think the first time it occurred to me was when I started researching IPv6. The depletion of IPv4 is no surprise, and hasn’t been for quite a while, but it seems like most people are holding off even researching IPv6 until it becomes obvious that they need it. Again, by that time, if you’re in any kind of competitive company vying for market position, it’ll be too late. It won’t be obvious that it’s necessary until you see people financially punished for not taking those steps.

Another thing I’ve noticed within my own company is that so much attention is concentrated on day-to-day operations that it’s rare for someone to step back and think, “is this really a sustainable practice?”. It’s getting better since we created a QA role, but still, the first thing I do after putting out a fire is to figure out why something wasn’t made flame-retardant a long time ago.

I’m going through practices that we have on paper but aren’t performing (like enforced data lifetimes), plus I’m implementing things that will help the environment scale further, such as sharding my data sets. Only through continuing to manage these things toward the direction I want them to go can I prevent future disasters. If I wait until it’s obvious that there’s a problem, then it’s too late.

I know that I’m not the only one. My friend Robin Harris published a column back in 2007 titled “Why RAID 5 stops working in 2009”, which explained that the rate of unrecoverable read errors (UREs) wasn’t decreasing at the same rate that disk capacity was increasing, and that the sheer size of your RAID array would soon all but guarantee hitting a URE when a drive fails. He was right; right now, no one would recommend RAID-5 for an array of any size.
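The arithmetic behind that prediction is easy to sketch. A minimal example, assuming the common spec-sheet URE rate of one error per 10^14 bits read for consumer drives (the drive sizes and array layout here are illustrative, not from Robin's column):

```python
# Sketch: odds of hitting an unrecoverable read error (URE) during a
# RAID-5 rebuild. Assumes a URE rate of 1 per 1e14 bits read, a common
# consumer-drive spec; all sizes below are illustrative.

def rebuild_ure_probability(drive_tb, surviving_drives, ure_rate=1e-14):
    """Probability of at least one URE while reading every surviving drive."""
    bits_to_read = drive_tb * 1e12 * 8 * surviving_drives
    # Compound the per-bit survival probability over the entire read
    return 1 - (1 - ure_rate) ** bits_to_read

# A 4-drive RAID-5 of 2 TB disks: a rebuild must read 6 TB flawlessly.
print(f"{rebuild_ure_probability(2, 3):.0%}")  # → 38%
```

A better-than-one-in-three chance of losing the array during a rebuild, and the number only climbs as drives get bigger while the URE rate stays put. That's the whole argument in one function.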

Last year, he published a follow-up article, “Why RAID-6 stops working in 2019”. He’s right again, if not sooner.

Our jobs are hard. They take up a lot of time, but the day to day requirements of user and machine maintenance are only part of the story. We need to keep an eye out for oncoming trucks, especially when no one else is doing that job for our company.

What kinds of things do you see looming? What have you noticed but others have missed? Share your thoughts in the comments, because you might open someone’s eyes to a problem they never knew was sneaking up on them.

Replacing my load balancers with cheaper solutions

Did I say cheaper? I really meant free, because cheap is too expensive…

I don’t have a big budget. In fact, if you want to get really technical, I don’t have a budget at all – I just sort of have to ask management for everything I want to buy. I’m really hoping that changes this year, but the point is that right now, it’s really hard for me to spend any money. This is inconvenient at times like the present, because I’ve got a critical piece of infrastructure at my backup site that has started a death spiral.

To give you some back story, the application that we use to provide our clients’ information to them is written in Java and delivered via tomcat. We’ve got multiple application (read: tomcat) servers, and we wanted to be able to provide high availability failover. Load balancing was secondary, due to the limited volume of traffic we get, but it was vital that the application be available.

The solution that we decided on about four years ago was a Kemp LoadMaster 1500. I’d link to the actual device, but it’s so old that it’s been EOL’d and doesn’t exist on their website anymore.

The original LM1500 we got served us well…so well, in fact, that we bought two more, which we installed into our production site, and moved the original to the secondary site. This is the situation that we’re currently in, and the original is now starting to die.

Of course, since it’s been years, the original is no longer under a service contract (I’ve recently been in discussions with management about whether it’s cheaper all the way around to purchase extended service contracts or replace infrastructure, but that’s another blog post). This means that the support team at Kemp is sympathetic, but ultimately unwilling to help, except for having a sales guy call me to sell me a new set of load balancers.

This has the advantage of putting me in a position to make choices. It’s unfortunate that all of my choices are unpleasant, though.

On one hand, I suppose that I could continue indefinitely in having the colocation staff trudge out to my rack and cold reboot the load balancer twice a day. Or I could beg management for a couple thousand dollars to replace this load balancer (then twice whatever that amount is in another year, when the production load balancers fail). Or I could find another solution.

Assuming you’re not insane or a masochist (Oh who am I kidding? You’re a sysadmin, you’re a masochist), you picked the last option, which is also the one I picked (despite my insanity AND my masochism), and I went looking for solutions.

Because I firmly believe that all of us are smarter than any of us, I turned to the hivemind for my answer and put out a Twitter message asking for load balancer suggestions.

I was actually kind of surprised at the sheer volume of responses I got. There were dozens of suggestions, but by far the most frequent was HAProxy. Their list of users is impressive, but more impressive was the fact that every time I searched for something related to it, the author, Willy Tarreau, was there commenting and offering advice. It’s really great to see someone so in tune with the users of his product.

After spending a bit of time educating myself on how load balancers work in general, and how HAProxy works specifically, I’ve got a running configuration with NGiNX terminating HTTPS and handing off to HAProxy, which balances between the application servers (and performs mail and FTP proxying as well). The configuration isn’t where I want it to be yet, but I’ll be playing with it in the coming days, and when it’s in closer-to-final form, I’ll post another entry with the configuration. After all of that is done, I’ll be moving on to an HA load balancer setup, most likely using keepalived.
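To give you a flavor of the arrangement before that follow-up post, here's a bare-bones haproxy.cfg in the shape I'm describing: NGiNX terminates HTTPS and proxies plain HTTP to a local HAProxy frontend, which health-checks and round-robins across the tomcat boxes. Every address, port, and server name below is made up for illustration, not my actual config:

```
global
    daemon
    maxconn 2048

defaults
    mode http
    timeout connect 5s
    timeout client  30s
    timeout server  30s

# NGiNX handles HTTPS and proxies decrypted HTTP traffic here
frontend tomcat_in
    bind 127.0.0.1:8080
    default_backend tomcat_apps

backend tomcat_apps
    balance roundrobin
    option httpchk GET /
    server app1 10.0.0.11:8080 check
    server app2 10.0.0.12:8080 check
```

The `check` keyword plus `option httpchk` is what buys you the failover: HAProxy polls each tomcat server and quietly drops any that stop answering, which was the whole point of the Kemp box in the first place.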

If you’re new to the whole idea of load balancers, you could do much worse than reading this document by Willy Tarreau about making applications scale with load balancers. It’s pretty enlightening if you haven’t spent much time in the arena. Also, because client access is only part of the picture, if you want to learn more about making your infrastructure go fast, I really, really recommend John Allspaw’s The Art of Capacity Planning, which is an amazingly great book not only because it’s full of very useful information, but because it makes you want to optimize your infrastructure.

Until next time, I’m back to the load balancer salt mines to produce a good working configuration…