April 27, 2011
2 years ago, I asked for advice on WAN Acceleration. Let it not be said that I never follow up on anything ;-)
Even with all of the happenings and goings-on, I'm still somehow doing actual work, and one of my projects is to evaluate some WAN optimizers for our NYC office. It seems likely that all of our employees will once again be in the same office before the end of the year, and even at the new site, 20-30 people browsing huge file shares, running huge SQL queries, and so on will eat up the available bandwidth without much effort.
To stem the flood of traffic, our CEO approved my suggestion that we look into WAN optimization strategies that would let us expand throughput without a matching growth in recurring bandwidth costs. I looked at Riverbed's Steelhead appliances, and they came in several versions of expensive, but the real deal breaker for me was that the appliances are licensed by the number of sessions supported at one time. As it turns out, in addition to network data deduplication, they also have application-specific enhancements.
What that last point means is that certain applications get protocol-specific enhancements to increase speed and responsiveness, so, say, CIFS file shares fly and Exchange mailboxes load really quickly. These TCP-level enhancements use things like larger window sizes, as well as upper-layer protocol mimicry (if that's the right word?), to leverage the fact that the clients are really making connections locally, rather than to the remote hosts they think they're talking to. By the way, thanks to Jwiz for setting me right in the comments: I had mistakenly believed that Riverbed didn't do compression for non-enhanced protocols, but that is not the case.
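To get a feel for why window size matters on a long link, here's a quick back-of-the-envelope sketch in Python. The link numbers are hypothetical, and this is just the textbook bandwidth-delay-product math, not anything specific to any vendor's implementation:

```python
import socket

# Bandwidth-delay product: the TCP window needed to keep a link full.
# Hypothetical WAN link: 10 Mbit/s with an 80 ms round-trip time.
bandwidth_bps = 10_000_000
rtt_s = 0.080
bdp_bytes = int(bandwidth_bps / 8 * rtt_s)  # 100,000 bytes in flight

# A classic 64 KB window caps throughput at window / RTT,
# no matter how fat the pipe is:
default_window = 64 * 1024
max_throughput_bps = default_window / rtt_s * 8  # ~6.55 Mbit/s

# Asking the OS for bigger buffers lets TCP advertise a larger window
# (the kernel may clamp or adjust the value it actually grants).
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, bdp_bytes)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, bdp_bytes)
sock.close()
```

So on that hypothetical link, an untuned connection leaves a third of the bandwidth on the table before deduplication even enters the picture.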
So, after asking around on Twitter, I got in touch with Silver Peak, who make several solutions in this space. Although they don't do the TCP-protocol tricks, I talked with their sales and engineering guys, and they agreed to send me some demo units so I could see whether they'd improve my networks. The units came yesterday, so today I spent an hour or two on the phone with the technician getting them configured. I'm hoping to have them in place before the end of next week.
The whole idea of WAN optimization is pretty cool. It's essentially network-based deduplication. You have a box at one end of a connection and a box on the other end of the connection, each functioning (in my case, anyway) as an invisible bridge between the switch and the gateway router. Here's an overview of how the two will function:
A user requests something from a server. The server fills the request and sends the data back to the user. Meanwhile, the WAN accelerators at each site are watching this happen and caching the data to their hard drives. When a second user requests the same data, the remote server receives the request as usual, but if the data is unchanged, the server-side machine sends a reference to it (or really, references to the byte streams that make up the file) instead of the whole file, the point being that the references are much smaller than the contents they refer to. In effect, the optimizers are talking on their own, saying "Hey, Bob wants X, Y, and Z bytes, but you've already got them. Send them from your side instead", which makes the data transfer the equivalent of two LAN transfers rather than a WAN transfer.
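The exchange above can be sketched in Python. This is a toy model with fixed-size chunks and made-up names (`Cache`, `encode`, `decode`); real appliances use content-defined chunking and work on byte streams below the file level, but the send-a-reference-instead-of-the-bytes idea is the same:

```python
import hashlib

CHUNK = 4096  # fixed-size chunks for simplicity; real boxes use
              # variable-size, content-defined chunking

class Cache:
    """One per side of the link: maps chunk digests to chunk bytes."""
    def __init__(self):
        self.store = {}

    def add(self, chunk):
        digest = hashlib.sha256(chunk).digest()
        self.store[digest] = chunk
        return digest

def encode(data, remote_seen):
    """Server-side box: send raw bytes the first time, references after."""
    out = []
    for i in range(0, len(data), CHUNK):
        chunk = data[i:i + CHUNK]
        digest = hashlib.sha256(chunk).digest()
        if digest in remote_seen:
            out.append(("ref", digest))   # 32 bytes instead of 4 KB
        else:
            out.append(("raw", chunk))
            remote_seen.add(digest)       # far side will have it now
    return out

def decode(stream, cache):
    """Client-side box: rebuild the data, caching any new chunks."""
    data = b""
    for kind, payload in stream:
        if kind == "ref":
            data += cache.store[payload]  # serve from the local cache
        else:
            cache.add(payload)
            data += payload
    return data
```

The first transfer goes over the wire in full; a repeat transfer of unchanged data is reduced to a list of 32-byte digests, which is what turns the second user's WAN fetch into something closer to a LAN copy.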
If you're unsure of why this is a good thing, make sure to read this Numbers Everyone Should Know post carefully.
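The short version of why round trips dominate: a LAN round trip is on the order of half a millisecond, while a long-haul WAN round trip is tens of milliseconds, and chatty protocols like CIFS make lots of small sequential requests. The arithmetic, with rough ballpark figures (not measurements from my network):

```python
# Fetching 1,000 small objects sequentially, one round trip each.
# Latency figures are ballpark: LAN ~0.5 ms RTT, long-haul WAN ~70 ms RTT.
requests = 1000
lan_rtt_s = 0.0005
wan_rtt_s = 0.070

lan_total_s = requests * lan_rtt_s   # 0.5 seconds
wan_total_s = requests * wan_rtt_s   # 70 seconds
```

Same data, same server, two orders of magnitude difference in wait time, which is why making a WAN transfer behave like two LAN transfers is such a big deal.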
As you can see, the WAN accelerators are in-line. When the engineers first described this to me, I was a bit concerned: I did not want a failed box to cause me downtime. One of them explained that the ethernet cards in the box are designed to fail open and revert to acting as a plain bridge. Very cool. I tested it, and sure enough, even without the power plugged in, traffic passes. There's an audible click when the power is brought up or down, because apparently there's a relay inside that triggers. In any event, it makes me feel better, even though I'm pretty sure we don't know 100% of the failure modes. I didn't have time to talk with the engineer about that, but I'll be asking more questions as I go through testing.
The one part of the experience that wasn't great was the server that came with the boxes and collates statistics and the like. It's loud. Loud, like, wow, I can't be in the same room as this, loud. It's a rebranded SuperMicro (at least, it's virtually identical to a SiliconMechanics machine that I bought a while back, and they use SuperMicro, to the best of my knowledge). Anyway, it's loud. Really loud. Unfortunately, the shared-space basement in that building isn't the ideal place to put a multi-thousand-dollar machine you have on loan, but it looks like that's where it's going to have to live for now. I'm not happy about it, but I didn't make the call. It should be fine until we actually buy it.
I'll make sure to write about what I find. I've already been playing with the statistics, and there's a ton of good information available from the machine, even if it isn't actually doing any compressing yet. I can't wait to see how it does when it actually starts improving our throughput.