July 11, 2011
I think I ruffled some feathers yesterday with a Twitter post, where I put up the following message:
I got a lot of replies from people who were understandably curious why I would say that, so I'm writing this blog post as a justification. Hopefully by the end of it, you'll see why I'm concerned.
To begin with, let's make sure we're all on the same page.
In the dark ages...
Linus Torvalds began developing the Linux kernel in 1991, inspired by Minix. As you may know, and contrary to popular usage, "Linux" refers to the kernel, not to the operating system as a whole. The kernel acts as an interface between the hardware and the "user land" software, and also performs memory management and several other important, but low-level, functions.
The "operating system" that uses Linux has almost always primarily consisted of core UNIX-like utilities provided by GNU, as well as a widely-differing array of other software, depending on who was deciding what to put in.
Development of the Linux kernel (and, in most cases, of the other freely available software packaged alongside it) was carried out almost entirely by volunteers whose only payment for their time was having more free software available to use.
As time went on, several arrangements of these operating systems became more widely used. These individual arrangements of software were called "Linux distributions", since the underlying kernel was always Linux; what varied was the user-land software being distributed (as well as certain configuration choices, such as where particular files and directories were stored).
The most widely used distributions were Slackware, Debian, and Red Hat. You probably recognize these because they're still in active development, albeit with certain modifications in some cases.
As all of the software was Free and Open Source, the distributions themselves didn't cost anything either, although there were businesses which sold Linux distributions on media, since in the mid-1990s home bandwidth was usually limited to 50 Kb/s or so, and downloading CD images was prohibitively time-consuming.
Beginnings of Support
In August of 1995, Red Hat announced the release of Red Hat Commercial Linux 1.1, which cost $39.95 and included support with installation (although additional support was available, presumably for an additional cost). This was something of a novel idea in the Linux world, and a good number of Open Source advocates were not happy with the idea of paying money for free software.
In any event, this opened the doors for wider use of Linux in corporations which required software to have vendor support. This isn't to say that Linux usage exploded because of this move, however it allowed expansion into a new niche, and served as the thin end of a wedge for Red Hat to charge for Linux distributions in exchange for support.
The Modern World
Flash forward to today, and there are several commercially available Linux distributions, many of which are based on Red Hat and Debian. Red Hat has a hugely successful Linux release called Red Hat Enterprise Linux (hereafter referred to as RHEL). This is the distribution that I'm primarily concerned with at the moment, because it is essentially the standard on which the company I worked for based its infrastructure.
RHEL provides a stable environment with corporate support. New releases and software updates are thoroughly tested, and reliability is paramount. It is also not cheap. That being said, the few times I had to call support, they solved my issues quickly and without any data loss.
If you are a corporation and are capable of paying for enterprise support (and you have a business case for it), then you could do worse than buying Red Hat Enterprise Linux. That was the reason I initially bought several copies for my infrastructure. As it turned out, though, the support costs wore on us over time, and we began to investigate other options.
It's hard to switch out of an existing ecosystem, especially after spending any time in it. You start to create things with the assumption that the environment is a certain way (this was before I learned about the environment abstraction possible with tools like Puppet and CFengine). I had already been through one such migration, from Slackware to RHEL, and I was not eager to do it again.
Imagine my elation when I found out that there was a freely-available clone of RHEL, and that I could essentially swap out one installation for another without changing much (if anything) in my scripts! Yes, of course, I wouldn't have support, but we'd determined that there was a business reason for not paying the rates RH asked, so I wasn't concerned about that (and at this point, I'd been a Linux jockey for about 10 years, so I was fairly confident).
That was how I felt when I found CentOS. It is essentially a binary-compatible release of RHEL. Because the software comprising RHEL is Free / Open Source Software (FOSS) under the GPL (mostly, I believe), any changes that Red Hat makes must be distributed as source as well. So the makers of CentOS take that source code, remove the Red Hat logos and other trademarked material, then repackage it as the Community ENTerprise Operating System.
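The rebuild process can be pictured with a grossly simplified sketch: take the published sources, swap out the vendor branding, and rebuild. The `sed` substitutions below are purely illustrative (the strings and the `debrand` helper are my invention); the real CentOS debranding effort touches many packages, spec files, and artwork.

```shell
#!/bin/sh
# Grossly simplified sketch of the "debranding" step a RHEL rebuild
# performs. A real rebuild patches dozens of packages and artwork
# files; this only shows the idea of swapping vendor strings.

debrand() {
    sed -e 's/Red Hat Enterprise Linux/Community ENTerprise OS/g' \
        -e 's/Red Hat, Inc\./The CentOS Project/g'
}

echo 'Vendor: Red Hat, Inc.' | debrand
# prints: Vendor: The CentOS Project
```

After a pass like this, the sources are rebuilt into packages that are (ideally) binary-compatible with the upstream release.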
The idea is great, and I jumped in fully. When I left my company, almost every server was running CentOS. The installation is rock-solid, and I never had a situation where I even wished I had support.
There are certain things that, with an enterprise infrastructure, you need to count on. The one that I've got to call into question right now is timeliness.
Software development is a process, and it takes time. When software is updated, it gets released by the authors, and at that point, it is a candidate for inclusion in a distribution. Only the really bleeding-edge distributions include software as soon as it's released by the authors, though. Most wait a period of time for bug reports to come in and fixes to be released. Because RHEL is a commercial distribution with an emphasis on stability, it doesn't usually include new software releases as updates at all. Instead, it typically only accepts newer minor versions of the already-stable software that the particular version of the distribution shipped with. And each of those is tested for potential problems.
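That update policy can be sketched as a tiny version check: accept a candidate update only if it stays on the same major.minor branch the distribution originally shipped. The `same_branch` helper and the version numbers are hypothetical, just to make the idea concrete.

```shell
#!/bin/sh
# Hypothetical sketch of the "minor updates only" policy: a candidate
# update is acceptable only if it shares the major.minor prefix of the
# version the distribution originally shipped.

same_branch() {
    # $1 = shipped version, $2 = candidate version
    shipped=$(echo "$1" | cut -d. -f1-2)
    candidate=$(echo "$2" | cut -d. -f1-2)
    [ "$shipped" = "$candidate" ]
}

same_branch 2.2.3 2.2.15 && echo "accept"   # stays on the 2.2 branch
same_branch 2.2.3 2.4.1  || echo "reject"   # jumps to a new branch
```

Real policy decisions are made by humans reviewing changelogs, of course, but the effect is roughly this: patch-level fixes flow in, new feature branches do not.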
CentOS, by its very nature, is always playing catch-up. This isn't really a big problem, in and of itself, because in an enterprise infrastructure, it's not a good idea to apply updates willy-nilly anyway (even if they HAVE been tested by the authors, bug testers, users, and distribution publishers). If you test updates in a lab environment, then roll out updates according to a schedule, your odds of encountering a show-stopping error are heavily decreased.
Unfortunately, there are certain conditions where having up-to-date software isn't just a nicety, but is required. Some of those conditions are when the software you want to use is dependent upon a version of software that your distribution provider hasn't released, or when a security hole is discovered and is patched by the author of the software in question.
In those cases, CentOS puts us in a bit of a hard place, due to the very nature of their product. It's a valid argument that if you absolutely need security updates, then paying for commercial support makes sense, but there are a lot of us on the "leave it" side of the "lump it or leave it" argument - mostly because our companies CAN'T afford to pay the support costs and still be profitable.
For that reason, we are at the mercy of our distribution provider to be timely with software releases, and rather than spend money that we don't have on software updates, we have to be particular about the sources of the software that we install on our machines.
I do appreciate CentOS and all of the people who put work into it. They are doing a service for the community and should be lauded for their efforts. That being said, as system administrators, we need to be pragmatic and make decisions with the best interests of our infrastructures in mind, not our ideals.
Here's a graph showing the time delay (in days) from when Red Hat releases a version upgrade to when the CentOS release happens:
Clearly, there has recently been a staggering delay with the 6.0 release, but even excluding that, the trend in delays is unmistakably upward. I'm not involved with the CentOS project, so I don't know what's causing this. There are a lot of factors, I'm sure, not to mention that Red Hat has not made it easier on them, despite assurances that CentOS wasn't being targeted by those actions.
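For what it's worth, the delay for any given release is easy to compute yourself. The snippet below uses GNU `date` (the `-d` option is a GNU extension) with the 6.0 dates as I understand them: RHEL 6.0 went GA on 2010-11-10, and CentOS 6.0 arrived on 2011-07-10.

```shell
#!/bin/sh
# Compute the delay (in days) between an upstream RHEL release and
# the corresponding CentOS release. Requires GNU date for -d.

delay_days() {
    upstream=$(date -u -d "$1" +%s)
    rebuild=$(date -u -d "$2" +%s)
    echo $(( (rebuild - upstream) / 86400 ))
}

# RHEL 6.0 GA vs. CentOS 6.0 release:
delay_days 2010-11-10 2011-07-10
# prints: 242
```

Eight months is a long time to run a point-zero enterprise release without the vendor's errata stream.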
On the mostly-non-technical side of things, there have also been some wrinkles in CentOS. At one point in time, the leading team member went MIA. Granted, things eventually settled down, and to their credit, there hasn't been any news of that sort since then (at least, that I'm aware of).
When I look at the 10,000ft view, I do have misgivings. I wish there were greater transparency into the process. I wish I knew if they needed more resources (and if so, which ones). I guess it's frustrating to me because I don't know, and I feel like I need to know.
At the very least, when any product displays issues like CentOS has had, you owe it to yourself and your users to examine the alternatives. Fortunately, there are some.
The leading competitor to CentOS is Scientific Linux, a recompilation of RHEL put together by CERN, Fermilab, and other labs around the world. I have not experimented with it much yet, but I have the ISO and I'm only waiting on enough uninterrupted time to get a feel for it. Before you play with it, you might take a look at the Scientific Linux Customizations that they've made.
I checked the delay in days that SL has been subject to, and compared them to CentOS, and got the following graph:
As you can see, both have had delays as of late. Again, I wish I knew if this were just the inherent complexity in the task (which has to be nontrivial), or if there were other problems.
There are others, as well. The wiki page RHEL Derivatives lists a good number, but I've got no experience with them (and have heard next to nothing about many of them).
In the end, we have three options: we can pay for RHEL, we can switch to something else and hope it proves stable, or we can stay with CentOS and hope it stays stable. I don't know what the right answer is, honestly. If I could see into the future, I'd be doing something much more lucrative than system administration! If I can leave you with one word of advice, it would be to construct your infrastructure such that the underlying operating system is as unimportant as possible. It won't be possible to insulate yourself completely, of course, because even with the best configuration management solution you still need to update patched software. But with an abstracted infrastructure, you can at least migrate away from a particular vendor if or when they prove themselves to be unreliable.
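One cheap form of that abstraction, well short of a real Puppet or CFengine deployment, is to key your scripts off a distribution *family* rather than a specific vendor string. This is just a sketch; the `family_of` helper and the release strings are examples of my own, not anyone's actual API.

```shell
#!/bin/sh
# Sketch of keeping admin scripts distro-agnostic: map whatever the
# release file says to a generic package family, and branch on that
# instead of hard-coding a vendor name everywhere.

family_of() {
    case "$1" in
        *"Red Hat"*|*CentOS*|*"Scientific Linux"*) echo rpm ;;
        *Debian*|*Ubuntu*)                         echo deb ;;
        *)                                         echo unknown ;;
    esac
}

family_of "CentOS release 5.6 (Final)"   # prints: rpm
family_of "Debian GNU/Linux 6.0"         # prints: deb
```

In practice you would feed this the contents of `/etc/redhat-release` or similar, and the rest of your scripts would only ever test the family. Swapping CentOS for Scientific Linux (or RHEL) then changes nothing downstream.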
What is your take on this? Where do you see the future of free Enterprise Linux? Are you making contingencies? If so, please let us know!
You may also find the following article interesting reading: "The Rise and Fall of CentOS". I hate to ruin the ending, but the ship sinks and Dag Wieers resigns from the devel mailing list (which I did not hear about until just now), though his repo still supports CentOS, so at least those of us with machines using it won't be left out in the cold.