…but not really to bury it, either.
Edit: Just as a warning, I have ignored many aspects of RAID-level choice here, and concentrated on the likelihood of catastrophic failure. There are lots of reasons not to select RAID-5 for a new array, but a comprehensive review of all aspects of the technology is way beyond the scope of a single blog entry. Take this entry for what it is: an explanation of a particular failure phenomenon, and what makes it more likely or not. Don’t base your decision solely on this article; let it be what it was meant to be, one piece of the puzzle.
RAID-5 is misunderstood by too many people.
It’s cyclical, to be sure. We started off with no one knowing what RAID was, then we went to everyone knowing what RAID was (or at least, being familiar enough with it to mistake it for being a backup). RAID-5 was sweet, because it had a parity stripe, and if you had a bunch of drives shoved together, it could tolerate the loss of a drive. How awesome was that?
Then Robin Harris went and wrote Why RAID-5 Stops Working in 2009. It was an excellent article, and it was exactly right for the purposes that Robin meant it. It’s been a good way to point out to people that maybe they shouldn’t have RAID-5 configurations with 40TB worth of the cheapest SATA drives known to man.
But people started equating RAID-5 with being broken or bad. Or always wrong. And it isn’t. You can’t just take the shortcut and say that a technology like that is absolutely improper because it breaks down under certain circumstances. So I’m going to try to give you my perspective on this by recycling something I put together for an individual who had questions about when to use RAID-5 and when not to. I hope this helps clarify things.
Starting with the basics…hard disk drives (as opposed to flash-based SSDs) operate by having small “heads” move back and forth across spinning platters (usually made of glass or aluminum). Think of it like a record player arm, where the needle is the head.
The head can read or write, depending on its instructions. Each hard drive has multiple platters, with a head for each platter surface. Good so far?
OK, so the drive writes data and reads data…but it’s reading incredibly small magnetic data from the platters. You know how sometimes the magnetic cards in your wallet get erased from being around magnets? Those bits are HUGE compared to the ones and zeros on a hard drive platter.
The bits in the drive are not stored in your wallet, of course; they’re inside a metal case which is inside your computer. But occasionally strange things happen: maybe something bumped the disk while the head was reading or writing, gouging the surface (remember, the platters are spinning at 5,400 RPM at the very least!), or maybe dust got into the drive and is blocking a sector, or maybe cosmic rays flipped a bit on the platter (really!).
Whatever it is, something stops the head from being able to read the data on the disk. This is called an Unrecoverable Read Error (or URE for short).
The likelihood of encountering a URE depends on things like the build quality of the head and the rotational speed, but mostly on how densely the bits are packed onto the platter (remember how your magnetic stripe has big bits? Bigger bits are harder to accidentally corrupt).
Manufacturers often figure out how likely a URE is for a particular drive, and they usually put it in the documentation for that drive. You can find it if you go to Pricewatch and pick a hard drive at random, then look up the product manual for that part number. Here’s an example of the Western Digital Green Line, or maybe you’d prefer a Seagate Barracuda.
Anyway, get the product manual, and look in the specs for something like “Non-recoverable read errors per bits read”, or “Non-recoverable read errors”. What this gives you is the likelihood of your encountering a URE.
Both of those examples have “1 per 10^14 bits read”. If we use the handy-dandy Google calculator, you can see that it really means one URE per 11.3 terabytes.
So, putting all of this together…if you have a 3TB drive and you fill it to capacity, you could probably read from it over 3 times completely before you’d encounter a URE, and lose the data that was held by that particular sector.
That’s kind of unsettling, right? 3 full reads on a 3TB drive, then a high likelihood of going kaput on the 4th read-through?
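If you want to check that conversion yourself, here’s a quick sketch of the arithmetic, assuming the common consumer-drive spec of 1 URE per 10^14 bits (and using binary terabytes, which is what the Google calculator reports):

```python
# Convert a drive's URE spec into "expected data read per URE".
# Assumes the typical consumer-drive rate of 1 URE per 10^14 bits read.
ure_bits = 10**14                     # 1 unrecoverable read error per this many bits
bytes_per_ure = ure_bits / 8          # bits -> bytes
tb_per_ure = bytes_per_ure / 2**40    # bytes -> binary terabytes

print(f"~{tb_per_ure:.1f} TB of reads per expected URE")  # ~11.4 TB
```

That’s where the “one URE per 11.3-and-change terabytes” figure comes from.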
Fortunately, we use RAID levels which can protect us.
Let’s start with RAID-1, which is mirrored disks. We are reading along happily when suddenly we get a URE on one of the drives! That would be sad, but on the other drive, we have an exact copy of the data as it was originally written! The RAID controller reads the data from the other drive, re-writes it on the drive that had the URE, and we continue merrily along.
Suppose we lose a RAID-1 drive entirely. That leaves us with one good copy, but assuming the array was full, when we put a replacement drive into the array, we’ve got to re-read all of the data on the good drive. Keeping in mind that we’re likely to experience a failure after reading 11.3TB, what is the statistical probability that we’ll experience a URE while reading 3TB? Around 1 in 4.
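That “1 in 4” is just the ratio of data read to the URE interval. A sketch of the same back-of-envelope math, assuming a 1-per-10^14-bits URE rate and a decimal 3 TB drive (so the number lands slightly under the article’s 3/11.3 figure):

```python
# Rough odds of hitting a URE while re-reading one full 3 TB drive
# during a RAID-1 rebuild, using a simple linear approximation.
ure_rate = 1 / 10**14                  # probability of a URE per bit read
bits_to_read = 3 * 10**12 * 8          # one full 3 TB (decimal) drive, in bits

linear_odds = bits_to_read * ure_rate  # expected UREs during the rebuild
print(f"roughly {linear_odds:.0%} chance of a URE")  # roughly 24%
```

Treating each bit as an independent trial gives a slightly lower number, but for ballpark purposes the linear ratio is close enough.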
Now, let’s move on to RAID-5. You need at least 3 drives in a RAID-5 because, unlike the exact copy in RAID-1, RAID-5 stores parity: any individual piece of data can be lost, and instead of recovering it by copying, the array recalculates it by examining the remaining data and parity bits.
So when we encounter a URE during normal RAID operations, the array calculates what the missing data was, the data is re-written so we’ll have it next time, and the array carries on business as usual. But when a drive dies, we have to replace it, and that’s when things get hairy.
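A toy illustration of that recalculation: RAID-5 parity is the XOR of the data blocks in a stripe, so any single missing block can be rebuilt by XOR-ing together everything that survives. The byte values here are made up for the demo:

```python
# RAID-5 parity in miniature: parity is the XOR of the data blocks,
# so any one missing block can be recomputed from the survivors.
d1, d2, d3 = 0b10110010, 0b01101100, 0b11100001
parity = d1 ^ d2 ^ d3          # what gets written to the parity stripe

# The drive holding d2 dies -- rebuild its block from the rest:
recovered = d1 ^ d3 ^ parity
assert recovered == d2
print(f"recovered block: {recovered:08b}")  # prints 01101100
```

This is also why a rebuild touches every surviving drive: recomputing any one block requires all of the others in its stripe.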
In order to rebuild the array, the new drive needs to be populated, and to do that, the entire contents of the remaining drives need to be read so the missing data can be recalculated from parity. Assuming we have a RAID-5 array of three 3TB disks, we’re now reading 6 terabytes of information. What is the statistical likelihood of encountering a URE?
1 in 2. A coin-flip.
Have a RAID-5 array with four 3TB disks? Now the rebuild reads 9TB against an 11.3TB expectation, so a failure is very nearly certain. You can see how quickly this goes downhill.
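The pattern is easy to tabulate: a rebuild reads (drives − 1) full disks, so the odds scale with array size. A sketch using the same linear approximation as before, assuming decimal 3 TB drives and a 1-per-10^14-bit URE rate (the exact percentages differ a little from the article’s, which divides by the binary 11.3TB figure):

```python
# Rough odds of a URE during a RAID-5 rebuild, by drive count.
# Every surviving drive is read end to end to repopulate the new one.
ure_rate = 1 / 10**14                # probability of a URE per bit read
drive_bits = 3 * 10**12 * 8          # one 3 TB (decimal) drive, in bits

for drives in (3, 4, 5):
    bits_read = (drives - 1) * drive_bits
    odds = min(1.0, bits_read * ure_rate)   # cap the linear estimate at 100%
    print(f"{drives} drives: ~{odds:.0%} chance of a URE during rebuild")
```

Three drives lands near the coin flip; five drives is essentially a guaranteed rebuild failure on paper.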
Now, a lot of people see this, freak out, and say “oh my god, I’m never using RAID-5 again! RAID-5 is the devil! It’s EVIL!”, but remember what is driving the numbers…it’s the URE rate.
Check out this Hitachi Ultrastar. Its URE rate is 1 per 10^16 bits read, which works out to a kind-of-amazing 1.1 petabytes…so the odds of your Ultrastar-based RAID-5 array dying during a rebuild because of a URE are very low.
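Running the same rebuild math with that enterprise-class rate shows just how much difference two orders of magnitude makes (same assumptions as above: decimal 3 TB drives, linear approximation):

```python
# Same RAID-5 rebuild estimate, but with an enterprise-class
# 1-per-10^16-bit URE rate instead of the consumer 1-per-10^14.
ure_rate = 1 / 10**16
bits_read = 2 * 3 * 10**12 * 8   # rebuilding a 3-drive array of 3 TB disks

odds = bits_read * ure_rate
print(f"~{odds:.2%} chance of a URE during rebuild")  # ~0.48%
```

From a coin flip down to half a percent, just by changing the drives.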
So you can’t vilify RAID-5 across the board. It’s very much a matter of what you’re using it for, what the quality of the drives involved are, and what the capacity of the array is (or really, how much data is stored on the array).
Does that help you understand? Have questions, comments, or suggestions? Please leave them below, thanks!