Thread: Arguments Pro/Contra Software Raid
Hi, I've just had some discussion with colleagues regarding the usage of hardware or software RAID 1/10 for our Linux-based database servers. I myself can't see much reason to spend $500 on high end controller cards for a simple RAID 1. Any arguments pro or contra would be welcome. From my experience and what I've read here: + Hardware RAIDs might be a bit easier to manage, if you have never spent a few hours learning the software RAID tools. + There are situations in which software RAIDs are faster, as CPU power has advanced dramatically in recent years and even high end controller cards cannot keep up with that. + Using SATA drives is always a bit of a risk, as some drives lie about whether they are caching or not. + Using hardware controllers, the array becomes locked to a particular vendor. You can't switch controller vendors, as the array meta information is stored in a proprietary format. In case the RAID is broken to a level the controller can't recover automatically, this might complicate manual recovery by specialists. + Even battery backed controllers can't guarantee that data written to the drives is consistent after a power outage, nor that the drive does not corrupt something during the involuntary shutdown / power irregularities. (This is theoretical, as any server will be UPS backed.) -- Regards, Hannes Dorbath
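For concreteness, the software RAID tooling referred to above largely boils down to a handful of mdadm commands on a current Linux system. A minimal sketch of a mirrored (RAID 1) setup; the device names are hypothetical and the details vary by distribution:

    # create a two-disk RAID 1 array out of two partitions
    mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1
    # watch the initial sync and check array health at any time
    cat /proc/mdstat
    # fail out and replace a broken member later
    mdadm /dev/md0 --fail /dev/sdb1 --remove /dev/sdb1
    mdadm /dev/md0 --add /dev/sdc1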
Hi Hannes, Hannes Dorbath a écrit : > Hi, > > I've just had some discussion with colleagues regarding the usage of > hardware or software RAID 1/10 for our Linux-based database servers. > > I myself can't see much reason to spend $500 on high end controller > cards for a simple RAID 1. Naa, you can find ATA and/or SATA controllers for about EUR 30! > Any arguments pro or contra would be welcome. > > From my experience and what I've read here: > > + Hardware RAIDs might be a bit easier to manage, if you have never spent a > few hours learning the software RAID tools. I'd say the same (mostly because you still have to punch a command line for most of the controllers) > + There are situations in which software RAIDs are faster, as CPU power > has advanced dramatically in recent years and even high end controller > cards cannot keep up with that. Definitely NOT; however, if your server doesn't have a heavy load, the software overhead can't be noticed (essentially cache management and syncing). For dual-core CPUs, it might be true. > + Using SATA drives is always a bit of a risk, as some drives lie > about whether they are caching or not. ?? Do you intend to use your server without a UPS ?? > + Using hardware controllers, the array becomes locked to a particular > vendor. You can't switch controller vendors, as the array meta > information is stored in a proprietary format. In case the RAID is broken to a level > the controller can't recover automatically, this might complicate manual > recovery by specialists. ?? Do you intend not to make backups ?? > + Even battery backed controllers can't guarantee that data written to > the drives is consistent after a power outage, nor that the drive > does not corrupt something during the involuntary shutdown / power > irregularities. (This is theoretical, as any server will be UPS backed.) RAID's "laws": 1- RAID prevents you from losing data on healthy disks, not from faulty disks, 1b- So format and reformat your RAID disks (whether SCSI, ATA or SATA) several times, with destructive tests (see the "-c -c" option in the mke2fs man page) - it will ensure that the disks are safe, and it also serves as a kind of burn-in test (it might turn into... days of formatting!), 2- RAID doesn't protect you from power supply breakage or electricity breakdown, so use a (LARGE) UPS, 2b- a LARGE UPS because HDs are the components with the highest power consumption (a 700VA UPS gives me about 10-12 minutes on a machine with an XP2200+, 1GB RAM and a 40GB HD; however this falls to...... less than 25 seconds with seven HDs! all ATA), 2c- use a server box with redundant power supplies, 3- As for any sensitive data, make regular backups or you'll be a sitting duck. Some hardware controllers are able to avoid the loss of a disk if you turn out to have some faulty sectors (by relocating them internally); software RAID doesn't, as sectors *must* be at the same (linear) addresses. BUT a hardware controller is about EUR 2000 and an (ATA/SATA) 500GB HD is ~ EUR 350. That means you have to consider: * the server availability (time to change a power supply if there is no redundancy, time to exchange a non-hotswap HD... in fact, how much downtime you can "afford"), * the volume of the data (on which the size of the backup device depends), * the backup device you'll use (tape or other HDs), * the load on the server (and the number of simultaneous users => soft|hard, ATA/SATA|SCSI...), * the money you can spend on such a server, * and most important, the color of your boss' tie the day you take the decision. Hope it will help you Jean-Yves
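The destructive burn-in Jean-Yves recommends in law 1b is the badblocks pass built into mke2fs. A sketch; the device name is hypothetical, and the command destroys everything on it:

    # ext3 filesystem; giving -c twice runs the slow,
    # destructive read-write pattern test on every block
    mke2fs -j -c -c /dev/sdb1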
On 09.05.2006 12:10, Jean-Yves F. Barbier wrote: > Naa, you can find ATA and/or SATA controllers for about EUR 30! Sure, it's just that for my colleagues RAID controller = ICP Vortex, which resides in that price range. > For dual-core CPUs, it might be true I've got that from pgsql.performance for multi-way Opteron setups. > ?? Do you intend to use your server without a UPS ?? Sure there will be a UPS. I'm just trying to nail down the differences between soft- and hardware RAID, regardless of whether they matter in the end :) > ?? Do you intend not to make backups ?? Sure we do backups; this is all more hypothetical thinking... > Hope it will help you It has, thanks. -- Regards, Hannes Dorbath
Hannes Dorbath wrote: > Hi, > > I've just had some discussion with colleagues regarding the usage of > hardware or software RAID 1/10 for our Linux-based database servers. > > I myself can't see much reason to spend $500 on high end controller > cards for a simple RAID 1. > > Any arguments pro or contra would be welcome. > One pro and one con off the top of my head. Hotplug. Depending on your platform, SATA may or may not be hotpluggable (I know AHCI mode is the only one promising some kind of hotplug, which means ICH6+ and Silicon Image controllers, last I heard). SCSI isn't hotpluggable without the use of special hotplug backplanes and disks. You lose that in software RAID, which effectively means you need to shut the box down and do maintenance. Hassle. CPU. It's cheap. Much cheaper than your average hardware RAID card. For the 5-10% overhead usually imposed by software RAID, you can throw in a faster CPU and never even notice it. Most cases aren't CPU-bound anyway, or at least, most cases are I/O bound for the better part. This does raise the question of the I/O bandwidth your standard SATA or SCSI controller comes with, though. If you're careful about that and handle hotplug sufficiently, you're probably never going to notice you're not running on metal. Kind regards, -- Grega Bremec gregab at p0f dot net
On May 9, 2006, at 2:16 AM, Hannes Dorbath wrote: > Hi, > > I've just had some discussion with colleagues regarding the usage > of hardware or software RAID 1/10 for our Linux-based database > servers. > > I myself can't see much reason to spend $500 on high end controller > cards for a simple RAID 1. > > Any arguments pro or contra would be welcome. > > From my experience and what I've read here: > > + Hardware RAIDs might be a bit easier to manage, if you have never > spent a few hours learning the software RAID tools. > > + There are situations in which software RAIDs are faster, as CPU > power has advanced dramatically in recent years and even high end > controller cards cannot keep up with that. > > + Using SATA drives is always a bit of a risk, as some drives lie > about whether they are caching or not. Don't buy those drives. That's unrelated to whether you use hardware or software RAID. > > + Using hardware controllers, the array becomes locked to a > particular vendor. You can't switch controller vendors, as the array > meta information is stored in a proprietary format. In case the RAID is broken > to a level the controller can't recover automatically, this might > complicate manual recovery by specialists. Yes. Fortunately we're using the RAID for database work, rather than file storage, so we can use all the nice PostgreSQL features for backing up and replicating the data elsewhere, which avoids most of this issue. > > + Even battery backed controllers can't guarantee that data written > to the drives is consistent after a power outage, nor that the > drive does not corrupt something during the involuntary shutdown / > power irregularities. (This is theoretical, as any server will be > UPS backed.) fsync of the WAL log. If you have a battery backed writeback cache then you can get the reliability of fsyncing the WAL for every transaction, and the performance of not needing to hit the disk for every transaction. Also, if you're not doing that you'll need to dedicate a pair of spindles to the WAL log if you want to get good performance, so that there'll be no seeking on the WAL. With a writeback cache you can put the WAL on the same spindles as the database and not lose much, if anything, in the way of performance. If that saves you the cost of two additional spindles, and the space on your drive shelf for them, you've just paid for a reasonably priced RAID controller. Given those advantages... I can't imagine speccing a large system that didn't have a battery-backed write-back cache in it. My dev systems mostly use software RAID, if they use RAID at all. But my production boxes all use SATA RAID (and I tell my customers to use controllers with BB cache, whether it be SCSI or SATA). My usual workloads are write-heavy. If yours are read-heavy that will move the sweet spot around significantly, and I can easily imagine that for a read-heavy load software RAID might be a much better match. Cheers, Steve
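Steve's suggestion of dedicating a pair of spindles to the WAL is usually implemented by moving pg_xlog onto the dedicated array and symlinking it back into the data directory. A sketch, assuming a PostgreSQL 8.x data directory at /var/lib/pgsql/data and a dedicated mirror mounted at /wal (both paths hypothetical):

    pg_ctl stop -D /var/lib/pgsql/data
    mv /var/lib/pgsql/data/pg_xlog /wal/pg_xlog
    ln -s /wal/pg_xlog /var/lib/pgsql/data/pg_xlog
    pg_ctl start -D /var/lib/pgsql/data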
On Tue, May 09, 2006 at 12:10:32 +0200, "Jean-Yves F. Barbier" <7ukwn@free.fr> wrote: > Naa, you can find ATA and/or SATA controllers for about EUR 30! But those are the ones that you would generally be better off not using. > Definitely NOT; however, if your server doesn't have a heavy load, the > software overhead can't be noticed (essentially cache management and > syncing) It is fairly common for database machines to be IO, rather than CPU, bound and so the CPU impact of software RAID is low. > Some hardware controllers are able to avoid the loss of a disk if you turn > out to have some faulty sectors (by relocating them internally); software > RAID doesn't, as sectors *must* be at the same (linear) addresses. That is not true. Software RAID works just fine on drives that have internally remapped sectors.
On Tue, 2006-05-09 at 04:16, Hannes Dorbath wrote: > Hi, > > I've just had some discussion with colleagues regarding the usage of > hardware or software RAID 1/10 for our Linux-based database servers. > > I myself can't see much reason to spend $500 on high end controller > cards for a simple RAID 1. > > Any arguments pro or contra would be welcome. > > From my experience and what I've read here: > > + Hardware RAIDs might be a bit easier to manage, if you have never spent a > few hours learning the software RAID tools. Depends. Some hardware RAID cards aren't that easy to manage, and sometimes, they won't let you do some things that software will. I've run into situations where a RAID controller kicked out two perfectly good drives from a RAID 5 and would NOT accept them back. All data lost, and it would not be convinced to restart without formatting the drives first. Argh! With Linux kernel software RAID, I've had a similar problem pop up, and was able to make the RAID array take the drives back (see the mdadm sketch after this message). Of course, this means that software RAID relies on you not being stupid, because it will let you do things that are dangerous / stupid. I found the raidtools on Linux to be well thought out and fairly easy to use. > + There are situations in which software RAIDs are faster, as CPU power > has advanced dramatically in recent years and even high end controller > cards cannot keep up with that. The only times I've found software RAID to be faster was against the hybrid hardware / software type RAID cards (i.e. the cheapies) or OLDER RAID cards that have a 33 MHz coprocessor or such. Most modern RAID controllers have coprocessors running at several hundred MHz or more, and can compute parity and manage the array as fast as the attached I/O can handle it. The one thing a software RAID will never be able to match the hardware RAID controller on is battery backed cache. > + Using SATA drives is always a bit of a risk, as some drives lie > about whether they are caching or not. This is true whether you are using hardware RAID or not. Turning off drive caching seems to prevent the problem. However, with a RAID controller, the caching can then be moved to the BBU cache, while with software RAID no such option exists. Most SATA RAID controllers turn off the drive cache automagically, like the escalades seem to do. > + Using hardware controllers, the array becomes locked to a particular > vendor. You can't switch controller vendors, as the array meta > information is stored in a proprietary format. In case the RAID is broken to a level > the controller can't recover automatically, this might complicate manual > recovery by specialists. And not just a particular vendor, but likely a particular model and even firmware revision. For this reason, any 24/7 server should have two RAID controllers of the same brand running identical arrays, then have them set up as a mirror across the controllers, assuming you have controllers that can run cooperatively. This setup ensures that even if one of your RAID controllers fails, you then have a fully operational RAID array for as long as it takes to order and replace the bad controller. And having a third as a spare in a cabinet somewhere is cheap insurance as well. > + Even battery backed controllers can't guarantee that data written to > the drives is consistent after a power outage, nor that the drive > does not corrupt something during the involuntary shutdown / power > irregularities.
> (This is theoretical, as any server will be UPS backed.) This may be theoretically true, but all the battery backed cache units I've used have brought the array up clean every time the power has been lost to them. And a UPS is no insurance against loss of power. Cascading power failures are not uncommon when things go wrong. Now, here's my take on SW versus HW in general: HW is the way to go for situations where a battery backed cache is needed. Heavily written / updated databases are in this category. Software RAID is a perfect match for databases with a low write-to-read ratio, or where you won't be writing enough for the write performance to be a big issue. Many data warehouses fall into this category. In this case, a JBOD enclosure with a couple of dozen drives and software RAID gives you plenty of storage for chicken feed. If the data is all derived from outside sources, then you can turn on the write cache in the drives and turn off fsync, and it will be plenty fast, just not crash safe.
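For reference, the "take the drives back" operation Scott describes maps onto mdadm's forced assembly; a sketch with hypothetical device names. This is exactly the flexibility (and the foot-gun) discussed above: --force accepts stale superblocks that a hardware controller would refuse.

    # stop the wedged array, then force the old members back together
    mdadm --stop /dev/md0
    mdadm --assemble --force /dev/md0 /dev/sda1 /dev/sdb1
    # or re-add a member that was kicked out of a running array
    mdadm /dev/md0 --re-add /dev/sdb1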
> > Don't buy those drives. That's unrelated to whether you use hardware > or software RAID. Sorry that is an extremely misleading statement. SATA RAID is perfectly acceptable if you have a hardware raid controller with a battery backup controller. And dollar for dollar, SCSI will NOT be faster nor have the hard drive capacity that you will get with SATA. Sincerely, Joshua D. Drake -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/
On May 9, 2006, at 8:51 AM, Joshua D. Drake wrote: ("Using SATA drives is always a bit of a risk, as some drives lie about whether they are caching or not.") >> Don't buy those drives. That's unrelated to whether you use hardware >> or software RAID. > > Sorry that is an extremely misleading statement. SATA RAID is > perfectly acceptable if you have a hardware raid controller with a > battery backup controller. If the drive says it's hit the disk and it hasn't, then the RAID controller will have flushed the data from its cache (or flagged it as correctly written). At that point the only place the data is stored is in the non battery backed cache on the drive itself. If something fails then you'll have lost data. You're not suggesting that a hardware RAID controller will protect you against drives that lie about sync, are you? > > And dollar for dollar, SCSI will NOT be faster nor have the hard > drive capacity that you will get with SATA. Yup. That's why I use SATA RAID for all my databases. Cheers, Steve
On May 9, 2006, at 11:51 AM, Joshua D. Drake wrote: > Sorry that is an extremely misleading statement. SATA RAID is > perfectly acceptable if you have a hardware raid controller with a > battery backup controller. > > And dollar for dollar, SCSI will NOT be faster nor have the hard > drive capacity that you will get with SATA. Does this hold true still under heavy concurrent-write loads? I'm preparing yet another big DB server and if SATA is a better option, I'm all (elephant) ears.
Vivek Khera <vivek@khera.org> writes: > On May 9, 2006, at 11:51 AM, Joshua D. Drake wrote: > >> And dollar for dollar, SCSI will NOT be faster nor have the hard >> drive capacity that you will get with SATA. > > Does this hold true still under heavy concurrent-write loads? I'm > preparing yet another big DB server and if SATA is a better option, > I'm all (elephant) ears. Correct me if I'm wrong, but I've never heard of a 15kRPM SATA drive. -Doug
Vivek Khera wrote: > > On May 9, 2006, at 11:51 AM, Joshua D. Drake wrote: > >> Sorry that is an extremely misleading statement. SATA RAID is >> perfectly acceptable if you have a hardware raid controller with a >> battery backup controller. >> >> And dollar for dollar, SCSI will NOT be faster nor have the hard drive >> capacity that you will get with SATA. > > Does this hold true still under heavy concurrent-write loads? I'm > preparing yet another big DB server and if SATA is a better option, I'm > all (elephant) ears. I didn't say better :). If you can afford it, SCSI is the way to go. However SATA with a good controller (I am fond of the LSI 150 series) can provide some great performance. I have not used them, but have heard good things about Areca as well. Oh, and make sure they are SATA-II drives. Sincerely, Joshua D. Drake -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/
> You're not suggesting that a hardware RAID controller will protect > you against drives that lie about sync, are you? Of course not, but which SATA drives lie about sync? Or more specifically, which SATA-II drives? Sincerely, Joshua D. Drake -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/
On May 9, 2006, at 11:26 AM, Joshua D. Drake wrote: > >> You're not suggesting that a hardware RAID controller will protect >> you against drives that lie about sync, are you? > > Of course not, but which SATA drives lie about sync? Or > more specifically, which SATA-II drives? SATA-II, none that I'm aware of, but there's a long history of dodgy behaviour designed to pump up benchmark results down in the consumer drive space, and the low end consumer space is where a lot of SATA drives are. I wouldn't be surprised to see that behaviour there still. I was responding to the original poster's assertion that drives lying about sync were a reason not to buy SATA drives, by telling him not to buy drives that lie about sync. You seem to have read this as "don't buy SATA drives", which is not what I said and not what I meant. Cheers, Steve
Douglas McNaught wrote: > Vivek Khera <vivek@khera.org> writes: > >> On May 9, 2006, at 11:51 AM, Joshua D. Drake wrote: >> >>> And dollar for dollar, SCSI will NOT be faster nor have the hard >>> drive capacity that you will get with SATA. >> Does this hold true still under heavy concurrent-write loads? I'm >> preparing yet another big DB server and if SATA is a better option, >> I'm all (elephant) ears. > > Correct me if I'm wrong, but I've never heard of a 15kRPM SATA drive. Best I have seen is 10k but if I can put 4x the number of drives in the array at the same cost... I don't need 15k. Joshua D. Drake > > -Doug -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/
On Tue, 2006-05-09 at 12:52, Steve Atkins wrote: > On May 9, 2006, at 8:51 AM, Joshua D. Drake wrote: > > ("Using SATA drives is always a bit of a risk, as some drives lie > about whether they are caching or not.") > > >> Don't buy those drives. That's unrelated to whether you use hardware > >> or software RAID. > > > > Sorry that is an extremely misleading statement. SATA RAID is > > perfectly acceptable if you have a hardware raid controller with a > > battery backup controller. > > If the drive says it's hit the disk and it hasn't, then the RAID > controller > will have flushed the data from its cache (or flagged it as correctly > written). At that point the only place the data is stored is in the non > battery backed cache on the drive itself. If something fails then you'll > have lost data. > > You're not suggesting that a hardware RAID controller will protect > you against drives that lie about sync, are you? Actually, in the case of the Escalades at least, the answer is yes. Last year (maybe a bit more) someone was testing an IDE escalade controller with drives that were known to lie, and it passed the power plug pull test repeatedly. Apparently, the escalades tell the drives to turn off their cache. While most all IDEs and a fair number of SATA drives lie about cache fsyncing, they all seem to turn off the cache when you ask. And, since a hardware RAID controller with bbu cache has its own cache, it's not like it really needs the one on the drives anyway.
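What the Escalades do automatically can be done by hand on most ATA/SATA drives with hdparm; a sketch (the device name is hypothetical, and whether the drive actually honours the setting is exactly the "lying" question discussed above):

    # turn the on-drive write cache off (0) or back on (1)
    hdparm -W0 /dev/sda
    hdparm -W1 /dev/sda
    # recent hdparm versions report the current setting with -W alone
    hdparm -W /dev/sda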
Joshua D. Drake wrote: > Vivek Khera wrote: > > > > On May 9, 2006, at 11:51 AM, Joshua D. Drake wrote: > > > >> Sorry that is an extremely misleading statement. SATA RAID is > >> perfectly acceptable if you have a hardware raid controller with a > >> battery backup controller. > >> > >> And dollar for dollar, SCSI will NOT be faster nor have the hard drive > >> capacity that you will get with SATA. > > > > Does this hold true still under heavy concurrent-write loads? I'm > > preparing yet another big DB server and if SATA is a better option, I'm > > all (elephant) ears. > > I didn't say better :). If you can afford it, SCSI is the way to go. > However SATA with a good controller (I am fond of the LSI 150 series) > can provide some great performance. Basically, you can get away with cheaper hardware, but it usually doesn't have the reliability/performance of more expensive options. You want an in-depth comparison of how a server disk drive is internally better than a desktop drive: http://www.seagate.com/content/docs/pdf/whitepaper/D2c_More_than_Interface_ATA_vs_SCSI_042003.pdf -- Bruce Momjian http://candle.pha.pa.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Scott Marlowe wrote: > Actually, in the case of the Escalades at least, the answer is yes. > Last year (maybe a bit more) someone was testing an IDE escalade > controller with drives that were known to lie, and it passed the power > plug pull test repeatedly. Apparently, the escalades tell the drives to > turn off their cache. While most all IDEs and a fair number of SATA > drives lie about cache fsyncing, they all seem to turn off the cache > when you ask. > > And, since a hardware RAID controller with bbu cache has its own cache, > it's not like it really needs the one on the drives anyway. You do if the controller thinks the data is already on the drives and removes it from its cache. -- Bruce Momjian http://candle.pha.pa.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
On May 9, 2006, at 11:26 AM, Joshua D. Drake wrote: > Of course not, but which SATA drives lie about sync? Or > more specifically, which SATA-II drives? I don't know the answer to this question, but have you seen this tool? http://brad.livejournal.com/2116715.html It attempts to experimentally determine if, with your operating system version, controller, and hard disk, fsync() does as claimed. Of course, experimentation can't prove the system is correct, but it can sometimes prove the system is broken. I say it's worth running on any new model of disk, any new controller, or after the Linux kernel people rewrite everything (i.e. on every point release). I have to admit to hypocrisy, though... I'm running with systems that other people ordered and installed, I doubt they were this thorough, and I don't have identical hardware to run tests on. So no real way to do this. Regards, Scott -- Scott Lamb <http://www.slamb.org/>
Douglas McNaught <doug@mcnaught.org> writes: > Vivek Khera <vivek@khera.org> writes: > > > On May 9, 2006, at 11:51 AM, Joshua D. Drake wrote: > > > >> And dollar for dollar, SCSI will NOT be faster nor have the hard > >> drive capacity that you will get with SATA. > > > > Does this hold true still under heavy concurrent-write loads? I'm > > preparing yet another big DB server and if SATA is a better option, > > I'm all (elephant) ears. > > Correct me if I'm wrong, but I've never heard of a 15kRPM SATA drive. Well, dollar for dollar you would get the best performance from slower drives anyways since it would give you more spindles. 15kRPM drives are *expensive*. -- greg
Steve Atkins <steve@blighty.com> writes: > On May 9, 2006, at 2:16 AM, Hannes Dorbath wrote: > > > Hi, > > > > I've just had some discussion with colleagues regarding the usage of > > hardware or software RAID 1/10 for our Linux-based database servers. > > > > I myself can't see much reason to spend $500 on high end controller cards > > for a simple RAID 1. > > > > Any arguments pro or contra would be welcome. Really, most of what's said about software RAID vs hardware RAID online is just FUD. Unless you're running BIG servers with so many drives that the RAID controllers are the only feasible way to connect them up anyway, the actual performance difference will likely be negligible. The only two things that actually make me pause about software RAID in heavy production use are: 1) Battery backed cache. That's a huge win for the WAL drives on Postgres. 'nuff said. 2) Not all commodity controllers or IDE drivers can handle failing drives gracefully. While the software RAID might guarantee that you don't actually lose data, you still might have the machine wedge because of IDE errors on the bad drive. So as far as uptime goes, instead of added reliability all you've really added is another point of failure. On the data integrity front you'll still be better off. -- Greg
> 2b- a LARGE UPS because HDs are the components with the highest power > consumption (a 700VA UPS gives me about 10-12 minutes on a machine > with an XP2200+, 1GB RAM and a 40GB HD; however this falls to...... > less than 25 seconds with seven HDs! all ATA), I got my hands on a (free) 1400 VA APC rackmount UPS; the batteries were dead so I stuck two car batteries in. It can power my computer (Athlon 64, 7 drives) for more than 2 hours... It looks ugly though. I wouldn't put this in a server rack, but for my home PC it's perfect. It has saved my work many times... Hard disks draw about 15 watts each, but pull large current spikes on seeking, so the VA rating of the UPS is important. I guess in your case, the batteries have enough charge left, but the current capability of the UPS is exceeded. > Some hardware controllers are able to avoid the loss of a disk if you turn > out to have some faulty sectors (by relocating them internally); software > RAID doesn't, as sectors *must* be at the same (linear) addresses. Hard disks do transparent remapping now... Linux soft RAID can rewrite bad sectors with good data and the disk will remap the faulty sector to a good one.
Greg Stark <gsstark@mit.edu> writes: > Douglas McNaught <doug@mcnaught.org> writes: >> Correct me if I'm wrong, but I've never heard of a 15kRPM SATA drive. > > Well, dollar for dollar you would get the best performance from slower drives > anyways since it would give you more spindles. 15kRPM drives are *expensive*. Depends on your power, heat and rack space budget too... If you need max performance out of a given rack space (rather than max density), SCSI is still the way to go. I'll definitely agree that SATA is becoming much more of a player in the server storage market, though. -Doug
* Hannes Dorbath: > + Hardware RAIDs might be a bit easier to manage, if you have never spent a > few hours learning the software RAID tools. I disagree. RAID management is complicated, and once there is a disk failure, all kinds of oddities can occur which can make it quite a challenge to get back a non-degraded array. With some RAID controllers, monitoring is difficult because they do not use the system's logging mechanism for reporting. In some cases, it is not possible to monitor the health status of individual disks. > + Using SATA drives is always a bit of a risk, as some drives lie > about whether they are caching or not. You can usually switch off caching. > + Using hardware controllers, the array becomes locked to a particular > vendor. You can't switch controller vendors, as the array meta > information is stored in a proprietary format. In case the RAID is broken to a > level the controller can't recover automatically, this might complicate > manual recovery by specialists. It's even more difficult these days. 3ware controllers enable drive passwords, so you can't access the drive from other controllers at all (even if you could interpret the on-disk data). > + Even battery backed controllers can't guarantee that data written to > the drives is consistent after a power outage, nor that the drive > does not corrupt something during the involuntary shutdown / power > irregularities. (This is theoretical, as any server will be UPS backed.) UPS failures are not unheard of. 8-/ Apart from that, you can address a large class of shutdown failures if you replay a log stored in the BBU on the next reboot (partial sector writes come to mind). It is very difficult to check if the controller does this correctly, though. A few other things to note: You can't achieve significant port density with non-RAID controllers, at least with SATA. You need to buy a RAID controller anyway. You can't quite achieve what a BBU does (even if you've got a small, fast persistent storage device) because there's no host software support for such a configuration.
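On the monitoring point: one thing Linux software RAID does well is expose array health through ordinary channels, so it can be wired into standard logging and mail. A sketch (the mail address is a placeholder):

    # current state of all md arrays, or one array in detail
    cat /proc/mdstat
    mdadm --detail /dev/md0
    # daemon mode: mail on degraded arrays, failed disks, etc.
    mdadm --monitor --scan --daemonise --mail root@localhost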
Hi, Scott & all, Scott Lamb wrote: > I don't know the answer to this question, but have you seen this tool? > > http://brad.livejournal.com/2116715.html We had a simpler tool in-house, which wrote a file byte-for-byte, and called fsync() after every byte. If the number of fsyncs/min is higher than the rotations-per-minute value of your disks, they must be lying. It does not find as many liars as the script above, but it is less intrusive (it can be run on every low-I/O machine without crashing it), and it found some liars in-house (some notebook disks, one external USB/FireWire-to-IDE case, and an older Linux cryptoloop implementation, IIRC). If you're interested, I can dig for the C source... HTH, Markus -- Markus Schaber | Logical Tracking&Tracing International AG Dipl. Inf. | Software Development GIS Fight against software patents in EU! www.ffii.org www.nosoftwarepatents.org
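Until the C source turns up, the same class of test can be approximated from the shell with dd's synchronous writes (GNU coreutils; the test file name is arbitrary). This is only a rough stand-in for Markus's byte-for-byte tool, but the arithmetic is the same: if the computed writes per second are far above the disk's rotations per second (RPM/60, i.e. about 120 for a 7200 RPM drive), something between the program and the platter is caching.

    # 1000 x 512-byte writes, each one synced to disk before the next
    time dd if=/dev/zero of=synctest.dat bs=512 count=1000 oflag=dsync
    # writes/second = 1000 / elapsed seconds; compare against RPM/60
    rm synctest.dat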
Markus Schaber wrote: > Hi, Scott & all, > > Scott Lamb wrote: > > > I don't know the answer to this question, but have you seen this tool? > > > > http://brad.livejournal.com/2116715.html > > We had a simpler tool in-house, which wrote a file byte-for-byte, and > called fsync() after every byte. > > If the number of fsyncs/min is higher than the rotations-per-minute > value of your disks, they must be lying. > > It does not find as many liars as the script above, but it is less Why does it find fewer liars? --------------------------------------------------------------------------- > intrusive (it can be run on every low-I/O machine without crashing it), and > it found some liars in-house (some notebook disks, one external > USB/FireWire-to-IDE case, and an older Linux cryptoloop implementation, > IIRC). > > If you're interested, I can dig for the C source... > > HTH, > Markus > > -- > Markus Schaber | Logical Tracking&Tracing International AG > Dipl. Inf. | Software Development GIS > > Fight against software patents in EU! www.ffii.org www.nosoftwarepatents.org -- Bruce Momjian http://candle.pha.pa.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Hi, Bruce, Bruce Momjian wrote: >> It does not find as many liars as the script above, but it is less > > Why does it find fewer liars? It won't find liars that have a small "lie-queue-length": their internal buffers get full, so they have to block. After a small burst at the start, which usually hides in other latencies, they don't get more throughput than spindle turns. It won't find liars that first acknowledge to the host, and then immediately write the block before accepting other commands. This improves latency (which is measured in some benchmarks), but not the syncs/write rate. Both of them can be captured by the other script, but not by my tool. HTH, Markus -- Markus Schaber | Logical Tracking&Tracing International AG Dipl. Inf. | Software Development GIS Fight against software patents in EU! www.ffii.org www.nosoftwarepatents.org
On Tue, 2006-05-09 at 20:02, Bruce Momjian wrote: > Scott Marlowe wrote: > > Actually, in the case of the Escalades at least, the answer is yes. > > Last year (maybe a bit more) someone was testing an IDE escalade > > controller with drives that were known to lie, and it passed the power > > plug pull test repeatedly. Apparently, the escalades tell the drives to > > turn off their cache. While most all IDEs and a fair number of SATA > > drives lie about cache fsyncing, they all seem to turn off the cache > > when you ask. > > > > And, since a hardware RAID controller with bbu cache has its own cache, > > it's not like it really needs the one on the drives anyway. > > You do if the controller thinks the data is already on the drives and > removes it from its cache. Bruce, re-read what I wrote. The escalades tell the drives to TURN OFF THEIR OWN CACHE.
Scott Marlowe <smarlowe@g2switchworks.com> writes: > On Tue, 2006-05-09 at 20:02, Bruce Momjian wrote: >> You do if the controller thinks the data is already on the drives and >> removes it from its cache. > > Bruce, re-read what I wrote. The escalades tell the drives to TURN OFF > THEIR OWN CACHE. Some ATA drives would lie about that too IIRC. Hopefully they've stopped doing it in the SATA era. -Doug
Hi, Bruce, Markus Schaber wrote: >>> It does not find as many liars as the script above, but it is less >> Why does it find fewer liars? > It won't find liars that have a small "lie-queue-length": their > internal buffers get full, so they have to block. After a small burst at > the start, which usually hides in other latencies, they don't get more > throughput than spindle turns. I just reread my mail, and must admit that I would not understand what I wrote above, so I'll explain a little more: My test program writes byte-for-byte. Let's say our FS/OS has 4k page and block sizes; that means 4096 writes that all hit the same disk block. Intelligent liars will see that the 2nd and all further writes obsolete the former writes which still reside in the internal cache, and drop those former writes from the cache, effectively going up to 4k writes/spindle turn. Dumb liars will keep the obsolete writes in the write cache / queue, and so won't be caught by my program. (Note that I have no proof that such disks actually exist, but I have enough experience with hardware that I won't be surprised.) HTH, Markus -- Markus Schaber | Logical Tracking&Tracing International AG Dipl. Inf. | Software Development GIS Fight against software patents in EU! www.ffii.org www.nosoftwarepatents.org
On Wed, 2006-05-10 at 09:51, Douglas McNaught wrote: > Scott Marlowe <smarlowe@g2switchworks.com> writes: > > > On Tue, 2006-05-09 at 20:02, Bruce Momjian wrote: > > >> You do if the controller thinks the data is already on the drives and > >> removes it from its cache. > > > > Bruce, re-read what I wrote. The escalades tell the drives to TURN OFF > > THEIR OWN CACHE. > > Some ATA drives would lie about that too IIRC. Hopefully they've > stopped doing it in the SATA era. Ugh. Now that would make for a particularly awful bit of firmware implementation. I'd think that if I found a SATA drive doing that I'd be likely to strike the manufacturer off of the list for possible future purchases...
On May 9, 2006, at 11:26 AM, Joshua D. Drake wrote: > Of course not, but which SATA drives lie about sync? Or more > specifically, which SATA-II drives? With older Linux drivers (before spring 2005, I think) - all of them - since the Linux kernel didn't support the write barriers needed to force the sync. It's not clear to me how much of the SATA data loss is due to this driver issue and how much is due to buggy drives themselves. According to Jeff Garzik (the guy who wrote the SATA drivers for Linux) [1]: "You need a vaguely recent 2.6.x kernel to support fsync(2) and fdatasync(2) flushing your disk's write cache. Previous 2.4.x and 2.6.x kernels would only flush the write cache upon reboot, or if you used a custom app to issue the 'flush cache' command directly to your disk. Very recent 2.6.x kernels include write barrier support, which flushes the write cache when the ext3 journal gets flushed to disk. If your kernel doesn't flush the write cache, then obviously there is a window where you can lose data. Welcome to the world of write-back caching, circa 1990. If you are stuck without a kernel that issues the FLUSH CACHE (IDE) or SYNCHRONIZE CACHE (SCSI) command, it is trivial to write a userspace utility that issues the command. Jeff, the Linux SATA driver guy" I've wondered for a while if this driver issue is actually the source of most of the fear around SATA drives. Note it appears that with those old kernels you aren't that safe with SCSI either. [1] in May 2005, http://hardware.slashdot.org/comments.pl?sid=149349&cid=12519114
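For what it's worth, the "trivial userspace utility" Garzik mentions exists in standard tools today; a sketch (device names hypothetical, and support varies by drive, kernel and tool version):

    # issue FLUSH CACHE to an ATA/SATA drive (recent hdparm versions)
    hdparm -F /dev/sda
    # issue SYNCHRONIZE CACHE to a SCSI drive (sg_sync, from sg3_utils)
    sg_sync /dev/sda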
On Tue, May 09, 2006 at 08:59:55PM -0400, Bruce Momjian wrote: > Joshua D. Drake wrote: > > Vivek Khera wrote: > > > > > > On May 9, 2006, at 11:51 AM, Joshua D. Drake wrote: > > > > > >> Sorry that is an extremely misleading statement. SATA RAID is > > >> perfectly acceptable if you have a hardware raid controller with a > > >> battery backup controller. > > >> > > >> And dollar for dollar, SCSI will NOT be faster nor have the hard drive > > >> capacity that you will get with SATA. > > > > > > Does this hold true still under heavy concurrent-write loads? I'm > > > preparing yet another big DB server and if SATA is a better option, I'm > > > all (elephant) ears. > > > > I didn't say better :). If you can afford it, SCSI is the way to go. > > However SATA with a good controller (I am fond of the LSI 150 series) > > can provide some great performance. > > Basically, you can get away with cheaper hardware, but it usually > doesn't have the reliability/performance of more expensive options. > > You want an in-depth comparison of how a server disk drive is internally > better than a desktop drive: > > http://www.seagate.com/content/docs/pdf/whitepaper/D2c_More_than_Interface_ATA_vs_SCSI_042003.pdf BTW, someone (Western Digital?) is now offering SATA drives that carry the same MTBF/warranty/what-not as their SCSI drives. I can't remember if they actually claim that it's the same mechanisms, just with a different controller on the drive... -- Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461
On Tue, May 09, 2006 at 12:10:32PM +0200, Jean-Yves F. Barbier wrote: > > I myself can't see much reason to spend $500 on high end controller > > cards for a simple RAID 1. > > Naa, you can find ATA and/or SATA controllers for about EUR 30! And you're likely getting what you paid for: crap. Such a controller is less likely to do things like turn off write caching so that fsync works properly. > > + Hardware RAIDs might be a bit easier to manage, if you have never spent a > > few hours learning the software RAID tools. > > I'd say the same (mostly because you still have to punch a command line for > most of the controllers) Controllers I've seen have some kind of easy to understand GUI, at least during bootup. When it comes to OS-level tools that's going to vary widely. > > + There are situations in which software RAIDs are faster, as CPU power > > has advanced dramatically in recent years and even high end controller > > cards cannot keep up with that. > > Definitely NOT; however, if your server doesn't have a heavy load, the > software overhead can't be noticed (essentially cache management and > syncing) > > For dual-core CPUs, it might be true Depends. RAID performance depends on a heck of a lot more than just CPU. Software RAID allows you to do things like spread load across multiple controllers, so you can scale a lot higher for less money. Though in this case I doubt that's a consideration, so what's more important is making sure the controller bus isn't in the way. One thing that means is ensuring that every SATA drive has its own dedicated controller, since a lot of SATA hardware can't handle multiple commands on the bus at once. > > + Using SATA drives is always a bit of a risk, as some drives lie > > about whether they are caching or not. > > ?? Do you intend to use your server without a UPS ?? Have you never heard of someone tripping over a plug? Or a power supply failing? Or the OS crashing? If fsync is properly obeyed, PostgreSQL will gracefully recover from all of those situations. If it's not, you're at risk of losing the whole database. > > + Using hardware controllers, the array becomes locked to a particular > > vendor. You can't switch controller vendors, as the array meta > > information is stored in a proprietary format. In case the RAID is broken to a level > > the controller can't recover automatically, this might complicate manual > > recovery by specialists. > > ?? Do you intend not to make backups ?? Even with backups this is still a valid concern, since the backup will be nowhere near as up-to-date as the database was unless you have a pretty low DML rate. > BUT a hardware controller is about EUR 2000 and an (ATA/SATA) 500GB HD > is ~ EUR 350. Huh? You can get 3ware controllers for about $500, and they're pretty decent. While I'm sure there are controllers for $2k, that doesn't mean there's nothing in between that and nothing. -- Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461
>> You want an in-depth comparison of how a server disk drive is internally >> better than a desktop drive: >> >> http://www.seagate.com/content/docs/pdf/whitepaper/D2c_More_than_Interface_ATA_vs_SCSI_042003.pdf > > BTW, someone (Western Digital?) is now offering SATA drives that carry > the same MTBF/warranty/what-not as their SCSI drives. I can't remember > if they actually claim that it's the same mechanisms, just with a > different controller on the drive... Well, Western Digital and Seagate both carry 5 year warranties. Seagate I believe does on almost all of their products. With WD you have to pick the right drive. Joshua D. Drake -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/
On Thu, May 11, 2006 at 03:38:31PM -0700, Joshua D. Drake wrote: > >>You want an in-depth comparison of how a server disk drive is internally >>better than a desktop drive: >> >> http://www.seagate.com/content/docs/pdf/whitepaper/D2c_More_than_Interface_ATA_vs_SCSI_042003.pdf > >BTW, someone (Western Digital?) is now offering SATA drives that carry >the same MTBF/warranty/what-not as their SCSI drives. I can't remember >if they actually claim that it's the same mechanisms, just with a >different controller on the drive... > > Well, Western Digital and Seagate both carry 5 year warranties. Seagate I > believe does on almost all of their products. With WD you have to pick the > right drive. I know that someone recently made a big PR push about how you could get 'server reliability' in some of their SATA drives, but maybe now everyone's starting to do it. I suspect the premium you can charge for it offsets the costs, provided that you switch all your production over rather than trying to segregate production lines. -- Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461
Joshua D. Drake wrote: > > >> You want an in-depth comparison of how a server disk drive is internally > >> better than a desktop drive: > >> > >> http://www.seagate.com/content/docs/pdf/whitepaper/D2c_More_than_Interface_ATA_vs_SCSI_042003.pdf > > > > BTW, someone (Western Digital?) is now offering SATA drives that carry > > the same MTBF/warranty/what-not as their SCSI drives. I can't remember > > if they actually claim that it's the same mechanisms just with a > > different controller on the drive... > > Well western digital and Seagate both carry 5 year warranties. Seagate I > believe does on almost all of there products. WD you have to pick the > right drive. That's nice, but it seems similar to my Toshiba laptop drive experience --- it breaks, we replace it. I would rather not have to replace it. :-) Let me mention the only drive that has ever failed without warning was a SCSI Deskstar (deathstar) drive, which was a hybrid because it was a SCSI drive, but made for consumer use. -- Bruce Momjian http://candle.pha.pa.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
>> Well, Western Digital and Seagate both carry 5 year warranties. Seagate I >> believe does on almost all of their products. With WD you have to pick the >> right drive. > > That's nice, but it seems similar to my Toshiba laptop drive experience > --- it breaks, we replace it. I would rather not have to replace it. :-) Laptop drives are known to have short lifespans due to heat. I have IDE drives that have been running for four years without any issues, but I have good fans blowing over them. Frankly, I think if you are running drives (in a production environment) for more than 3 years you're crazy anyway :) > > Let me mention that the only drive that has ever failed without warning was a > SCSI Deskstar (deathstar) drive, which was a hybrid because it was a > SCSI drive, but made for consumer use. > -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/
Joshua D. Drake wrote: > > >> Well western digital and Seagate both carry 5 year warranties. Seagate I > >> believe does on almost all of there products. WD you have to pick the > >> right drive. > > > > That's nice, but it seems similar to my Toshiba laptop drive experience > > --- it breaks, we replace it. I would rather not have to replace it. :-) > > Laptop drives are known to have short lifespans do to heat. I have IDE > drives that have been running for four years without any issues but I > have good fans blowing over them. > > Frankly I think if you are running drivess (in a production environment) > for more then 3 years your crazy anyway :) Agreed --- the cost/benefit of keeping a drive >3 years just doesn't make sense. -- Bruce Momjian http://candle.pha.pa.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
On Thu, May 11, 2006 at 07:20:27PM -0400, Bruce Momjian wrote: > Joshua D. Drake wrote: > > > > >> You want an in-depth comparison of how a server disk drive is internally > > >> better than a desktop drive: > > >> > > >> http://www.seagate.com/content/docs/pdf/whitepaper/D2c_More_than_Interface_ATA_vs_SCSI_042003.pdf > > > > > > BTW, someone (Western Digital?) is now offering SATA drives that carry > > > the same MTBF/warranty/what-not as their SCSI drives. I can't remember > > > if they actually claim that it's the same mechanisms, just with a > > > different controller on the drive... > > > > Well, Western Digital and Seagate both carry 5 year warranties. Seagate I > > believe does on almost all of their products. With WD you have to pick the > > right drive. > > That's nice, but it seems similar to my Toshiba laptop drive experience > --- it breaks, we replace it. I would rather not have to replace it. :-) > > Let me mention that the only drive that has ever failed without warning was a > SCSI Deskstar (deathstar) drive, which was a hybrid because it was a > SCSI drive, but made for consumer use. My damn powerbook drive recently failed with very little warning, other than I did notice that disk activity seemed to be getting a bit slower. IIRC it didn't log any errors or anything. Even if it did, if the OS was catching them I'd hope it would pop up a warning or something. But from what I've heard, some drives now-a-days will silently remap dead sectors without telling the OS anything, which is great until you've used up all of the spare sectors and there's nowhere to remap to. :( Hmm... I should figure out how to have OS X email me daily log updates like FreeBSD does... -- Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461
Jim C. Nasby wrote: > On Thu, May 11, 2006 at 07:20:27PM -0400, Bruce Momjian wrote: > > Joshua D. Drake wrote: > > > > > > >> You want an in-depth comparison of how a server disk drive is internally > > > >> better than a desktop drive: > > > >> > > > >> http://www.seagate.com/content/docs/pdf/whitepaper/D2c_More_than_Interface_ATA_vs_SCSI_042003.pdf > > > > > > > > BTW, someone (Western Digital?) is now offering SATA drives that carry > > > > the same MTBF/warranty/what-not as their SCSI drives. I can't remember > > > > if they actually claim that it's the same mechanisms, just with a > > > > different controller on the drive... > > > > > > Well, Western Digital and Seagate both carry 5 year warranties. Seagate I > > > believe does on almost all of their products. With WD you have to pick the > > > right drive. > > > > That's nice, but it seems similar to my Toshiba laptop drive experience > > --- it breaks, we replace it. I would rather not have to replace it. :-) > > > > Let me mention that the only drive that has ever failed without warning was a > > SCSI Deskstar (deathstar) drive, which was a hybrid because it was a > > SCSI drive, but made for consumer use. > > My damn powerbook drive recently failed with very little warning, other > than I did notice that disk activity seemed to be getting a bit slower. > IIRC it didn't log any errors or anything. Even if it did, if the OS was > catching them I'd hope it would pop up a warning or something. But from > what I've heard, some drives now-a-days will silently remap dead sectors > without telling the OS anything, which is great until you've used up all > of the spare sectors and there's nowhere to remap to. :( Yes, I think most IDE drives do silently remap, and most SCSI drives don't. Not sure how much _most_ is. I know my SCSI controller beeps at me when I try to access a bad block. Now, that gets my attention. -- Bruce Momjian http://candle.pha.pa.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
> Hmm... I should figure out how to have OS X email me daily log updates > like FreeBSD does... Logwatch. -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/
On Thu, May 11, 2006 at 18:41:25 -0500, "Jim C. Nasby" <jnasby@pervasive.com> wrote: > On Thu, May 11, 2006 at 07:20:27PM -0400, Bruce Momjian wrote: > > My damn powerbook drive recently failed with very little warning, other > than I did notice that disk activity seemed to be getting a bit slower. > IIRC it didn't log any errors or anything. Even if it did, if the OS was > catching them I'd hope it would pop up a warning or something. But from > what I've heard, some drives now-a-days will silently remap dead sectors > without telling the OS anything, which is great until you've used up all > of the spare sectors and there's nowhere to remap to. :( You might look into smartmontools. One part of this is a daemon that runs selftests on the disks on a regular basis. You can have warnings mailed to you on various conditions. Drives will fail the self test before they run out of spare sectors. There are other drive characteristics that can be used to tell if drive failure is imminent and give you a chance to replace a drive before it fails.
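A minimal smartmontools setup along the lines Bruno describes; the device name and mail address are placeholders:

    # one-off: overall health verdict plus the attribute and error logs
    smartctl -H /dev/sda
    smartctl -a /dev/sda
    # kick off an extended self-test in the background
    smartctl -t long /dev/sda
    # /etc/smartd.conf: monitor everything, run a long self-test every
    # Sunday at 02:00, and mail warnings to the given address
    /dev/sda -a -s L/../../7/02 -m root@localhost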
> My damn powerbook drive recently failed with very little warning It seems to me that S.M.A.R.T. reporting is a crock of shit. I've had ATA drives report everything OK while clearly in the final throes of death, just minutes before total failure. -- Scott Ribe scott_ribe@killerbytes.com http://www.killerbytes.com/ (303) 722-0567 voice
Scott Ribe <scott_ribe@killerbytes.com> writes: >> My damn powerbook drive recently failed with very little warning > It seems to me that S.M.A.R.T. reporting is a crock of shit. I've had ATA > drives report everything OK while clearly in the final throes of death, just > minutes before total failure. FWIW, I replaced a powerbook's drive about two weeks ago myself, and its SMART reporting didn't show a darn thing wrong either. Fortunately, the drive started acting noticeably weird (long pauses seemingly trying to recalibrate itself) while still working well enough that I was able to get everything copied off it. I didn't wait for it to fail completely ;-) regards, tom lane
At 11:53 AM 5/12/2006 -0400, Tom Lane wrote: >Scott Ribe <scott_ribe@killerbytes.com> writes: > >> My damn powerbook drive recently failed with very little warning > > > It seems to me that S.M.A.R.T. reporting is a crock of shit. I've had ATA > > drives report everything OK while clearly in the final throes of death, > > just minutes before total failure. > >FWIW, I replaced a powerbook's drive about two weeks ago myself, and its >SMART reporting didn't show a darn thing wrong either. Fortunately, the >drive started acting noticeably weird (long pauses seemingly trying to >recalibrate itself) while still working well enough that I was able to >get everything copied off it. I didn't wait for it to fail completely ;-) Strange. With long pauses, usually you'd see stuff like "crc" errors in the logs, and you'd get some info from the SMART monitoring stuff. I guess a lot of it depends on the drive model and manufacturer. SMART reporting is better than nothing, and it's actually not too bad. It's just a question of whether manufacturers implement it in useful ways or not. I wouldn't trust the drive or manufacturer's judgement on when failure is imminent - the drive usually gathers statistics etc., and these are typically readable with the SMART monitoring/reporting software, so you should check those stats and decide for yourself when failure is imminent. For example: I'd suggest regarding any non-cable-related CRC errors or seek failures as "drive replacement time" - even if the drive or manufacturer thinks you need to have tons in a row for "failure imminent". I recommend "blacklisting" drives which don't notice anything before it is too late, e.g. even if it starts taking a long time to read a block, it reports no differences in the SMART stats. Link.