Re: Intel SSDs that may not suck - Mailing list pgsql-performance
From | Greg Smith |
---|---|
Subject | Re: Intel SSDs that may not suck |
Date | |
Msg-id | 4D9222B9.9020109@2ndQuadrant.com |
In response to | Re: Intel SSDs that may not suck (Yeb Havinga <yebhavinga@gmail.com>) |
List | pgsql-performance |
On 03/29/2011 06:34 AM, Yeb Havinga wrote:
> While I appreciate the heads up about these new drives, your posting
> suggests (though you formulated in a way that you do not actually say
> it) that OCZ products do not have a long term reliability. No factual
> data. If you have knowledge of sandforce based OCZ drives fail, that'd
> be interesting because that's the product line what the new Intel SSD
> ought to be compared with.

I didn't want to say anything too strong until I got to the bottom of the reports I'd been sorting through. It turns out that there is a very wide incompatibility between OCZ drives and some popular Gigabyte motherboards: http://www.ocztechnologyforum.com/forum/showthread.php?76177-do-you-own-a-Gigabyte-motherboard-and-have-the-SMART-error-with-FW1.11...look-inside

(I'm typing this message on a system with one of the impacted combinations, one reason why I don't own a Vertex 2 Pro yet. That I would have to run a "Beta BIOS" does not inspire confidence.)

What happens on the models impacted is that you can't get SMART data from the drive. That means no monitoring for the sort of expected failures we all know can happen with any drive. So far that looks to be at the bottom of all the anecdotal failure reports I'd found: the drives may have been throwing bad sectors or some other early failure, and the owners had no idea because they thought SMART would warn them--but it wasn't working at all. Thus, they don't find out there's a problem until the drive just dies altogether one day.

More popular doesn't always mean more reliable, but for stuff like this it helps. Intel ships so many more drives than OCZ that I'd be shocked if Gigabyte themselves didn't have reference samples of them for testing. This really looks like more of a warning about why you should be particularly aggressive with checking SMART when running recently introduced drives, which it sounds like you are already doing.
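As a rough illustration of the kind of SMART check described above, here is a minimal Python sketch. It assumes smartmontools is installed and that the drive reports reallocated sectors as attribute ID 5 in the `smartctl -A` table; some drives use a different ID or name, so treat that as an assumption to verify per model. The function names and status strings are made up for this example. The point it tries to capture is that "no SMART data at all" is reported as a problem in its own right, not silently treated as healthy.

```python
import subprocess
from typing import Optional


def read_attribute_table(device: str) -> str:
    """Return the text of `smartctl -A <device>`, or "" if smartctl cannot be run."""
    try:
        result = subprocess.run(
            ["smartctl", "-A", device],
            capture_output=True, text=True, check=False,
        )
    except FileNotFoundError:
        return ""  # smartmontools not installed
    return result.stdout


def reallocated_sectors(attribute_table: str) -> Optional[int]:
    """Extract the raw value of attribute ID 5 (assumed: reallocated sectors).

    Returns None when the attribute is missing, which is what happens
    when the drive/motherboard combination does not return SMART data.
    """
    for line in attribute_table.splitlines():
        fields = line.split()
        # Attribute rows have 10 columns; the raw value is the last one.
        if len(fields) >= 10 and fields[0] == "5":
            try:
                return int(fields[9])
            except ValueError:
                return None
    return None


def assess(previous: Optional[int], current: Optional[int]) -> str:
    """Classify a reading against the previous one (hypothetical status names)."""
    if current is None:
        return "NO_SMART_DATA"          # monitoring is blind: treat as an alert
    if previous is not None and current > previous:
        return "REALLOCATIONS_GROWING"  # the clearly labeled failure signal
    if current > 0:
        return "REALLOCATIONS_PRESENT"
    return "OK"
```

A cron job could call `assess(last_saved_count, reallocated_sectors(read_attribute_table("/dev/sda")))` and alert on anything other than `"OK"`; the device path and where the previous count is stored are left to the reader.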
Reliability in this area is so strange...a diversion to older drives gives an idea how annoyed I am about all this. Last year, I gave up on Western Digital's consumer drives (again). Not because the failure rates were bad, but because the one failure I did run into was so terrible from a SMART perspective. The drive just lied about the whole problem so aggressively I couldn't manage the process. I couldn't get the drive to admit it had a problem such that it could turn into an RMA candidate, despite failing every time I ran an aggressive SMART error check. It would reallocate a few sectors, say "good as new!", and then fail at the next block when I re-tested. Did that at least a dozen times before throwing it in the "pathological drives" pile I keep around for torture testing.

Meanwhile, the Seagate drives I switched back to are terrible, from a failure percentage perspective. I just had two start to go bad last week, both halves of an array which is always fun. But, the failure started with very clearly labeled increases in reallocated sectors, and the drive that eventually went really bad (making the bad noises) was kicked back for RMA.

If you've got redundancy, I'll take components that fail cleanly over ones that hide what's going on, even if the one that fails cleanly is actually more likely to fail. With a rebuild always a drive swap away, having accurate data makes even a higher failure rate manageable.

--
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us
"PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books