Re: Intel SSDs that may not suck - Mailing list pgsql-performance

From Jeff
Subject Re: Intel SSDs that may not suck
Date
Msg-id 8BE8F356-319F-4BE7-BE66-F45D50235985@torgo.978.org
In response to Re: Intel SSDs that may not suck  (Merlin Moncure <mmoncure@gmail.com>)
Responses Re: Intel SSDs that may not suck  (Cédric Villemain <cedric.villemain.debian@gmail.com>)
Re: Intel SSDs that may not suck  (Jeff <threshar@torgo.978.org>)
Re: Intel SSDs that may not suck  (Jesper Krogh <jesper@krogh.cc>)
Re: Intel SSDs that may not suck  (Scott Carey <scott@richrelevance.com>)
List pgsql-performance
On Mar 29, 2011, at 12:13 AM, Merlin Moncure wrote:

>
> My own experience with MLC drives is that write cycle expectations are
> more or less as advertised. They do go down (hard), and have to be
> monitored. If you are writing a lot of data this can get pretty
> expensive although the cost dynamics are getting better and better for
> flash. I have no idea what would be precisely prudent, but maybe some
> good monitoring tools and phased obsolescence at around 80% duty cycle
> might not be a bad starting point.  With hard drives, you can kinda
> wait for em to pop and swap em in -- this is NOT a good idea for flash
> raid volumes.
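
On the monitoring side, something as dumb as the sketch below would
probably catch the wear-level half of that.  It's Python, untested,
and the attribute number (233, Media_Wearout_Indicator, which is what
the Intel drives seem to report) and the retire threshold are things
you'd want to verify against your own drives before trusting it:

    #!/usr/bin/env python
    # Rough sketch (not production code): poll the normalized media
    # wearout value via smartctl and flag drives approaching the
    # "retire it" point.  233 = Media_Wearout_Indicator on the Intel
    # drives; it starts at 100 and counts down.  RETIRE_AT is a guess.
    import subprocess

    DRIVES = ["/dev/sda", "/dev/sdb"]   # adjust for your box
    RETIRE_AT = 20                      # ~80% of rated wear used up

    def wearout(dev):
        out = subprocess.check_output(["smartctl", "-A", dev]).decode()
        for line in out.splitlines():
            fields = line.split()
            if fields and fields[0] == "233":   # Media_Wearout_Indicator
                return int(fields[3])           # normalized VALUE column
        return None

    for dev in DRIVES:
        val = wearout(dev)
        if val is not None and val <= RETIRE_AT:
            print("%s is down to %d -- schedule a replacement" % (dev, val))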



We've been running some of our DBs on SSDs (X25-Ms; we also have a
pair of X25-Es in another box that we use for some super hot tables).
They have been in production under heavy load for well over a year,
in some cases nearly two.

We're currently getting bitten by performance degradation and are
working out plans to remedy the situation.  One box has eight X25-Ms
in a RAID 10 behind a P400 controller.  First, the P400 is not that
powerful, and our experiments with newer (P812) controllers have been
generally positive.  The main symptom we've been seeing is write
stalls: writing will proceed, then come to a complete halt for 0.5-2
seconds, then resume.  The fix we're going to apply is to replace
each drive in turn, with a rebuild occurring between each swap.  Each
pulled drive then gets a security erase to reset it back to
completely empty (including the "spare" blocks kept around for
writes).

Now, that all sounds awful until you get to overall performance,
especially with reads: you are looking at 20k random reads per second
from a few disks.  Adding in writes does knock it down a notch, but
you're still looking at 10k+ IOPS.  That is the current trade-off.

In general, I wouldn't recommend the cciss stuff with SSDs at this
time, because it makes things such as security erase and SMART access
nearly impossible (performance seems OK, though).  We've got some
tests planned with an Areca controller and some SSDs to see how that
goes.

Also note that there is a funky interaction between an MSA70 and
SSDs: they do not work together.  (I'm not sure whether HP's
officially branded SSDs have the same issue.)

The write degradation could probably be monitored by watching svctm
from sar.  We may implement that in the near future to detect when
this creeps up again.
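
A first pass at that could be as simple as the following (rough
sketch; the device names and the 5 ms threshold are made up, not
measured -- you'd pick the baseline from a known-good period):

    #!/usr/bin/env python
    # Quick-and-dirty svctm check: run "sar -d 1 1", read the
    # "Average:" lines for the SSD devices and complain when svctm
    # drifts past a baseline.
    import subprocess

    WATCH = ("dev8-0", "dev8-16")   # whatever sar calls the SSDs
    BASELINE_MS = 5.0               # pick from a known-good period

    out = subprocess.check_output(["sar", "-d", "1", "1"]).decode()
    for line in out.splitlines():
        fields = line.split()
        if not line.startswith("Average:") or len(fields) < 3:
            continue
        dev = fields[1]
        if dev not in WATCH:
            continue
        svctm = float(fields[-2])   # svctm is the next-to-last column
        if svctm > BASELINE_MS:
            print("%s: svctm %.1f ms (baseline %.1f) -- degradation?"
                  % (dev, svctm, BASELINE_MS))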


--
Jeff Trout <jeff@jefftrout.com>
http://www.stuarthamm.net/
http://www.dellsmartexitin.com/



