Re: New server setup - Mailing list pgsql-performance

From Karl Denninger
Subject Re: New server setup
Date
Msg-id 5140D5B3.7080104@denninger.net
In response to Re: New server setup  (Steve Crawford <scrawford@pinpointresearch.com>)
List pgsql-performance

On 3/13/2013 2:23 PM, Steve Crawford wrote:
On 03/13/2013 09:15 AM, John Lister wrote:
On 13/03/2013 15:50, Greg Jaskiewicz wrote:
SSDs have a much shorter life than spinning drives, so what do you do when one inevitably fails in your system?
Define much shorter? I accept they have a limited number of writes, but that depends on load. You can actively monitor the drives' "health" level...

What concerns me more than wear is this:

InfoWorld Article:
http://www.infoworld.com/t/solid-state-drives/test-your-ssds-or-risk-massive-data-loss-researchers-warn-213715

Referenced research paper:
https://www.usenix.org/conference/fast13/understanding-robustness-ssds-under-power-fault

Kind of messes with the "D" in ACID.

Cheers,
Steve

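One potential way around this is to run ZFS as the underlying filesystem and use the SSDs as cache (L2ARC) drives.  If a cache device loses data due to a power problem it is non-destructive, because the cache only holds copies of data that also lives on the pool's main disks.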
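
As a rough sketch of what that setup involves: cache devices are attached to an existing pool with `zpool add <pool> cache <device>`. The Python below only assembles that command and, by default, returns it without running it. The pool name `tank` and the device paths are placeholders of mine, not anything from this thread, and the helper names are made up for illustration.

```python
import shlex
import subprocess


def l2arc_add_command(pool, devices):
    """Build the argv for `zpool add <pool> cache <dev>...`."""
    if not devices:
        raise ValueError("at least one cache device is required")
    return ["zpool", "add", pool, "cache", *devices]


def add_l2arc(pool, devices, dry_run=True):
    """Return the command line; only execute it when dry_run is False."""
    argv = l2arc_add_command(pool, devices)
    if not dry_run:
        # Needs root and an existing pool; raises CalledProcessError on failure.
        subprocess.run(argv, check=True)
    return shlex.join(argv)


if __name__ == "__main__":
    # Placeholder pool and device names; substitute your own.
    print(add_l2arc("tank", ["/dev/sdb", "/dev/sdc"]))
```

Run with the default `dry_run=True` first and check the printed command against your actual device names before executing anything.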

Short of that, you cannot use an SSD on a machine where silent corruption is unacceptable UNLESS you know it has a supercap or similar IN THE DISK that guarantees the on-drive cache can be flushed in the event of a power failure.  A battery-backed controller cache DOES NOTHING to alleviate this risk, because it protects only the controller's cache and not the cache inside the drive.  If you violate this rule and the power goes off, you must EXPECT silent and possibly catastrophic data corruption.

Only a few SSDs (and they're expensive!) have such protection.  If yours does not, the only SAFE option is the one I described above: using them as ZFS cache devices.

--
-- Karl Denninger
The Market Ticker ®
Cuda Systems LLC
