Re: Reliability with RAID 10 SSD and Streaming Replication - Mailing list pgsql-performance

From Tomas Vondra
Subject Re: Reliability with RAID 10 SSD and Streaming Replication
Date
Msg-id 519A8E29.60702@fuzzy.cz
In response to Re: Reliability with RAID 10 SSD and Streaming Replication  (Greg Smith <greg@2ndQuadrant.com>)
Responses Re: Reliability with RAID 10 SSD and Streaming Replication
List pgsql-performance
On 20.5.2013 05:00, Greg Smith wrote:
> On 5/16/13 8:06 PM, Tomas Vondra wrote:
>> Have you considered using a UPS? That would make the SSDs about as
>> reliable as SATA/SAS drives - the UPS may fail, but so may a BBU unit on
>> the SAS controller.
>
> That's not true at all.  Any decent RAID controller will have an option
> to stop write-back caching when the battery is bad.  Things will slow
> badly when that happens, but there is zero data risk from a short-term
> BBU failure.  The only serious risk with a good BBU setup are that
> you'll have a power failure lasting so long that the battery runs down
> before the cache can be flushed to disk.

That's true, no doubt about that. What I was trying to say is that a
controller with BBU (or an SSD with proper write cache protection) is
about as safe as a UPS when it comes to power outages, assuming both
are properly configured / watched / checked.

Sure, there are scenarios where a UPS is not going to help (e.g. a PSU
failure), so a controller with BBU is better from this point of view.
I've seen crashes with both options (BBU / UPS), caused both by
misconfiguration and by hw issues. BTW I don't know what controller we
are talking about here - it might be as crappy as the SSD drives.

What I was thinking about in this case is using two SSD-based systems
with UPSes. That'd allow fast failover (which may not be possible with
the SAS-based replica, as it does not handle the load).

But yes, I do agree that the provider should be ashamed for not
providing reliable SSDs in the first place. Getting reliable SSDs should
be the first option - all these suggestions are really just workarounds
for this rather simple issue.

Tomas

