Re: Raid 10 chunksize - Mailing list pgsql-performance

From david@lang.hm
Subject Re: Raid 10 chunksize
Date
Msg-id alpine.DEB.1.10.0904011555510.28893@asgard.lang.hm
In response to Re: Raid 10 chunksize  (Scott Carey <scott@richrelevance.com>)
List pgsql-performance
On Wed, 1 Apr 2009, Scott Carey wrote:

> On 4/1/09 9:54 AM, "Scott Marlowe" <scott.marlowe@gmail.com> wrote:
>
>> On Wed, Apr 1, 2009 at 10:48 AM, Stef Telford <stef@ummon.com> wrote:
>>> Scott Marlowe wrote:
>>>> On Wed, Apr 1, 2009 at 10:15 AM, Stef Telford <stef@ummon.com> wrote:
>>>>
>>>>>     I do agree that the benefit is probably from write-caching, but I
>>>>> think that this is a 'win' as long as you have a UPS or BBU adaptor,
>>>>> and really, in a prod environment, not having a UPS is .. well. Crazy ?
>>>>>
>>>>
>>>> You do know that UPSes can fail, right?  En masse sometimes even.
>>>>
>>> Hello Scott,
>>>    Well, the only time the UPS has failed in my memory, was during the
>>> great Eastern Seaboard power outage of 2003. Lots of fond memories
>>> running around Toronto with a gas can looking for oil for generator
>>> power. This said though, anything could happen, the co-lo could be taken
>>> out by a meteor and then sync on or off makes no difference.
>>
>> Meteor strike is far less likely than a power surge taking out a UPS.
>> I saw a whole data center go black when a power conditioner blew out,
>> taking out the other three power conditioners, both industrial UPSes
>> and the switch for the diesel generator.  And I have friends who have
>> seen the same type of thing before as well.  The data is the most
>> expensive part of any server.
>>
> Yeah, well I've had a RAID card die, which broke its Battery backed cache.
> They're all unsafe, technically.
>
> In fact, not only are battery backed caches unsafe, but hard drives.  They
> can return bad data.  So if you want to be really safe:
>
> 1: don't use Linux -- you have to use something with full data and metadata
> checksums like ZFS or very expensive proprietary file systems.

this will involve other tradeoffs

> 2: combine it with mirrored SSD's that don't use write cache (so you can
> have fsync perf about as good as a battery backed raid card without that
> risk).

they _all_ have write caches. a beast like you are looking for doesn't
exist

> 4: keep a live redundant system with a PITR backup at another site that can
> recover in a short period of time.

a good option to keep in mind (and when the new replication code becomes
available, that will be even better)

> 3: Run in a datacenter well underground with a plutonium nuclear power
> supply.  Meteor strikes and Nuclear holocaust, beware!

at some point all that will fail

but you missed point #5 (in many ways a more important point than the
others that you describe)

switch from using postgres to using a database that can do two-phase
commits across redundant machines so that you know the data is safe on
multiple systems before the command is considered complete.

David Lang
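
A minimal sketch of the two-phase commit idea from point #5, assuming in-memory stand-ins for the redundant machines. The `Participant` class and `two_phase_commit` function are hypothetical names made up for illustration; they are not part of postgres or any real database API, and a real implementation would also need durable logging and recovery for a coordinator that crashes between the two phases.

```python
from dataclasses import dataclass, field


@dataclass
class Participant:
    """Hypothetical stand-in for one database machine."""
    name: str
    healthy: bool = True
    staged: dict = field(default_factory=dict)     # prepared, not yet visible
    committed: dict = field(default_factory=dict)  # visible data

    def prepare(self, txid, key, value):
        # Phase 1: stage the write and vote. A real system would force the
        # staged record to stable storage before voting yes.
        if not self.healthy:
            return False
        self.staged[txid] = (key, value)
        return True

    def commit(self, txid):
        # Phase 2: make the staged write visible.
        key, value = self.staged.pop(txid)
        self.committed[key] = value

    def rollback(self, txid):
        # Discard the staged write, if this machine staged one.
        self.staged.pop(txid, None)


def two_phase_commit(participants, txid, key, value):
    """Return True only if every participant committed the write."""
    votes = [p.prepare(txid, key, value) for p in participants]
    if all(votes):
        for p in participants:
            p.commit(txid)
        return True
    # Any "no" vote aborts the transaction on every machine.
    for p in participants:
        p.rollback(txid)
    return False


a, b = Participant("a"), Participant("b")
ok = two_phase_commit([a, b], "tx1", "k", "v")
```

The caller is told the command succeeded only after every machine has both accepted and applied the write, which is the property described above: the data is on multiple systems before the command is considered complete.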
