Re: New server: SSD/RAID recommendations? - Mailing list pgsql-performance

From Graeme B. Bell
Subject Re: New server: SSD/RAID recommendations?
Date
Msg-id E3ACE58A-4E08-4D32-845A-00352834B1F7@skogoglandskap.no
In response to Re: New server: SSD/RAID recommendations?  ("Graeme B. Bell" <graeme.bell@nibio.no>)
List pgsql-performance
> 
This raises another interesting question. Does anyone here have a document explaining how their BBU cache works EXACTLY (at cache / SATA level) on their server? Because I haven't been able to find any for mine (Dell PERC H710/H710P). Can anyone tell me with godlike authority and precision, what exactly happens inside that BBU post-power failure?


(and if you have that manual - how can you know it's accurate? that the implementation matches the manual and is free of bugs? because my M500s didn't match the packaging and neither did an H710 we bought - Dell had advertised features in some marketing material that were only present on the H710P)
 

And I see UBER (unrecoverable bit error) rates for SSDs and HDDs, but has anyone ever seen them for the flash-based cache on their RAID controller?
 

Sleep well, friends.

Graeme. 

On 07 Jul 2015, at 18:54, Graeme B. Bell <graeme.bell@nibio.no> wrote:

> 
> That is a very good question, which I have raised elsewhere on the postgresql lists previously.
> 
> In practice: I have *never* managed to make diskchecker fail with the BBU enabled in front of the drives, and I spent days trying with plug pulls till I reached the point where, as a statistical event, it just can't be that likely at all. That's not to say it can't ever happen, just that I've taken all reasonable measures that I can to find out on the time and money budget I had available.
> 
> In theory: It may be that the BBU makes the drives run at about half speed, so that the capacitors go a good bit further towards emptying the cache. After all, without the BBU in the way, the drive manages to save everything but the last fragment of writes. But I also suspect that the controller itself may be replaying the last set of writes from around the time of power loss.
> 
> Anyway I'm 50/50 on those two explanations. Any other thoughts welcome. 
> 
> This raises another interesting question. Does anyone here have a document explaining how their BBU cache works EXACTLY (at cache / SATA level) on their server? Because I haven't been able to find any for mine (Dell PERC H710/H710P). Can anyone tell me with godlike authority and precision, what exactly happens inside that BBU post-power failure?
> 
> There is rather too much magic involved for me to be happy.
> 
> G
> 
> On 07 Jul 2015, at 18:27, Vitalii Tymchyshyn <vit@tym.im> wrote:
> 
>> Hi.
>> 
>> How would BBU cache help you if the drive lies about fsync? I suppose any RAID controller removes data from BBU cache after it was fsynced by the drive. As far as I know, there is no other "magic command" for the drive to tell the controller that the data is safe now and can be removed from BBU cache.
>> 
>> Tue, 7 Jul 2015 11:59 Graeme B. Bell <graeme.bell@nibio.no> wrote:
>> 
>> Yikes. I would not be able to sleep tonight if it were not for the BBU cache in front of these disks...
>> 
>> diskchecker.pl consistently reported several examples of corruption post-power-loss (usually 10 - 30) on unprotected M500s/M550s, so I think it's pretty much open to debate what types of madness and corruption you'll find if you look close enough.
>> 
>> G
>> 
>> 
>> On 07 Jul 2015, at 16:59, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>> 
>>> 
>>> So it lies about fsync()... The next question is, does it nevertheless enforce the correct ordering of persisting fsync'd data? If you write to file A and fsync it, then write to another file B and fsync it too, is it guaranteed that if B is persisted, A is as well? Because if it isn't, you can end up with filesystem (or database) corruption anyway.
>>> 
>>> - Heikki
>> 
>> 
>> 
>> --
>> Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-performance
> 
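
The ordering question Heikki raises above (write A and fsync, then write B and fsync; if B survived, A must have survived too) can be sketched as a small test harness. This is only an illustration under stated assumptions, not what diskchecker.pl does: the file names A.log and B.log and the functions append_synced, last_complete_seq, run_writer and ordering_holds are all hypothetical. The idea is to run the writer, pull the plug, and run the checker after reboot.

```python
import os


def append_synced(path, seq):
    """Append one sequence number as a line, then fsync before returning."""
    with open(path, "ab") as f:
        f.write(b"%d\n" % seq)
        f.flush()              # push Python's buffer to the OS
        os.fsync(f.fileno())   # ask the OS (and, in theory, the drive) to persist


def last_complete_seq(path):
    """Return the last fully written sequence number, or -1 if there is none."""
    try:
        with open(path, "rb") as f:
            data = f.read()
    except FileNotFoundError:
        return -1
    # Everything after the final newline is a torn (partial) write; drop it.
    lines = data.split(b"\n")[:-1]
    for line in reversed(lines):
        if line.isdigit():
            return int(line)
    return -1


def run_writer(directory, rounds):
    """For each round, persist the number to A first, then to B."""
    a = os.path.join(directory, "A.log")  # hypothetical file names
    b = os.path.join(directory, "B.log")
    for seq in range(rounds):
        append_synced(a, seq)
        append_synced(b, seq)


def ordering_holds(directory):
    """After a crash, B must never be ahead of A if fsync ordering was honoured."""
    a = last_complete_seq(os.path.join(directory, "A.log"))
    b = last_complete_seq(os.path.join(directory, "B.log"))
    return a >= b
```

A single clean run proves nothing, since the interesting case is a power cut in the middle of run_writer. A False result from ordering_holds after such a cut would indicate that a later fsync'd write reached stable storage while an earlier one did not.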

