Re: H800 + md1200 Performance problem - Mailing list pgsql-performance

From Merlin Moncure
Subject Re: H800 + md1200 Performance problem
Msg-id CAHyXU0xxuT4+qk4uDkTKpLjOTMtHnnNaaEDyUxM2PDsp_nneRQ@mail.gmail.com
In response to Re: H800 + md1200 Performance problem  (Tomas Vondra <tv@fuzzy.cz>)
Responses Re: H800 + md1200 Performance problem  (Tomas Vondra <tv@fuzzy.cz>)
List pgsql-performance
On Thu, Apr 5, 2012 at 10:49 AM, Tomas Vondra <tv@fuzzy.cz> wrote:
> On 5.4.2012 17:17, Cesar Martin wrote:
>> Well, I have installed megacli on the server and attached the results
>> in the file megacli.txt. We also have "Dell Open Manage" installed on
>> the server, which can generate a log of the H800. I have attached it to
>> this mail as lsi_0403.
>>
>> About dirty limits, I have default values:
>> vm.dirty_background_ratio = 10
>> vm.dirty_ratio = 20
>>
>> I have compared with another server and the values are the same, except
>> on the CentOS 5.4 production database server, which has vm.dirty_ratio = 40
>
> Do the other machines have the same amount of RAM? The point is that the
> values that work with less memory don't work that well with large
> amounts of memory (and the amount of RAM did grow a lot recently).
>
> For example a few years ago the average amount of RAM was ~8GB. In that
> case the
>
>  vm.dirty_background_ratio = 10  => 800MB
>  vm.dirty_ratio = 20 => 1600MB
>
> which is all peachy if you have a decent controller with a write cache.
> But turn that to 64GB and suddenly
>
>  vm.dirty_background_ratio = 10  => 6.4GB
>  vm.dirty_ratio = 20 => 12.8GB
>
> The problem is that there'll be a lot of data waiting (for 30 seconds by
> default), and then suddenly it starts writing all of it to the
> controller. Such systems behave just like yours - short bursts of
> writes interleaved with 'no activity'.
>
> Greg Smith wrote a nice howto about this - it's from 2007 but all the
> recommendations are still valid:
>
>  http://www.westnet.com/~gsmith/content/linux-pdflush.htm
>
> TL;DR:
>
>  - decrease the dirty_background_ratio/dirty_ratio (or use *_bytes)
>
>  - consider decreasing dirty_expire_centisecs
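
For reference, the arithmetic quoted above can be reproduced with a small
sketch. It applies the ratios to total RAM, as the quoted figures do; the
kernel actually computes them against available memory, so real thresholds
will be somewhat lower. The function name dirty_thresholds is only
illustrative.

```python
def dirty_thresholds(ram_bytes, background_ratio=10, dirty_ratio=20):
    """Return (background_bytes, dirty_bytes) for a given amount of RAM.

    Simplification: the percentages are applied to total RAM, matching the
    rounded figures in the quoted message. The kernel uses available memory.
    """
    background = ram_bytes * background_ratio // 100
    dirty = ram_bytes * dirty_ratio // 100
    return background, dirty


GB = 10**9  # decimal gigabytes, to match the rounded numbers quoted above

for ram_gb in (8, 64):
    background, dirty = dirty_thresholds(ram_gb * GB)
    print(f"{ram_gb} GB RAM: background={background / GB:.1f} GB, "
          f"dirty={dirty / GB:.1f} GB")
```

The same percentages that allow 0.8/1.6 GB of dirty data on an 8 GB box
allow 6.4/12.8 GB on a 64 GB box, which is why the *_bytes variants
(vm.dirty_background_bytes, vm.dirty_bytes) are attractive on large-memory
machines: they set absolute limits regardless of RAM size.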

The original problem is a read performance issue, though, and this
will not have any effect on it whatsoever (although it's still
excellent advice).  Also, dd should bypass the o/s buffer cache.  I'm
still pretty much convinced that there is a fundamental performance
issue with the RAID card that Dell needs to explain.
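
To keep the page cache from flattering a read test, one option is to ask
the kernel to drop a file's cached pages before timing a sequential read.
A minimal sketch, assuming Linux and Python 3; the function name
timed_sequential_read and the 1 MiB block size are illustrative choices,
not anything taken from this thread, and the fadvise call is only a hint
that the kernel is free to ignore.

```python
import os
import time


def timed_sequential_read(path, block_size=1024 * 1024):
    """Read `path` sequentially and return (bytes_read, elapsed_seconds)."""
    fd = os.open(path, os.O_RDONLY)
    try:
        # Advisory only: ask the kernel to drop clean cached pages for this
        # file. Dirty pages are not dropped, and the hint may be ignored.
        if hasattr(os, "posix_fadvise"):
            os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)

        total = 0
        start = time.perf_counter()
        while True:
            chunk = os.read(fd, block_size)
            if not chunk:  # an empty read means end of file
                break
            total += len(chunk)
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    return total, elapsed
```

Throughput in MB/s is then bytes_read / elapsed_seconds / 1e6. The test
file should be much larger than the controller's cache for the number to
say anything about the disks themselves.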

merlin
