Re: Optimal settings for RAID controller - optimized for writes - Mailing list pgsql-performance

From KONDO Mitsumasa
Subject Re: Optimal settings for RAID controller - optimized for writes
Date
Msg-id 53041ACC.9010208@lab.ntt.co.jp
In response to Re: Optimal settings for RAID controller - optimized for writes  (Tomas Vondra <tv@fuzzy.cz>)
Responses Re: Optimal settings for RAID controller - optimized for writes  (Tomas Vondra <tv@fuzzy.cz>)
List pgsql-performance
(2014/02/19 5:41), Tomas Vondra wrote:
> On 18.2.2014 02:23, KONDO Mitsumasa wrote:
>> Hi,
>>
>> I don't have PERC H710 raid controller, but I think he would like to
>> know raid striping/chunk size or read/write cache ratio in
>> writeback-cache setting is the best. I'd like to know it, too:)
>
> The stripe size is actually a very good question. On spinning drives it
> usually does not matter too much - unless you have a very specialized
> workload, the 'medium size' is the right choice (AFAIK we're using 64kB
> on H710, which is the default).
It is interesting that the stripe size on the PERC H710 is 64kB. On HP RAID
cards, the default chunk size is 256kB, so with two disks in RAID 0 the full
stripe is 512kB. I think that might be too big, but the RAID card may be
optimized for it... In practice, performance is not bad with those settings.
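
To make the arithmetic above explicit, here is a minimal Python sketch. The
function name is only for illustration, and it assumes that the full stripe is
simply the per-disk chunk size times the number of data-bearing disks (true for
RAID 0) and that the 64kB value quoted for the H710 is also a per-disk chunk:

```python
def full_stripe_kb(chunk_kb: int, data_disks: int) -> int:
    """Full stripe width in kB: one chunk on each data-bearing disk.

    Hypothetical helper for illustration; assumes plain RAID 0 striping.
    """
    if chunk_kb <= 0 or data_disks <= 0:
        raise ValueError("chunk size and disk count must be positive")
    return chunk_kb * data_disks


# HP default chunk size (256kB) on a two-disk RAID 0
print(full_stripe_kb(256, 2))  # 512
# The 64kB value mentioned for the H710, same two-disk RAID 0 (assumption)
print(full_stripe_kb(64, 2))   # 128
```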

I'm interested in the internal behavior of RAID cards. Fortunately, the Linux
RAID card drivers are open source, so it may be worth reading the source code
when we have time.

> With SSDs this might actually matter much more, as the SSDs work with
> "erase blocks" (mostly 512kB), and I suspect using small stripe might
> result in repeated writes to the same block - overwriting one block
> repeatedly and thus increased wearout. But maybe the controller will
> handle that just fine, e.g. by coalescing the writes and sending them to
> the drive as a single write. Or maybe the drive can do that in local
> write cache (all SSDs have that).
I have heard that a vendor's genuine RAID card is optimized for that vendor's
genuine SSDs. Using SSDs that are compatible with the card is important for
performance. In the worst case, the SSD lifetime will be short and performance
will be bad.
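
As a rough illustration of the erase-block concern quoted above, the sketch
below only counts how many chunk-sized writes fit into one erase block. The
function name is hypothetical, and the 512kB erase block is the "mostly 512kB"
figure from the quoted text. Whether this leads to extra wear depends on how
the controller and the drive coalesce writes, which I don't know:

```python
def chunks_per_erase_block(erase_block_kb: int, chunk_kb: int) -> float:
    """How many chunk-sized writes cover one SSD erase block.

    Hypothetical helper; it says nothing about what the controller or the
    drive firmware actually does with those writes.
    """
    if erase_block_kb <= 0 or chunk_kb <= 0:
        raise ValueError("sizes must be positive")
    return erase_block_kb / chunk_kb


# 64kB chunks against an assumed 512kB erase block
print(chunks_per_erase_block(512, 64))   # 8.0
# 256kB chunks against the same erase block
print(chunks_per_erase_block(512, 256))  # 2.0
```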


> But those are mostly speculations / curious questions I've been asking
> myself recently, as we've been considering SSDs with H710/H710p too.
>
> As for the controller cache - my opinion is that using this for caching
> reads is just plain wrong. If you need to cache reads, buy more RAM -
> it's much cheaper, so you can buy more of it. Cache on controller (with
> a BBU) is designed especially for caching writes safely. (And maybe it
> could help with some of the erase-block issues too?)
I'm wondering about the effectiveness of readahead in the OS and in the RAID
card. In general, data read ahead by the RAID card is stored in the RAID cache
and not in the OS cache, while data read ahead by the OS is stored in the OS
cache. I'd like to use all of the RAID cache as write cache, because fsync()
becomes faster. But then the RAID card can hardly do any readahead.. If we want
to use the cache more effectively, we have to clarify how the two interact, but
that seems difficult:(
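
For the OS side only, the current readahead setting can be checked on Linux
through sysfs. The sketch below reads /sys/block/<device>/queue/read_ahead_kb
for every block device; the function name and the sysfs_root parameter are
just for illustration. The RAID card's own readahead is configured through the
vendor's tools and is not covered here:

```python
from pathlib import Path


def all_read_ahead_kb(sysfs_root: str = "/sys/block") -> dict:
    """Return {device: readahead in kB} for each block device found.

    Hypothetical helper; returns an empty dict when sysfs_root is absent
    (for example on a non-Linux system).
    """
    root = Path(sysfs_root)
    result = {}
    if not root.is_dir():
        return result
    for dev in sorted(root.iterdir()):
        setting = dev / "queue" / "read_ahead_kb"
        try:
            result[dev.name] = int(setting.read_text().strip())
        except (OSError, ValueError):
            # Skip devices without a readable, numeric readahead setting.
            continue
    return result


for name, kb in all_read_ahead_kb().items():
    print(f"{name}: OS readahead {kb} kB")
```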

Regards,
--
Mitsumasa KONDO
NTT Open Source Software Center


