Re: WAL & ZFS - Mailing list pgsql-admin

From Rui DeSousa
Subject Re: WAL & ZFS
Date
Msg-id E43F5F43-456E-459D-B0D5-680F599097F9@crazybean.net
In response to Re: WAL & ZFS  (Scott Ribe <scott_ribe@elevated-dev.com>)
Responses Re: WAL & ZFS  (Scott Ribe <scott_ribe@elevated-dev.com>)
List pgsql-admin

> On Apr 1, 2022, at 1:56 PM, Scott Ribe <scott_ribe@elevated-dev.com> wrote:
>
>> On Apr 1, 2022, at 11:49 AM, Rui DeSousa <rui@crazybean.net> wrote:
>>
>> If you’re using RAIDZ# then performance is going to be heavily impacted and I would highly recommend NOT using
>> RAIDZ# for a database server.
>
> Actually, I found even the performance of RAIDZ1 to be acceptable after appropriate configuration--current versions,
> lz4, etc.
>

It might be for a low-IOPS system; however, I would still recommend against it.  I haven’t used RAIDZ in years; it
might be good for an archive system, but I don’t see the value of it in a production database server.  You also have to
account for drive failures and replacement time.  A replacement in a RAIDZ configuration is much more expensive than
replacing a disk in a mirrored set.  Disks today are larger as well, and the risk of another failure during a rebuild is
exponentially increased, hence the need for RAIDZ2 and RAIDZ3.

Personally, and for logical reasons, I would build a RAIDZ with data drives in powers of 2, i.e. 2, 4, or 8 drives plus
parity, and then have a stripe set of RAIDZ2.  So the first option would require 4 drives (2D + 2P) and would have the
same storage as a RAID10 configuration; however, the RAID10 would perform better under load.  The 4+p option seems to be
the sweet spot, as the rebuild times on larger sets are not worth it, nor is it worth spreading out 128k over 8 drives.
Of course one could use a larger record size, but would you want to?  For me 128k/16k is only 8 database blocks; reminds
me of using Oracle’s readahead=8 option :).
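
The capacity and per-drive numbers above can be checked with a few lines of Python.  This is only a back-of-the-envelope
sketch: the function names are made up for illustration, it assumes equal-sized drives, and it ignores ZFS metadata,
padding, and compression.

```python
def usable_fraction(data_drives: int, parity_drives: int) -> float:
    """Fraction of raw capacity left for data in one RAIDZ group (idealized)."""
    return data_drives / (data_drives + parity_drives)


def per_drive_chunk_kib(record_kib: float, data_drives: int) -> float:
    """Share of one full record that lands on each data drive (idealized)."""
    return record_kib / data_drives


# A 4-drive RAID10 keeps half of the raw capacity (every block is mirrored once).
RAID10_FRACTION = 0.5

if __name__ == "__main__":
    print("2D + 2P RAIDZ2 usable fraction:", usable_fraction(2, 2))
    print("4-drive RAID10 usable fraction:", RAID10_FRACTION)
    # How a 128k record divides across 2, 4, and 8 data drives.
    for data_drives in (2, 4, 8):
        chunk = per_drive_chunk_kib(128, data_drives)
        print(f"{data_drives} data drives: {chunk:g} KiB per drive")
```

With 8 data drives each drive gets only 16 KiB of a 128k record, which is the spreading-out concern raised above.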

Note: raidz does not always stripe across all drives in the set like a traditional RAID set does; i.e. it might only use
2+p instead of 8+p as configured.  It depends on the size of the ZFS record being written out and on free space.
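
A rough way to picture the variable stripe width is to count how many sectors a record needs.  The sketch below is a
simplification, not ZFS's actual allocator: the function name is hypothetical, the 4 KiB sector size (ashift=12) is an
assumption, and it ignores free space, padding, and the effect of compression on the record size.

```python
import math


def data_columns_used(record_bytes: int, data_drives: int,
                      sector_bytes: int = 4096) -> int:
    """Estimate how many data drives a single record is spread across.

    A record cannot be split into pieces smaller than one sector, so a small
    record touches fewer drives than the group was configured with.
    """
    sectors = math.ceil(record_bytes / sector_bytes)
    return min(data_drives, sectors)


if __name__ == "__main__":
    # Compare a small record with a full 128k record on an 8+p group.
    for size in (8 * 1024, 16 * 1024, 128 * 1024):
        cols = data_columns_used(size, data_drives=8)
        print(f"{size // 1024} KiB record on 8+p: uses {cols}+p")
```

Under these assumptions an 8 KiB record on an 8+p group occupies only two data sectors, i.e. 2+p.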

