Re: wal_compression=zstd - Mailing list pgsql-hackers

From Justin Pryzby
Subject Re: wal_compression=zstd
Date
Msg-id 20220309131411.GZ27651@telsasoft.com
In response to Re: wal_compression=zstd  (Justin Pryzby <pryzby@telsasoft.com>)
Responses Re: wal_compression=zstd  (Michael Paquier <michael@paquier.xyz>)
List pgsql-hackers
On Fri, Mar 04, 2022 at 05:44:06AM -0600, Justin Pryzby wrote:
> On Fri, Mar 04, 2022 at 04:19:32PM +0900, Michael Paquier wrote:
> > On Tue, Feb 22, 2022 at 05:19:48PM -0600, Justin Pryzby wrote:
> > 
> > > As written, this patch uses zstd level=1 (whereas ZSTD's default compression
> > > level is 6).
> > 
> > Why?  ZSTD using this default has its reasons, no?  And it would be
> > consistent to do the same for ZSTD as for the other two methods.
> 
> In my 1-off test, it gets 610/633 = 96% of the benefit at 209/273 = 77% of the
> cost.

Actually, my test used zstd-6, rather than the correct default of 3.

The comparison should have been:

postgres=# SET wal_compression='zstd-1';
postgres=# \set QUIET \\ \timing on \\ SET max_parallel_maintenance_workers=0; SELECT pg_stat_reset_shared('wal');
begin;CREATE INDEX ON t(a); rollback; SELECT * FROM pg_stat_wal;
 
Time: 2074.046 ms (00:02.074)
        2763 |    2758 |   6343591 |                0 |         5 |        5 |              0 |             0 | 2022-03-05 05:04:08.599867-06

vs

postgres=# SET wal_compression='zstd-3';
postgres=# \set QUIET \\ \timing on \\ SET max_parallel_maintenance_workers=0; SELECT pg_stat_reset_shared('wal');
begin;CREATE INDEX ON t(a); rollback; SELECT * FROM pg_stat_wal;
 
Time: 2471.552 ms (00:02.472)
 wal_records | wal_fpi | wal_bytes | wal_buffers_full | wal_write | wal_sync | wal_write_time | wal_sync_time |          stats_reset
-------------+---------+-----------+------------------+-----------+----------+----------------+---------------+-------------------------------
        2762 |    2746 |   6396890 |              274 |       274 |        0 |              0 |             0 | 2022-03-05 05:04:31.283432-06

=> zstd-1 actually wrote slightly less WAL than zstd-3 (6343591 vs 6396890
bytes, under 1%), which is odd, but the difference is insignificant.  It's no
surprise that zstd-1 is faster than zstd-3, but (of course) the gap is smaller
than it was against zstd-6.
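
For anyone who wants to redo the arithmetic, here is a minimal sketch that expresses the zstd-1 run as a fraction of the zstd-3 run.  The figures are copied from the two runs above; the dictionary layout and the `relative` helper are only illustrative names, and since this was a single run of each, the timing ratio includes whatever noise the machine had at the time.

```python
# Figures copied from the two pg_stat_wal runs quoted above (one run each).
runs = {
    "zstd-1": {"ms": 2074.046, "wal_bytes": 6343591},
    "zstd-3": {"ms": 2471.552, "wal_bytes": 6396890},
}


def relative(metric, level, baseline="zstd-3"):
    """Return runs[level][metric] as a fraction of the baseline's value."""
    return runs[level][metric] / runs[baseline][metric]


# Ratios below 1.0 mean zstd-1 took less time / wrote fewer bytes.
time_ratio = relative("ms", "zstd-1")
bytes_ratio = relative("wal_bytes", "zstd-1")

print(f"zstd-1 time:      {time_ratio:.1%} of zstd-3")
print(f"zstd-1 wal_bytes: {bytes_ratio:.1%} of zstd-3")
```

The same helper would work for other levels if more runs were added to `runs`, but with only one sample per level the ratios should be read as rough indications rather than a benchmark.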

Anyway, there's no compelling reason not to use the default.  If we were to use
a non-default default, we'd have to choose between 1 and 2 (or some negative
compression level).  My thinking was that zstd-1 would pick the lowest-hanging
fruit for zstd while minimizing the performance tradeoff, since WAL affects
interactivity.  But choosing between 1 and 2 seems like bikeshedding.


