Re: Different compression methods for FPI - Mailing list pgsql-hackers

From Justin Pryzby
Subject Re: Different compression methods for FPI
Date
Msg-id 20210519030618.GS373@telsasoft.com
Whole thread Raw
In response to Re: Different compression methods for FPI  (Michael Paquier <michael@paquier.xyz>)
Responses Re: Different compression methods for FPI
List pgsql-hackers
On Mon, May 17, 2021 at 04:44:11PM +0900, Michael Paquier wrote:
> On Sun, Mar 21, 2021 at 02:30:04PM -0500, Justin Pryzby wrote:
> 
> For this patch, this is going to require a bit more in terms of library
> linking as the block decompression is done in xlogreader.c, so that's one
> thing to worry about.

I'm not sure what you mean here ?

> +   {
> +       {"wal_compression_method", PGC_SIGHUP, WAL_SETTINGS,
> +           gettext_noop("Set the method used to compress full page images in the WAL."),
> +           NULL
> +       },
> +       &wal_compression_method,
> +       WAL_COMPRESSION_PGLZ, wal_compression_options,
> +       NULL, NULL, NULL
> +   },
> The interface is not quite right to me.  I think that we should just
> change wal_compression to become an enum, with extra values for pglz
> and the new method.  "on" would be a synonym for "pglz".

Andrey gave a reason in March:

| I hope one day we will compress all WAL, not just FPIs. Advanced archive management tools already do so, why not
compressit in walwriter?
 
| When this will be implemented, we could have wal_compression = {off, fpi, all}.

> +/* This is a mapping indexed by wal_compression */
> +// XXX: maybe this is better done as a GUC hook to assign the 1)
> method; and 2) level
> +struct walcompression walmethods[] = {
> +   {"pglz",    WAL_COMPRESSION_PGLZ},
> +   {"zlib",    WAL_COMPRESSION_ZLIB},
> +};
> Don't think you need a hook here, but zlib, or any other method which
> is not supported by the build, had better not be listed instead.  This
> ensures that the value cannot be assigned if the binaries don't
> support that.

I think you're confusing the walmethods struct (which is unconditional) with
wal_compression_options, which is conditional.

> The patch set is a gathering of various things, and not only things
> associated to the compression method used for FPIs.
> What is the point of that in patch 0002?
> > Subject: [PATCH 03/12] Make sure published XIDs are persistent
> Patch 0003 looks unrelated to this thread.

..for the reason that I gave:

| And 2ndary patches from another thread to allow passing recovery tests.
|These two patches are a prerequisite for this patch to progress:
| * Run 011_crash_recovery.pl with wal_level=minimal
| * Make sure published XIDs are persistent

> > Subject: [PATCH 04/12] wal_compression_method: default to zlib..
> 
> Patch 0004 could happen, however there are no reasons given why this
> is adapted.  Changing the default is not going to happen for the time
> release where this feature is added, anyway.

From the commit message:
| this is meant to exercise the CIs, and not meant to be merged

> +       default:
> +           report_invalid_record(record, "image at %X/%X is compressed with unsupported codec, block %d (%d/%s)",
> +                                 (uint32) (record->ReadRecPtr >> 32),
> +                                 (uint32) record->ReadRecPtr,
> +                                 block_id,
> +                                 compression_method,
> +                                 wal_compression_name(compression_method));
> +           return false;
> In xlogreader.c, the error message is helpful this way.  However, we
> would not know which compression method failed if there is a
> decompression failure for a method supported by the build restoring
> this block.  That would be good to add.

I don't undersand you here - that's what wal_compression_name is for ?
2021-05-18 21:38:04.324 CDT [26984] FATAL:  unknown compression method requested: 2(lz4)

> I think that what we actually need for this thread are patches 0001,
> 0005 and 0006 merged together to study first the performance we have
> with each one of the compression methods proposed, and then let's just
> pick one.  Reading around, zstd and zlib compresse more but take
> longer.  LZ4 is faster than the others, but can compress less.
> With limited bandwidth, less data makes sense, and my guess is that
> most users care most about the speed of recovery if we can afford
> speed with an acceptable compression ratio.

I don't see why we'd add a guc for configuration compression but not include
the 30 lines of code needed to support a 3rd method that we already used by the
core server.

-- 
Justin

Attachment

pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: Subscription tests fail under CLOBBER_CACHE_ALWAYS
Next
From: Michael Paquier
Date:
Subject: Re: Multiple pg_waldump --rmgr options