Re: Different compression methods for FPI - Mailing list pgsql-hackers

From Michael Paquier
Subject Re: Different compression methods for FPI
Date
Msg-id YKTa4+kTFkb3EIBs@paquier.xyz
Whole thread Raw
In response to Re: Different compression methods for FPI  (Justin Pryzby <pryzby@telsasoft.com>)
Responses Re: Different compression methods for FPI
List pgsql-hackers
On Tue, May 18, 2021 at 10:06:18PM -0500, Justin Pryzby wrote:
> On Mon, May 17, 2021 at 04:44:11PM +0900, Michael Paquier wrote:
>> On Sun, Mar 21, 2021 at 02:30:04PM -0500, Justin Pryzby wrote:
>>
>> For this patch, this is going to require a bit more in terms of library
>> linking as the block decompression is done in xlogreader.c, so that's one
>> thing to worry about.
>
> I'm not sure what you mean here ?

I was wondering if anything else was needed in terms of linking here.

>> Don't think you need a hook here, but zlib, or any other method which
>> is not supported by the build, had better not be listed instead.  This
>> ensures that the value cannot be assigned if the binaries don't
>> support that.
>
> I think you're confusing the walmethods struct (which is unconditional) with
> wal_compression_options, which is conditional.

Indeed I was.  For the note, walmethods is not necessary either as a
new structure.  You could just store a list of strings in xlogreader.c
and make a note to keep the entries in a given order.  That's simpler
as well.

>> The patch set is a gathering of various things, and not only things
>> associated to the compression method used for FPIs.
>> What is the point of that in patch 0002?
>>> Subject: [PATCH 03/12] Make sure published XIDs are persistent
>> Patch 0003 looks unrelated to this thread.
>
> ..for the reason that I gave:
>
> | And 2ndary patches from another thread to allow passing recovery tests.
> |These two patches are a prerequisite for this patch to progress:
> | * Run 011_crash_recovery.pl with wal_level=minimal
> | * Make sure published XIDs are persistent

I still don't understand why XID consistency has anything to do with
the compression of FPIs.  There is nothing preventing the testing of
compression of FPIs, and plese note this argument:
https://www.postgresql.org/message-id/BEF3B1E0-0B31-4F05-8E0A-F681CB918626@yandex-team.ru

For example, I can just revert from my tree 0002 and 0003, and still
perform tests of the various compression methods.  I do agree that we
are going to need to do something about this problem, but let's drop
this stuff from the set of patches of this thread and just discuss
them where they are needed.


>> +   {
>> +       {"wal_compression_method", PGC_SIGHUP, WAL_SETTINGS,
>> +           gettext_noop("Set the method used to compress full page images in the WAL."),
>> +           NULL
>> +       },
>> +       &wal_compression_method,
>> +       WAL_COMPRESSION_PGLZ, wal_compression_options,
>> +       NULL, NULL, NULL
>> +   },
>> The interface is not quite right to me.  I think that we should just
>> change wal_compression to become an enum, with extra values for pglz
>> and the new method.  "on" would be a synonym for "pglz".
>
> Andrey gave a reason in March:
>
> | I hope one day we will compress all WAL, not just FPIs. Advanced archive management tools already do so, why not
compressit in walwriter? 
> | When this will be implemented, we could have wal_compression = {off, fpi, all}.
>
> [...]

> I don't see why we'd add a guc for configuration compression but not include
> the 30 lines of code needed to support a 3rd method that we already used by the
> core server.

Because that makes things confusing.  We have no idea if we'll ever
reach a point or even if it makes sense to have compression applied to
multiple parts of WAL.  So, for now, let's just use one single GUC and
be done with it.  Your argument is not tied to what's proposed on this
thread either, and could go the other way around.  If we were to
invent more compression concepts in WAL records, we could as well just
go with a new GUC that lists the parts of the WAL where compression
needs to be applied.  I'd say to keep it to a minimum for now, that's
an interface less confusing than what's proposed here.

And you have not replaced BKPIMAGE_IS_COMPRESSED by a PGLZ-equivalent,
so your patch set is eating more bits for BKPIMAGE_* than it needs
to.

By the way, it would be really useful for the user to print in
pg_waldump -b the type of compression used :)

A last point, and I think that this should be part of a study of the
choice to made for an extra compression method: there is no discussion
yet about the level of compression applied, which is something that
can be applied to zstd, lz4 or even zlib.  Perhaps there is an
argument for a different GUC controlling that, so more benchmarking
is a first step needed for this thread to move on.  Benchmarking can
happen with what's currently posted, of course.
--
Michael

Attachment

pgsql-hackers by date:

Previous
From: Amul Sul
Date:
Subject: Re: Refactor "mutually exclusive options" error reporting code in parse_subscription_options
Next
From: Bharath Rupireddy
Date:
Subject: Re: Refactor "mutually exclusive options" error reporting code in parse_subscription_options