Re: Fixes for compression options of pg_receivewal and refactoring of backup_compression.{c,h} - Mailing list pgsql-hackers

From gkokolatos@pm.me
Subject Re: Fixes for compression options of pg_receivewal and refactoring of backup_compression.{c,h}
Date
Msg-id Pw7ltgmMRlWiJ8w3EW_1Paoeaf1QIJtV-c-U9UgF8f12w3wxpFbqwp97hr3OchnlOFetTA50zYcMQ3coWDutqP3a1qsG2SXOKipKY3lNqBE=@pm.me
Whole thread Raw
In response to Fixes for compression options of pg_receivewal and refactoring of backup_compression.{c,h}  (Michael Paquier <michael@paquier.xyz>)
Responses Re: Fixes for compression options of pg_receivewal and refactoring of backup_compression.{c,h}
List pgsql-hackers
On Monday, April 11th, 2022 at 8:52 AM, Michael Paquier <michael@paquier.xyz>
wrote:

> This is something I think we had better fix before beta1, because now
> we have binaries that use an inconsistent set of options. So,
> attached is a patch set aimed at rework this option set from the
> ground, taking advantage of the recent work done by Robert and others
> for pg_basebackup:

Agreed. It is rather inconsistent now.

> - 0001 is a simple rename of backup_compression.{c,h} to
> compression.{c,h}, removing anything related to base backups from
> that. One extra reason behind this renaming is that I would like to
> use this infra for pg_dump, but that's material for 16~.

I agree with the design. If you permit me a couple of nitpicks regarding naming.

+typedef enum pg_compress_algorithm
+{
+   PG_COMPRESSION_NONE,
+   PG_COMPRESSION_GZIP,
+   PG_COMPRESSION_LZ4,
+   PG_COMPRESSION_ZSTD
+} pg_compress_algorithm;

Elsewhere in the codebase, (e.g. create_table.sgml, alter_table.sgml,
brin_tuple.c, detoast.c, toast_compression.c, tupdesc.c, gist.c to mention a
few) variations of of the nomenclature "compression method" are used, like
'VARDATA_COMPRESSED_GET_COMPRESS_METHOD' or 'InvalidCompressionMethod' etc. I
feel that it would be nicer if we followed one naming rule for this and I
recommend to substitute algorithm for method throughout.

On a similar note, it would help readability to be able to distinguish at a
glance the type from the variable. Maybe uppercase or camelcase the type?

Last, even though it is not needed now, it will be helpful to have a
PG_COMPRESSION_INVALID in some scenarios. Though we can add it when we come to
it.

> - 0002 removes WalCompressionMethod, replacing it by
> pg_compress_algorithm as these are the same enums. Robert complained
> about the confusion that WalCompressionMethod could lead to as this
> could be used for the compression of data, and not only WAL. I have
> renamed some variables to be more consistent, while on it.

It looks good. If you choose to discard the comment regarding the use of
'method' over 'algorithm' from above, can you please use the full word in the
variable, e.g. 'wal_compress_algorithm' instead of 'wal_compress_algo'. I can
not really explain it, the later reads a bit rude. Then again that may be just
me.

> - 0003 is the actual option rework for pg_receivewal. This takes
> advantage of 0001, leading to the result of removing --compress-method
> and replacing it with --compress, taking care of the backward
> compatibility problems for an integer value, aka 0 implies no
> compression and val > 0 implies gzip. One bonus reason to switch to
> that is that this would make the addition of zstd for pg_receivewal
> easier in the future.

Looks good.

> I am going to add an open item for this stuff. Comments or thoughts?

I agree that it is better to not release pg_receivewal with the distinct set of
options.

Cheers,
//Georgios

> Thanks,
> --
> Michael



pgsql-hackers by date:

Previous
From: Justin Pryzby
Date:
Subject: Re: [PATCH] Add extra statistics to explain for Nested Loop
Next
From: "Euler Taveira"
Date:
Subject: Re: Support logical replication of DDLs