pgsql: Add support for LZ4 compression in pg_receivewal - Mailing list pgsql-committers

From Michael Paquier
Subject pgsql: Add support for LZ4 compression in pg_receivewal
Date
Msg-id E1mip3h-0003id-Ss@gemulon.postgresql.org
List pgsql-committers
Add support for LZ4 compression in pg_receivewal

pg_receivewal gains a new option, --compression-method=lz4, available
when the code is compiled with --with-lz4.  Similarly to gzip, this
gives the possibility to compress archived WAL segments with LZ4.  This
option is not compatible with --compress.

The implementation uses LZ4 frames, and is compatible with simple lz4
commands.  Like gzip, using --synchronous ensures that any data will be
flushed to disk within the current .partial segment, so that as much
WAL data as possible can be retrieved even from a
non-completed segment (this requires completing the partial file with
zeros up to the WAL segment size supported by the backend after
decompression, but this is the same as gzip).
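
The zero-padding step mentioned above can be sketched as follows.  This
is a minimal illustration in Python, not code from the commit:
pad_partial_segment is a hypothetical helper, and the 16MB default is
the stock PostgreSQL WAL segment size, which is an assumption here and
must match the segment size of the server that produced the WAL.

```python
WAL_SEGMENT_SIZE = 16 * 1024 * 1024  # assumption: default 16MB segment size


def pad_partial_segment(data: bytes, segment_size: int = WAL_SEGMENT_SIZE) -> bytes:
    """Complete an already-decompressed .partial segment with trailing zeros.

    `data` is the decompressed content of the partial file; the result is
    exactly `segment_size` bytes long.
    """
    if len(data) > segment_size:
        raise ValueError("decompressed data is larger than one WAL segment")
    # Append as many zero bytes as needed to reach the full segment size.
    return data + b"\x00" * (segment_size - len(data))
```

The input is expected to be the output of a decompression step (for
example, a plain "lz4 -d" run on the .partial file); the padding itself
is the same whether the segment was compressed with gzip or LZ4.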

The calculation of the streaming start LSN is able to transparently find
and check LZ4-compressed segments.  Contrary to gzip, where the
uncompressed size is directly stored in the object read, the LZ4 chunk
protocol does not store the uncompressed size by default.  LZ4 frames
do have a contentSize field that can be used, but that would not help
with an archive that includes segments compressed with the defaults of
a "lz4" command, where this is not stored.  So, this commit has taken
the most extensible approach by decompressing the already-archived
segment to check its uncompressed size, through a blank output buffer in
chunks of 64kB (no actual performance difference noticed with 8kB, 16kB
or 32kB, and the operation in itself is actually fast).
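
To make the difference between the two formats concrete, here is a
hedged sketch in Python (standard library only, not the C code of the
commit).  It assumes the gzip member layout of RFC 1952, where the last
four bytes hold the uncompressed size modulo 2^32, and the LZ4 frame
header layout, where bit 3 of the FLG byte says whether an 8-byte
content size follows the BD byte.  Both function names are hypothetical,
and the header checksum is deliberately not verified.

```python
import gzip
import struct
from typing import Optional

LZ4_FRAME_MAGIC = 0x184D2204  # stored little-endian at the start of a frame
FLG_CONTENT_SIZE = 0x08       # bit 3 of the FLG byte


def gzip_uncompressed_size(data: bytes) -> int:
    """Read ISIZE, the last 4 bytes of a gzip member (size modulo 2**32)."""
    if len(data) < 18:  # 10-byte header plus 8-byte trailer at minimum
        raise ValueError("too short to be a gzip member")
    (isize,) = struct.unpack("<I", data[-4:])
    return isize


def lz4_frame_content_size(header: bytes) -> Optional[int]:
    """Return the content size stored in an LZ4 frame header, if any.

    Returns None when the frame does not record its content size, in which
    case the only way to learn the size is to decompress the data.
    """
    if len(header) < 6:
        raise ValueError("too short to be an LZ4 frame header")
    (magic,) = struct.unpack_from("<I", header, 0)
    if magic != LZ4_FRAME_MAGIC:
        raise ValueError("not an LZ4 frame")
    flg = header[4]
    if (flg >> 6) != 0x01:
        raise ValueError("unsupported LZ4 frame version")
    if not flg & FLG_CONTENT_SIZE:
        return None
    if len(header) < 14:
        raise ValueError("truncated content size field")
    # The 8-byte little-endian content size sits right after FLG and BD.
    (size,) = struct.unpack_from("<Q", header, 6)
    return size
```

A None result from lz4_frame_content_size corresponds to the case
described above, where the size can only be obtained by decompressing
the segment, which is the approach taken by this commit.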

Tests have been added to verify the creation and correctness of the
generated LZ4 files.  The latter is achieved by the use of command
"lz4", if found in the environment.
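
The "run the check only if lz4 is installed" pattern can be sketched as
below.  This is an illustration in Python rather than the Perl TAP test
of the commit; check_lz4_files is a hypothetical helper, and it relies
on the "-t" (test integrity) option of the lz4 command line tool.

```python
import shutil
import subprocess
from typing import Optional, Sequence


def check_lz4_files(paths: Sequence[str], lz4_command: str = "lz4") -> Optional[bool]:
    """Verify LZ4 files with the external lz4 tool when it is available.

    Returns None if the command is not found in PATH (check skipped),
    True if every file passes the integrity test, False otherwise.
    """
    exe = shutil.which(lz4_command)
    if exe is None:
        return None
    # "lz4 -t" decompresses in memory and reports corruption via exit status.
    result = subprocess.run([exe, "-t", *paths], capture_output=True)
    return result.returncode == 0
```

Returning None rather than False when the command is missing keeps a
skipped check distinguishable from a failed one.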

The tar-based WAL method in walmethods.c, used now only by
pg_basebackup, does not know yet about LZ4.  Its code could be extended
for this purpose.

Author: Georgios Kokolatos
Reviewed-by: Michael Paquier, Jian Guo, Magnus Hagander, Dilip Kumar
Discussion:
https://postgr.es/m/ZCm1J5vfyQ2E6dYvXz8si39HQ2gwxSZ3IpYaVgYa3lUwY88SLapx9EEnOf5uEwrddhx2twG7zYKjVeuP5MwZXCNPybtsGouDsAD1o2L_I5E=@pm.me

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/babbbb595d2322da095a1e6703171b3f1f2815cb

Modified Files
--------------
doc/src/sgml/ref/pg_receivewal.sgml          |   8 +-
src/Makefile.global.in                       |   1 +
src/bin/pg_basebackup/Makefile               |   1 +
src/bin/pg_basebackup/pg_receivewal.c        | 158 +++++++++++++++++++++++++-
src/bin/pg_basebackup/t/020_pg_receivewal.pl |  72 ++++++++++--
src/bin/pg_basebackup/walmethods.c           | 160 ++++++++++++++++++++++++++-
src/bin/pg_basebackup/walmethods.h           |   1 +
7 files changed, 388 insertions(+), 13 deletions(-)

