pgsql: pg_dump: Use only LZ4 frame format for compression - Mailing list pgsql-committers

From: Tomas Vondra
Subject: pgsql: pg_dump: Use only LZ4 frame format for compression
Date:
Msg-id: E1piO6J-000jVW-VL@gemulon.postgresql.org
List: pgsql-committers
pg_dump: Use only LZ4 frame format for compression

After 0da243fed0 was committed, it was reported that the compression
ratio can be rather poor in some cases, particularly for the custom
format with narrow tables, because the LZ4 header/footer is written for
each row.

This commit switches to LZ4F (LZ4 frame format), eliminating most of the
overhead and greatly improving the compression ratio. This makes the
compressed size about the same for plain and custom formats (just like
for gzip, for example).
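
To make the overhead argument above concrete, here is a rough
back-of-the-envelope sketch in C. It is not code from pg_dump. The
constants are assumptions taken from the LZ4 frame format description
(7-byte minimal frame header, 4-byte end mark, 4-byte size field per
block), and the two helper functions are hypothetical names used only for
illustration. The model counts framing bytes only; it does not account for
the fact that small inputs compressed separately also give the compressor
little history to match against.

```c
#include <stddef.h>

/*
 * Illustrative constants (assumptions, not values read from pg_dump):
 * a minimal LZ4 frame header is 7 bytes (4-byte magic number plus a
 * 3-byte descriptor), the end mark is 4 bytes, and every data block
 * inside a frame is preceded by a 4-byte size field.
 */
#define FRAME_HEADER_BYTES      7
#define FRAME_ENDMARK_BYTES     4
#define BLOCK_SIZE_FIELD_BYTES  4

/*
 * Framing bytes when every row is wrapped in its own header and footer:
 * each row pays for one header, one block size field, and one end mark.
 */
static size_t
overhead_per_row_framing(size_t nrows)
{
	return nrows * (FRAME_HEADER_BYTES +
					BLOCK_SIZE_FIELD_BYTES +
					FRAME_ENDMARK_BYTES);
}

/*
 * Framing bytes when all rows share a single frame that is split into
 * nblocks blocks: the header and end mark are paid once, and only the
 * per-block size field repeats.
 */
static size_t
overhead_single_frame(size_t nblocks)
{
	return FRAME_HEADER_BYTES +
		nblocks * BLOCK_SIZE_FIELD_BYTES +
		FRAME_ENDMARK_BYTES;
}
```

Under these assumptions, one million rows framed individually carry
15,000,000 bytes of framing, while a single frame holding the same data in
16 blocks carries 75 bytes. For a narrow table whose rows are only a few
bytes each, the per-row framing alone can exceed the row data, which is
consistent with the poor ratios reported.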

LZ4F is now used by both compression APIs, which allowed refactoring and
reusing more of the code. For consistency this also renames the LZ4File
struct to LZ4State, and a number of functions are now prefixed with
LZ4Stream_ (instead of LZ4File_).

Patch by Georgios Kokolatos, based on report and initial patch by Justin
Pryzby. Review and minor cleanups by me.

Author: Georgios Kokolatos, Justin Pryzby
Reported-by: Justin Pryzby
Reviewed-by: Tomas Vondra
Discussion: https://postgr.es/m/20230227044910.GO1653%40telsasoft.com

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/0070b66fef21e909adb283f7faa7b1978836ad75

Modified Files
--------------
src/bin/pg_dump/compress_lz4.c | 554 +++++++++++++++++++++++++----------------
1 file changed, 343 insertions(+), 211 deletions(-)
