Re: pg_waldump: support decoding of WAL inside tarfile - Mailing list pgsql-hackers

From Tom Lane
Subject Re: pg_waldump: support decoding of WAL inside tarfile
Date
Msg-id 3424809.1774234940@sss.pgh.pa.us
Whole thread Raw
In response to Re: pg_waldump: support decoding of WAL inside tarfile  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
I wrote:
> Everything's pushed, although it looks like it'll be a few hours
> before we see results from batta or hachi.

batta just went green, so I'm feeling optimistic that we've correctly
diagnosed and fixed the problem.

Meanwhile, I noticed one more thing that could use fixing: while
astreamer_zstd.c is careful to use a reasonably-sized output
bbs_buffer (about 256kB looks like), the decompressors in
astreamer_gzip.c and astreamer_lz4.c just leave bbs_buffer's size
at the default 1kB.  This means we're pushing decompressed data
to the next astreamer in very small units in those cases.

I suspect strongly that this difference in buffer sizes has a lot
to do with the observed results that only zstd failed on batta/hachi
while only the other two failed in the test cases that I've been
able to reproduce here.  I'm too lazy to try to run that to ground
though, especially since it's mostly moot now.  But I think that
on performance grounds, we ought to use a reasonable buffer size
for all three decompressors.

I also noticed that astreamer_zstd.c was using ZSTD_DStreamOutSize()
as the target buffer size for both compression and decompression,
although the libzstd API provides a separate function
ZSTD_CStreamOutSize() for the compression case.  Those two functions
produce the same result (256K) on my machine, so this seems to be just
a cosmetic issue, but it still seems pretty tin-eared to not use the
functions as specified.

Proposed patch attached.  There might be an argument for using some
other size than 256K for the other two decompressors, but my
inclination is to try to make all three use roughly the same block
size.  (See also 66ec01dc4.)

            regards, tom lane

diff --git a/src/fe_utils/astreamer_gzip.c b/src/fe_utils/astreamer_gzip.c
index df392f67cab..5b3c3a17550 100644
--- a/src/fe_utils/astreamer_gzip.c
+++ b/src/fe_utils/astreamer_gzip.c
@@ -247,6 +247,8 @@ astreamer_gzip_decompressor_new(astreamer *next)

     streamer->base.bbs_next = next;
     initStringInfo(&streamer->base.bbs_buffer);
+    /* Use a buffer size comparable to the other decompressors */
+    enlargeStringInfo(&streamer->base.bbs_buffer, 256 * 1024 - 1);

     /* Initialize internal stream state for decompression */
     zs = &streamer->zstream;
diff --git a/src/fe_utils/astreamer_lz4.c b/src/fe_utils/astreamer_lz4.c
index 605c188007b..12dfde2c837 100644
--- a/src/fe_utils/astreamer_lz4.c
+++ b/src/fe_utils/astreamer_lz4.c
@@ -288,6 +288,8 @@ astreamer_lz4_decompressor_new(astreamer *next)

     streamer->base.bbs_next = next;
     initStringInfo(&streamer->base.bbs_buffer);
+    /* Use a buffer size comparable to the compressor's */
+    enlargeStringInfo(&streamer->base.bbs_buffer, 256 * 1024 - 1);

     /* Initialize internal stream state for decompression */
     ctxError = LZ4F_createDecompressionContext(&streamer->dctx, LZ4F_VERSION);
diff --git a/src/fe_utils/astreamer_zstd.c b/src/fe_utils/astreamer_zstd.c
index 4b43ab795e3..98e8a700efe 100644
--- a/src/fe_utils/astreamer_zstd.c
+++ b/src/fe_utils/astreamer_zstd.c
@@ -82,7 +82,7 @@ astreamer_zstd_compressor_new(astreamer *next, pg_compress_specification *compre

     streamer->base.bbs_next = next;
     initStringInfo(&streamer->base.bbs_buffer);
-    enlargeStringInfo(&streamer->base.bbs_buffer, ZSTD_DStreamOutSize());
+    enlargeStringInfo(&streamer->base.bbs_buffer, ZSTD_CStreamOutSize());

     streamer->cctx = ZSTD_createCCtx();
     if (!streamer->cctx)

pgsql-hackers by date:

Previous
From: Chao Li
Date:
Subject: Re: Use-after-free issue in postgres_fdw
Next
From: shveta malik
Date:
Subject: Re: [Proposal] Adding Log File Capability to pg_createsubscriber