Re: pg_waldump: support decoding of WAL inside tarfile - Mailing list pgsql-hackers
| From | Tom Lane |
|---|---|
| Subject | Re: pg_waldump: support decoding of WAL inside tarfile |
| Date | |
| Msg-id | 3424809.1774234940@sss.pgh.pa.us Whole thread Raw |
| In response to | Re: pg_waldump: support decoding of WAL inside tarfile (Tom Lane <tgl@sss.pgh.pa.us>) |
| List | pgsql-hackers |
I wrote:
> Everything's pushed, although it looks like it'll be a few hours
> before we see results from batta or hachi.
batta just went green, so I'm feeling optimistic that we've correctly
diagnosed and fixed the problem.
Meanwhile, I noticed one more thing that could use fixing: while
astreamer_zstd.c is careful to use a reasonably-sized output
bbs_buffer (about 256kB looks like), the decompressors in
astreamer_gzip.c and astreamer_lz4.c just leave bbs_buffer's size
at the default 1kB. This means we're pushing decompressed data
to the next astreamer in very small units in those cases.
I suspect strongly that this difference in buffer sizes has a lot
to do with the observed results that only zstd failed on batta/hachi
while only the other two failed in the test cases that I've been
able to reproduce here. I'm too lazy to try to run that to ground
though, especially since it's mostly moot now. But I think that
on performance grounds, we ought to use a reasonable buffer size
for all three decompressors.
I also noticed that astreamer_zstd.c was using ZSTD_DStreamOutSize()
as the target buffer size for both compression and decompression,
although the libzstd API provides a separate function
ZSTD_CStreamOutSize() for the compression case. Those two functions
produce the same result (256K) on my machine, so this seems to be just
a cosmetic issue, but it still seems pretty tin-eared to not use the
functions as specified.
Proposed patch attached. There might be an argument for using some
other size than 256K for the other two decompressors, but my
inclination is to try to make all three use roughly the same block
size. (See also 66ec01dc4.)
regards, tom lane
diff --git a/src/fe_utils/astreamer_gzip.c b/src/fe_utils/astreamer_gzip.c
index df392f67cab..5b3c3a17550 100644
--- a/src/fe_utils/astreamer_gzip.c
+++ b/src/fe_utils/astreamer_gzip.c
@@ -247,6 +247,8 @@ astreamer_gzip_decompressor_new(astreamer *next)
streamer->base.bbs_next = next;
initStringInfo(&streamer->base.bbs_buffer);
+ /* Use a buffer size comparable to the other decompressors */
+ enlargeStringInfo(&streamer->base.bbs_buffer, 256 * 1024 - 1);
/* Initialize internal stream state for decompression */
zs = &streamer->zstream;
diff --git a/src/fe_utils/astreamer_lz4.c b/src/fe_utils/astreamer_lz4.c
index 605c188007b..12dfde2c837 100644
--- a/src/fe_utils/astreamer_lz4.c
+++ b/src/fe_utils/astreamer_lz4.c
@@ -288,6 +288,8 @@ astreamer_lz4_decompressor_new(astreamer *next)
streamer->base.bbs_next = next;
initStringInfo(&streamer->base.bbs_buffer);
+ /* Use a buffer size comparable to the compressor's */
+ enlargeStringInfo(&streamer->base.bbs_buffer, 256 * 1024 - 1);
/* Initialize internal stream state for decompression */
ctxError = LZ4F_createDecompressionContext(&streamer->dctx, LZ4F_VERSION);
diff --git a/src/fe_utils/astreamer_zstd.c b/src/fe_utils/astreamer_zstd.c
index 4b43ab795e3..98e8a700efe 100644
--- a/src/fe_utils/astreamer_zstd.c
+++ b/src/fe_utils/astreamer_zstd.c
@@ -82,7 +82,7 @@ astreamer_zstd_compressor_new(astreamer *next, pg_compress_specification *compre
streamer->base.bbs_next = next;
initStringInfo(&streamer->base.bbs_buffer);
- enlargeStringInfo(&streamer->base.bbs_buffer, ZSTD_DStreamOutSize());
+ enlargeStringInfo(&streamer->base.bbs_buffer, ZSTD_CStreamOutSize());
streamer->cctx = ZSTD_createCCtx();
if (!streamer->cctx)
pgsql-hackers by date: