Re: pg_waldump: support decoding of WAL inside tarfile - Mailing list pgsql-hackers

From Tom Lane
Subject Re: pg_waldump: support decoding of WAL inside tarfile
Msg-id 2360498.1774117435@sss.pgh.pa.us
In response to Re: pg_waldump: support decoding of WAL inside tarfile  (Amul Sul <sulamul@gmail.com>)
List pgsql-hackers
I have made some progress on the question of how to reproduce
these failures.  If I do this:

diff --git a/src/bin/pg_waldump/archive_waldump.c b/src/bin/pg_waldump/archive_waldump.c
index b078c2d6960..c389a227be5 100644
--- a/src/bin/pg_waldump/archive_waldump.c
+++ b/src/bin/pg_waldump/archive_waldump.c
@@ -178,7 +178,7 @@ init_archive_reader(XLogDumpPrivate *privateInfo,
     */
    while (entry == NULL || entry->buf->len < XLOG_BLCKSZ)
    {
-       if (read_archive_file(privateInfo, XLOG_BLCKSZ) == 0)
+       if (read_archive_file(privateInfo, READ_CHUNK_SIZE) == 0)
            pg_fatal("could not find WAL in archive \"%s\"",
                     privateInfo->archive_name);

then I get the "could not find WAL in archive" failures in
pg_verifybackup's gzip and lz4 tests, but not zstd.  This happens
reproducibly even without any special hacks on XLOG_BLCKSZ or
wal_compression settings.  Of course, this is not exactly what's
happening on batta/hachi, because they fail on zstd and not the
other two.  But I think it confirms my theory that the problem
is essentially poor handling of EOF boundary conditions.
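
As an illustration only, here is a toy model of one way a larger read size
can expose an EOF boundary bug in a loop shaped like the one patched above.
It is not the archive_waldump.c code and not a diagnosis of the real failure:
ToyReader, toy_read, wait_naive and wait_eof_aware are all hypothetical names,
and the one-call output delay is an assumed stand-in for decompressor buffering.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy reader; none of these names exist in PostgreSQL. */
typedef struct ToyReader
{
	size_t		input_left;		/* input bytes not yet read from the "file" */
	size_t		pending;		/* decoded bytes not yet visible to the caller */
	size_t		avail;			/* decoded bytes visible to the caller */
} ToyReader;

/*
 * Read up to 'count' input bytes and return how many were read (0 at EOF).
 * Assumption: output decoded by one call only becomes visible during the
 * next call, modelling a decompressor that holds data back.
 */
static size_t
toy_read(ToyReader *r, size_t count)
{
	size_t		n = (count < r->input_left) ? count : r->input_left;

	r->avail += r->pending;		/* release what the previous call decoded */
	r->pending = n;				/* pretend decoding is 1:1 */
	r->input_left -= n;
	return n;
}

/* Naive loop: a zero-length read is reported as "nothing found". */
static bool
wait_naive(ToyReader *r, size_t need, size_t chunk)
{
	assert(chunk > 0);
	while (r->avail < need)
	{
		if (toy_read(r, chunk) == 0)
			return false;		/* may be wrong: data can arrive on this call */
	}
	return true;
}

/* EOF-aware loop: re-check the buffer before giving up at EOF. */
static bool
wait_eof_aware(ToyReader *r, size_t need, size_t chunk)
{
	assert(chunk > 0);
	while (r->avail < need)
	{
		if (toy_read(r, chunk) == 0)
			return r->avail >= need;
	}
	return true;
}
```

In this model, 100 input bytes read in 8-byte chunks satisfy a 32-byte request
with the naive loop, but a single 128-byte read consumes the whole input at
once, so the call that finally makes the data visible is also the call that
returns 0, and the naive loop reports failure with a full buffer in hand.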

(Per discussion, there are other bugs here too; I don't mean
to minimize that aspect.)

            regards, tom lane


