I have made some progress on the question of how to reproduce
these failures. If I do this:

diff --git a/src/bin/pg_waldump/archive_waldump.c b/src/bin/pg_waldump/archive_waldump.c
index b078c2d6960..c389a227be5 100644
--- a/src/bin/pg_waldump/archive_waldump.c
+++ b/src/bin/pg_waldump/archive_waldump.c
@@ -178,7 +178,7 @@ init_archive_reader(XLogDumpPrivate *privateInfo,
 	 */
 	while (entry == NULL || entry->buf->len < XLOG_BLCKSZ)
 	{
-		if (read_archive_file(privateInfo, XLOG_BLCKSZ) == 0)
+		if (read_archive_file(privateInfo, READ_CHUNK_SIZE) == 0)
 			pg_fatal("could not find WAL in archive \"%s\"",
 					 privateInfo->archive_name);

then I get the "could not find WAL in archive" failures in
pg_verifybackup's gzip and lz4 tests, but not zstd. This happens
reproducibly even without any special hacks on XLOG_BLCKSZ or
wal_compression settings. Of course, this is not exactly what's
happening on batta/hachi, because they fail on zstd and not the
other two. But I think it confirms my theory that the problem
is essentially poor handling of EOF boundary conditions.
(Per discussion, there are other bugs here too; I don't mean
to minimize that aspect.)
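
For illustration only, here is a toy model of the kind of EOF boundary
condition I mean. It is not pg_waldump code: ToySource, toy_read,
toy_finish and both toy_find_* functions are invented for the example,
and the assumption that the decoder releases its output one call late
is just a convenient way to make read size matter. I am not claiming
this is the actual mechanism in the real streamer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Bytes the caller wants buffered; a tiny stand-in for XLOG_BLCKSZ. */
#define TOY_NEEDED 8

/* Hypothetical input source plus decoder state (not a real API). */
typedef struct ToySource
{
    size_t len;      /* total input bytes available */
    size_t pos;      /* input bytes consumed so far */
    size_t pending;  /* bytes consumed but still held inside the decoder */
    size_t visible;  /* bytes released to the caller */
} ToySource;

/*
 * Consume up to "chunk" bytes and return the number consumed, 0 at EOF.
 * Assumption for the toy: each call releases the input of the previous
 * call and holds back its own, so output lags input by one call.
 */
static size_t
toy_read(ToySource *src, size_t chunk)
{
    size_t n = src->len - src->pos;

    if (n > chunk)
        n = chunk;
    if (n == 0)
        return 0;               /* EOF: nothing is released here */
    src->pos += n;
    src->visible += src->pending;
    src->pending = n;
    return n;
}

/* Release whatever the decoder is still holding back. */
static void
toy_finish(ToySource *src)
{
    src->visible += src->pending;
    src->pending = 0;
}

/* Treats a zero-length read as "not found" without flushing first. */
static bool
toy_find_naive(ToySource *src, size_t chunk)
{
    while (src->visible < TOY_NEEDED)
    {
        if (toy_read(src, chunk) == 0)
            return false;
    }
    return true;
}

/* At EOF, flushes the decoder and rechecks before giving up. */
static bool
toy_find_careful(ToySource *src, size_t chunk)
{
    while (src->visible < TOY_NEEDED)
    {
        if (toy_read(src, chunk) == 0)
        {
            toy_finish(src);
            return src->visible >= TOY_NEEDED;
        }
    }
    return true;
}

/* Same 12-byte input each time; only the read size and loop differ. */
static void
toy_demo(void)
{
    ToySource small = {12, 0, 0, 0};
    ToySource big = {12, 0, 0, 0};
    ToySource big2 = {12, 0, 0, 0};

    assert(toy_find_naive(&small, 4));      /* small reads: found */
    assert(!toy_find_naive(&big, 16));      /* one big read: spurious failure */
    assert(toy_find_careful(&big2, 16));    /* flush at EOF: found */
}
```

In this toy, the naive loop only works when the reads are small enough
that the wanted data becomes visible before EOF is reached; once a
single read swallows the whole input, the next read returns zero while
the data is still sitting in the decoder.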
regards, tom lane