On Sun, Oct 12, 2025 at 01:24:37PM -0400, Tom Lane wrote:
> Michael Paquier <michael@paquier.xyz> writes:
>> At the end I am finishing with the attached. I also saw an overlap
>> with the addition of --jobs for the directory format vs not using the
>> option, so I have removed the case where --jobs was not used in the
>> directory format.
>
> (This patch became commit 98fe74218.)
>
> I am wondering if you remember why this bit:
>
> + # Give coverage for manually compressed blob.toc files during
> + # restore.
> + compress_cmd => {
> + program => $ENV{'GZIP_PROGRAM'},
> + args => [ '-f', "$tempdir/compression_gzip_dir/blobs.toc", ],
> + },
>
> was set up to manually compress blobs.toc but not the main TOC in the
> toc.dat file. It turns out that Gzip_read is broken for the case
> of a zero-length read request [1], but we never reach that case
> unless toc.dat is compressed. We don't cover the getc_func member
> of the compression stream API, either.
Using extra commands to emulate the manual compressions of the dump
elements is an idea that comes originally from Georgios, and we
included in the first version of the patch posted on the original
thread:
https://www.postgresql.org/message-id/faUNEOpts9vunEaLnmxmG-DldLSg_ql137OC3JYDmgrOMHm1RvvWY2IdBkv_CRxm5spCCb_OmKNk2T03TMm0fBEWveFF9wA1WizPuAgB7Ss=@protonmail.com
If I recall correctly, it's something that the two of us have
discussed offline because Georgios had sent the patch to the lists.
He has argued about these additions in favor of coverage, because he
had noticed a bunch of edge cases we never reached with only the tests
in core while coding compression support. Regarding toc.dat
specifically, I think that we have just failed to consider it as a
"valid" case, as far as I can see. So I see no reason to not adding
coverage where toc.dat is manuall compressed, and you are giving a
good reason to add this part in the test.
> I thought for a bit about proposing that we compress toc.dat but not
> blobs.toc, but that loses coverage in another way: the gets_func API
> turns out to be used only while reading blobs.toc. So we need to
> compress both files manually if we want full coverage.
>
> I think this change won't lose coverage, because there are other tests
> in 002_pg_dump.pl that exercise directory format without extra
> compression of anything.
>
> Thoughts?
It looks like it should be OK to just add a "manual" compression of
toc.dat in compression_gzip_dir. That sounds OK to me here. As far
as I can see on HEAD at 1c05fe11abb6, the problem has been fixed in
1f8062dd9668, without a test added for toc.dat. Having coverage would
be nice. I would also expand this concept to compression_zstd_dir and
compression_lz4_dir, but I guess that you are implying that already to
provide coverage for all the compression methods supported by pg_dump?
--
Michael