Re: pg_waldump: support decoding of WAL inside tarfile - Mailing list pgsql-hackers

From Thomas Munro
Subject Re: pg_waldump: support decoding of WAL inside tarfile
Msg-id CA+hUKGKfti_FMFuduXEZs96W5Boce9gSLZ5Ei158dFiuLuWLgA@mail.gmail.com
In response to Re: pg_waldump: support decoding of WAL inside tarfile  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: pg_waldump: support decoding of WAL inside tarfile
List pgsql-hackers
On Fri, Apr 3, 2026 at 11:50 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > How about using --format=ustar, instead of that sparse control stuff?
>
> I did it that way for GNU tar, but did not research whether bsdtar
> will take that option.  Feel free to hack on ebba64c08 some more.
>
> (It seems though that the two tars' locutions for "write to stdout"
> are different, so we might have to have separate tests even if they
> end up pushing the same option.)

I have:

$ tar --version
bsdtar 3.8.2 - libarchive 3.8.2 zlib/1.3.1 liblzma/5.8.1 libzstd/1.5.2
openssl/3.5.4 libb2/bundled
$ gtar --version
tar (GNU tar) 1.35
Copyright (C) 2023 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <https://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Written by John Gilmore and Jay Fenlason.

This seems to work for both:

$ tar --format=ustar -c /dev/null  > /dev/null
tar: Removing leading '/' from member names
$ gtar --format=ustar -c /dev/null  > /dev/null
gtar: Removing leading `/' from member names

The attached passes with both, and regress_log_001_basic looks like:

# Running: /usr/bin/tar --format=ustar -cf /tmp/J_ifbfUOSd/pg_wal.tar
archive_status 000000010000000000000001 000000010000000000000003
000000010000000000000002 summaries
[12:12:24.301](0.072s) ok 101

# Running: /usr/local/bin/gtar --format=ustar -cf
/tmp/pbdsHdrAdw/pg_wal.tar 000000010000000000000002 archive_status
000000010000000000000003 summaries 000000010000000000000001
[12:18:14.739](0.050s) ok 101

I think a Windows system could be using either.  BSD tar comes
pre-installed by Microsoft and people often install GNU tools.  So I
think we should use File::Spec->devnull() instead of /dev/null, and
Andrew showed that working.  I doubt Windows is capable of making
sparse files (except perhaps with ReFS?), but it's nice to use the
same code everywhere and future-proof in case GNU carries out its
threat to switch to pax by default.  Windows probably has file
attributes that ustar can't represent (?), so I guess that might
motivate it to use pax headers if they are indeed added only when
needed.
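
For illustration only, here is a rough Python analogue of the probe
shown above (the real test is Perl and would use
File::Spec->devnull()).  os.devnull plays the same role here.  The
helper names ustar_probe_command and tar_accepts_ustar are made up for
this sketch, and I am assuming, as in the shell transcript, that the
tar in question writes the archive to stdout when -f is not given.

```python
import os
import shutil
import subprocess


def ustar_probe_command(tar_program):
    """Build the probe command: archive the null device in ustar format.

    os.devnull is "/dev/null" on POSIX systems and "nul" on Windows, so
    the same code can be used on both (hypothetical helper).
    """
    return [tar_program, "--format=ustar", "-c", os.devnull]


def tar_accepts_ustar(tar_program):
    """Return True if tar_program is on PATH and the probe exits with 0."""
    path = shutil.which(tar_program)
    if path is None:
        return False
    result = subprocess.run(
        ustar_probe_command(path),
        stdout=subprocess.DEVNULL,  # discard the archive itself
        stderr=subprocess.DEVNULL,  # discard "Removing leading '/'" notes
        check=False,
    )
    return result.returncode == 0


if __name__ == "__main__":
    for candidate in ("tar", "gtar"):
        print(candidate, tar_accepts_ustar(candidate))
```

A tar that is not installed simply reports False, so a test could skip
rather than fail.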

Longer term I think we need to tolerate but ignore pax headers.  If I
understand the spirit of this long evolution, pax archives are
intended to be acceptable to pre-pax implementations, which implies
that they can't really change the meaning of the bits of the file
contents.  That's why GNU's --sparse hides funky file encodings from
old tars by renaming them to GNUSparseFile.%p/%f, and that leads back
to my original suggestion that we should figure out how to detect and
reject pax only if we failed to find the file under the expected name.
(Or of course we could just implement support for that, and I have a
half-baked trial patch for that but now is not the time.)
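
To make "tolerate but ignore" concrete, here is a minimal Python
sketch, not pg_waldump's actual C code and not the trial patch
mentioned above.  It walks the raw 512-byte headers of an archive and
skips pax extended headers (typeflag 'x' per file, 'g' global), so the
member list comes out the same for a pax archive and a ustar one.  The
function names are invented for the example, the ustar prefix field is
ignored because the names are short, and it does nothing about
GNUSparseFile renaming.

```python
import io
import tarfile

BLOCK = 512  # tar headers and data are laid out in 512-byte blocks


def build_archive(fmt, pax_headers=None):
    """Create a one-member in-memory archive in the given tar format."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=fmt) as tar:
        info = tarfile.TarInfo("000000010000000000000001")
        data = b"x" * 1000
        info.size = len(data)
        if pax_headers:
            # Forces a pax extended header when fmt is PAX_FORMAT.
            info.pax_headers = pax_headers
        tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def walk_headers(archive):
    """Yield (name, typeflag, size) for each header block in the archive."""
    offset = 0
    while offset + BLOCK <= len(archive):
        header = archive[offset:offset + BLOCK]
        if header == b"\0" * BLOCK:
            break  # end-of-archive marker
        name = header[0:100].split(b"\0", 1)[0].decode("utf-8")
        size = int(header[124:136].rstrip(b"\0 ") or b"0", 8)  # octal field
        typeflag = header[156:157]
        yield name, typeflag, size
        # Skip the header plus the data, rounded up to whole blocks.
        offset += BLOCK + (size + BLOCK - 1) // BLOCK * BLOCK


def member_names(archive):
    """List member names, tolerating but ignoring pax headers."""
    return [name for name, typeflag, _ in walk_headers(archive)
            if typeflag not in (b"x", b"g")]


if __name__ == "__main__":
    pax = build_archive(tarfile.PAX_FORMAT, {"comment": "demo"})
    ustar = build_archive(tarfile.USTAR_FORMAT)
    print(member_names(pax))
    print(member_names(ustar))
```

The point of the design is that a pax header is just another entry
whose data can be stepped over, so the file that follows is still found
under its expected name.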
