Re: BUG #17846: pg_dump doesn't properly dump with paused WAL replay - Mailing list pgsql-bugs

From Tom Lane
Subject Re: BUG #17846: pg_dump doesn't properly dump with paused WAL replay
Msg-id 941275.1679326784@sss.pgh.pa.us
In response to BUG #17846: pg_dump doesn't properly dump with paused WAL replay  (PG Bug reporting form <noreply@postgresql.org>)
List pgsql-bugs
[ please keep the mailing list cc'd ]

Francisco Reinolds <francisco.reinolds@channable.com> writes:
> On 16-03-2023 16:10, Tom Lane wrote:
>> I really have no idea what's going on there, but can you show the exact
>> pg_dump command(s) being issued?  I'm particularly curious whether you
>> are using parallel dump.  The same for the failing pg_restore.

> Of course:

> - pg_dump: pg_dump --port 5432 --host localhost --verbose
> --format=directory --jobs=8 --file=<random_directory> --dbname=<dbname>
> - pg_restore: pg_restore --exit-on-error --cluster 13/<cluster_name>
> --dbname=<dbname> --port <port> --format=directory --jobs=8
> --use-list=/tmp/tmpsote5wvm --clean --if-exists <random directory>

Hmm, so the fact that the dump is being done in parallel is very likely
relevant.  Perhaps parallelism on the restore is also relevant, not
sure.  Can you try running each of those steps not-parallel to see
if the problem goes away?
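
A minimal sketch of that serial re-test, assuming the same options as the
reported commands with only --jobs removed. DUMP_DIR, DBNAME, RESTORE_PORT
and LIST_FILE are hypothetical placeholders for the values elided in the
report, and the Debian-specific --cluster option is left out. The script
only prints the commands unless RUN=1 is set.

```shell
#!/bin/sh
# Serial (non-parallel) variant of the reported commands: --jobs is dropped,
# everything else is kept so that parallelism is the only variable changed.
# All four values below are placeholders, not taken from the report.
DUMP_DIR="${DUMP_DIR:-/path/to/dump_dir}"
DBNAME="${DBNAME:-mydb}"
RESTORE_PORT="${RESTORE_PORT:-5433}"
LIST_FILE="${LIST_FILE:-/path/to/restore.list}"

# Dry run by default: print each command. Set RUN=1 to execute it instead.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "$@"
    fi
}

# Dump without --jobs, so a single connection reads all table data.
run pg_dump --port 5432 --host localhost --verbose \
    --format=directory --file="$DUMP_DIR" --dbname="$DBNAME"

# Restore without --jobs, so objects are restored in one session.
run pg_restore --exit-on-error --dbname="$DBNAME" --port "$RESTORE_PORT" \
    --format=directory --use-list="$LIST_FILE" --clean --if-exists "$DUMP_DIR"
```

Trying serial dump with parallel restore, and the reverse, would show
which of the two steps (if either) is involved.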

I'm also slightly troubled by the --use-list option, and am wondering
if faulty creation of the restore list could be a contributing
factor.  The error looks like missing data row(s) not missing schema
objects; but perhaps if the problematic table(s) are partitioned
then one could lead to the other?  Could we see the DDL definition
for the problematic table(s)?
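
One way to extract that DDL is a schema-only dump limited to the table in
question. This is only a sketch: DBNAME and TABLE are hypothetical
placeholders, and the script prints the command unless RUN=1 is set.

```shell
#!/bin/sh
# DBNAME and TABLE are placeholders; substitute the problematic table.
DBNAME="${DBNAME:-mydb}"
TABLE="${TABLE:-public.my_table}"

# Dry run by default: print the command. Set RUN=1 to execute it instead.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "$@"
    fi
}

# --schema-only emits DDL without row data;
# --table restricts the dump to tables matching the given pattern.
run pg_dump --host localhost --port 5432 --schema-only \
    --table="$TABLE" --dbname="$DBNAME"
```

If the table is partitioned, its partitions are separate tables, so the
--table pattern may need a wildcard (or repeated --table options) to
include them.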

>> Also, are all the moving parts (primary server, secondary server,
>> pg_dump, pg_restore) exactly the same PG version?

> So, the version of both the primary and the secondary servers match, 13.8,
> but the server of the instance where we run the backup verifications does
> not, it's currently sitting at 13.6

Hmm.  With some unsupported assumptions about your schema, I could
believe that some of the 13.9 bug fixes are relevant, particularly

    * Fix construction of per-partition foreign key constraints while doing
    ALTER TABLE ATTACH PARTITION (Jehan-Guillaume de Rorthais, Álvaro
    Herrera)

    Previously, incorrect or duplicate constraints could be constructed
    for the newly-added partition.

            regards, tom lane


