Thread: BUG #16732: pg_dump creates broken backups

BUG #16732: pg_dump creates broken backups

From
PG Bug reporting form
Date:
The following bug has been logged on the website:

Bug reference:      16732
Logged by:          Zsolt Ero
Email address:      zsolt.ero@gmail.com
PostgreSQL version: 12.4
Operating system:   Ubuntu 16.04
Description:

On a production server (12.4) backups are created by the following
command:
pg_dump --dbname=app --format=custom --file=$BACKUP_DIR/app.dump

Then on any other machine (can be the same server, can be a my macOS laptop
running 12.5), this command always fails (or it'd fail with
--single-transaction):
pg_restore --dbname=maphub_web --format=custom $BACKUP_FILE

The error is always:
pg_restore: while PROCESSING TOC:
pg_restore: from TOC entry 2976; 0 63529 TABLE DATA map_versions app
pg_restore: error: COPY failed for table "map_versions": ERROR:  insert or
update on table "map_versions" violates foreign key constraint
"fk_map_versions_map_id_maps"
DETAIL:  Key (map_id)=(112664) is not present in table "maps".

The target database is cleaned with dropdb + createdb before pg_restore. I
compared the backups by md5, the file is not-corrupt.

This is quite shocking in a way, meaning that right now I couldn't restore
my prod server in case I'd need to.


Re: BUG #16732: pg_dump creates broken backups

From
Zsolt Ero
Date:
Sorry this is not so general, I over-simplified the example. The key is --use-list (with possibly --jobs)

What I've double checked is that the following breaks it consistently:

pg_restore --schema-only --dbname=app --format=custom --single-transaction $BACKUP_FILE

pg_restore --list $BACKUP_FILE > dump_list.full
pg_restore --data-only --dbname=app --format=custom --use-list=dump_list.full --jobs=4 $BACKUP_FILE

I think it occured without --jobs=4 but I need to confirm that tomorrow.

Zsolt



On 20 Nov 2020 at 02:13:12, PG Bug reporting form <noreply@postgresql.org> wrote:
The following bug has been logged on the website:

Bug reference:      16732
Logged by:          Zsolt Ero
Email address:      zsolt.ero@gmail.com
PostgreSQL version: 12.4
Operating system:   Ubuntu 16.04
Description:        

On a production server (12.4) backups are created by the following
command:
pg_dump --dbname=app --format=custom --file=$BACKUP_DIR/app.dump

Then on any other machine (can be the same server, can be a my macOS laptop
running 12.5), this command always fails (or it'd fail with
--single-transaction):
pg_restore --dbname=maphub_web --format=custom $BACKUP_FILE

The error is always:
pg_restore: while PROCESSING TOC:
pg_restore: from TOC entry 2976; 0 63529 TABLE DATA map_versions app
pg_restore: error: COPY failed for table "map_versions": ERROR:  insert or
update on table "map_versions" violates foreign key constraint
"fk_map_versions_map_id_maps"
DETAIL:  Key (map_id)=(112664) is not present in table "maps".

The target database is cleaned with dropdb + createdb before pg_restore. I
compared the backups by md5, the file is not-corrupt.

This is quite shocking in a way, meaning that right now I couldn't restore
my prod server in case I'd need to.

Re: BUG #16732: pg_dump creates broken backups

From
Tom Lane
Date:
Zsolt Ero <zsolt.ero@gmail.com> writes:
>  Sorry this is not so general, I over-simplified the example. The key is
> --use-list (with possibly --jobs)

> What I've double checked is that the following breaks it consistently:

> pg_restore --schema-only --dbname=app --format=custom --single-transaction
> $BACKUP_FILE

> pg_restore --list $BACKUP_FILE > dump_list.full
> pg_restore --data-only --dbname=app --format=custom
> --use-list=dump_list.full --jobs=4 $BACKUP_FILE

I suspect actually the problem has more to do with having split the
restore into --schema-only and --data-only steps.  That forces
pg_restore to create the FK constraints before it's loaded the
data, and in certain cases such as circular FK relationships,
there will be no safe restore order for the data.

However, that's all just speculation since you haven't shown us
a reproducible case.

            regards, tom lane



Re: BUG #16732: pg_dump creates broken backups

From
Zsolt Ero
Date:
It happens with 1 row in a 3 GB gzip compressed database dump. I'm thinking about how could I possibly give you a reproducible case. Do you know any way which doesn't require me to share the whole production database? (which is not an option)

I can send you a --schema-only sql dump, if that helps. There are no circular relationships that I know.





On 20 Nov 2020 at 17:28:33, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Zsolt Ero <zsolt.ero@gmail.com> writes:
 Sorry this is not so general, I over-simplified the example. The key is
--use-list (with possibly --jobs)

What I've double checked is that the following breaks it consistently:

pg_restore --schema-only --dbname=app --format=custom --single-transaction
$BACKUP_FILE

pg_restore --list $BACKUP_FILE > dump_list.full
pg_restore --data-only --dbname=app --format=custom
--use-list=dump_list.full --jobs=4 $BACKUP_FILE

I suspect actually the problem has more to do with having split the
restore into --schema-only and --data-only steps.  That forces
pg_restore to create the FK constraints before it's loaded the
data, and in certain cases such as circular FK relationships,
there will be no safe restore order for the data.

However, that's all just speculation since you haven't shown us
a reproducible case.

regards, tom lane

Re: BUG #16732: pg_dump creates broken backups

From
Tom Lane
Date:
Zsolt Ero <zsolt.ero@gmail.com> writes:
>  It happens with 1 row in a 3 GB gzip compressed database dump. I'm
> thinking about how could I possibly give you a reproducible case. Do you
> know any way which doesn't require me to share the whole production
> database? (which is not an option)

Of course not.  Can you make it happen with a few rows of dummy data
within the same schema?

            regards, tom lane



Re: BUG #16732: pg_dump creates broken backups

From
"David G. Johnston"
Date:
On Fri, Nov 20, 2020 at 6:03 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Zsolt Ero <zsolt.ero@gmail.com> writes:
>  It happens with 1 row in a 3 GB gzip compressed database dump. I'm
> thinking about how could I possibly give you a reproducible case. Do you
> know any way which doesn't require me to share the whole production
> database? (which is not an option)

Of course not.  Can you make it happen with a few rows of dummy data
within the same schema?


Before doing that, have you positively confirmed that map_id=112664 exists on the maps table in the live database?  Your follow-on post is difficult to follow.  The claim about "1 record" in a database would suggest corruption, not a structural problem - which should affect entire tables (though you'd only see the first error).  In the dump file, is 112664 the first ID in the table data?

David J.

Re: BUG #16732: pg_dump creates broken backups

From
Tom Lane
Date:
"David G. Johnston" <david.g.johnston@gmail.com> writes:
> Before doing that, have you positively confirmed that map_id=112664 exists
> on the maps table in the live database?

Nah, there's no reason to think there's any corruption problem.  After
digging in the code a bit, I confirmed my theory that the problem is
the --data-only flag.  The way Zsolt is doing this, pg_restore is
making no attempt at all to order the tables in a way that respects
the FK relationships.  There is code for that, but it's on the pg_dump
side, and it's only executed if you said --data-only during the *dump*
operation:

 * getTableDataFKConstraints -
 *      add dump-order dependencies reflecting foreign key constraints
 *
 * This code is executed only in a data-only dump --- in schema+data dumps
 * we handle foreign key issues by not creating the FK constraints until
 * after the data is loaded.  In a data-only dump, however, we want to
 * order the table data objects in such a way that a table's referenced
 * tables are restored first.  (In the presence of circular references or
 * self-references this may be impossible; we'll detect and complain about
 * that during the dependency sorting step.)

This is not terribly friendly (and certainly not documented at the
user level).  I wonder how hard it would be to improve matters.
Offhand, it seems like it might be possible to run the same code
in pg_restore, when it's told to do a data-only restore; however
pg_restore does not do dependency sorting, so I'm not sure if it
could figure out how to cope with circular FK dependencies.

In the meantime, the options are (1) don't do it like that, or
(2) use --disable-triggers and just trust that the FK constraints
are satisfied.  Or maybe (3) drop the FK constraints and then
recreate/revalidate them after the data restore.

            regards, tom lane



Re: BUG #16732: pg_dump creates broken backups

From
Zsolt Ero
Date:
I didn't state why I'm doing all this: basically all I'd like to do is exclude a table when restoring data.
The only way I managed to do this was to grep -v the list file before data only restore.



On 2020. Nov 21., Sat at 3:45, Tom Lane <tgl@sss.pgh.pa.us> wrote:
"David G. Johnston" <david.g.johnston@gmail.com> writes:
> Before doing that, have you positively confirmed that map_id=112664 exists
> on the maps table in the live database?

Nah, there's no reason to think there's any corruption problem.  After
digging in the code a bit, I confirmed my theory that the problem is
the --data-only flag.  The way Zsolt is doing this, pg_restore is
making no attempt at all to order the tables in a way that respects
the FK relationships.  There is code for that, but it's on the pg_dump
side, and it's only executed if you said --data-only during the *dump*
operation:

 * getTableDataFKConstraints -
 *        add dump-order dependencies reflecting foreign key constraints
 *
 * This code is executed only in a data-only dump --- in schema+data dumps
 * we handle foreign key issues by not creating the FK constraints until
 * after the data is loaded.  In a data-only dump, however, we want to
 * order the table data objects in such a way that a table's referenced
 * tables are restored first.  (In the presence of circular references or
 * self-references this may be impossible; we'll detect and complain about
 * that during the dependency sorting step.)

This is not terribly friendly (and certainly not documented at the
user level).  I wonder how hard it would be to improve matters.
Offhand, it seems like it might be possible to run the same code
in pg_restore, when it's told to do a data-only restore; however
pg_restore does not do dependency sorting, so I'm not sure if it
could figure out how to cope with circular FK dependencies.

In the meantime, the options are (1) don't do it like that, or
(2) use --disable-triggers and just trust that the FK constraints
are satisfied.  Or maybe (3) drop the FK constraints and then
recreate/revalidate them after the data restore.

                        regards, tom lane

Re: BUG #16732: pg_dump creates broken backups

From
Alvaro Herrera
Date:
On 2020-Nov-21, Zsolt Ero wrote:

> I didn't state why I'm doing all this: basically all I'd like to do is
> exclude a table when restoring data.
> The only way I managed to do this was to grep -v the list file before data
> only restore.

Why do you do that?  It seems much easier to produce a complete dump,
then obtain the --list from it, do "grep -v" of the TABLE DATA element
for that table, then give that file to pg_restore.  It would restore
everything in the right order, including that table's definition, but
excluding that table's data.



Re: BUG #16732: pg_dump creates broken backups

From
Tom Lane
Date:
Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> Why do you do that?  It seems much easier to produce a complete dump,
> then obtain the --list from it, do "grep -v" of the TABLE DATA element
> for that table, then give that file to pg_restore.  It would restore
> everything in the right order, including that table's definition, but
> excluding that table's data.

Yeah, it's clearly possible to dodge the issue by using a different
dump/restore procedure.  Still, this procedure is not obviously
incorrect, so it'd be nice if pg_restore coped better.

            regards, tom lane



Re: BUG #16732: pg_dump creates broken backups

From
Zsolt Ero
Date:
Why do you do that?  It seems much easier to produce a complete dump,
then obtain the --list from it, do "grep -v" of the TABLE DATA element
for that table, then give that file to pg_restore.  It would restore
everything in the right order, including that table's definition, but
excluding that table's data.

Can you help me how can I do this?

Here is what I was doing before, which triggered the bug:

pg_restore --schema-only --format=custom --single-transaction $BACKUP_FILE

pg_restore --list $BACKUP_FILE > dump_list.full
grep -v "public events_map" dump_list.full > dump_list.main

pg_restore --data-only --format=custom --use-list=dump_list.main --jobs=4 $BACKUP_FILE

If it's not clear, I'd like to restore the original database 1:1, but without the data included in the "events_map" table. I mean the results should be the same as if I'd run truncate after import.

Zsolt

Re: BUG #16732: pg_dump creates broken backups

From
Alvaro Herrera
Date:
On 2020-Nov-22, Zsolt Ero wrote:

> Can you help me how can I do this?
> 
> Here is what I was doing before, which triggered the bug:
> 
> pg_restore --schema-only --format=custom --single-transaction $BACKUP_FILE
> 
> pg_restore --list $BACKUP_FILE > dump_list.full
> grep -v "public events_map" dump_list.full > dump_list.main
> 
> pg_restore --data-only --format=custom --use-list=dump_list.main --jobs=4
> $BACKUP_FILE

I'd do this:

pg_restore --list $BACKUP_FILE | grep -v "TABLE DATA public events_map" > dump_list
pg_restore -j4 --use_list=dump_list $BACKUP_FILE



Re: BUG #16732: pg_dump creates broken backups

From
Zsolt Ero
Date:
I get it. But then I need a separate step for the missing table's schema, right?

On 2020. Nov 22., Sun at 15:45, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
On 2020-Nov-22, Zsolt Ero wrote:

> Can you help me how can I do this?
>
> Here is what I was doing before, which triggered the bug:
>
> pg_restore --schema-only --format=custom --single-transaction $BACKUP_FILE
>
> pg_restore --list $BACKUP_FILE > dump_list.full
> grep -v "public events_map" dump_list.full > dump_list.main
>
> pg_restore --data-only --format=custom --use-list=dump_list.main --jobs=4
> $BACKUP_FILE

I'd do this:

pg_restore --list $BACKUP_FILE | grep -v "TABLE DATA public events_map" > dump_list
pg_restore -j4 --use_list=dump_list $BACKUP_FILE

Re: BUG #16732: pg_dump creates broken backups

From
Alvaro Herrera
Date:
On 2020-Nov-22, Zsolt Ero wrote:

> I get it. But then I need a separate step for the missing table's schema,
> right?

No, you don't.  grep-v'ing for just TABLE DATA excludes the data, but
not the definition.

Please don't top-post.



Re: BUG #16732: pg_dump creates broken backups

From
Zsolt Ero
Date:
No, you don't.  grep-v'ing for just TABLE DATA excludes the data, but
not the definition.

Thanks, I understand it and it works perfectly like that!