Thread: BUG #16732: pg_dump creates broken backups
The following bug has been logged on the website: Bug reference: 16732 Logged by: Zsolt Ero Email address: zsolt.ero@gmail.com PostgreSQL version: 12.4 Operating system: Ubuntu 16.04 Description: On a production server (12.4) backups are created by the following command: pg_dump --dbname=app --format=custom --file=$BACKUP_DIR/app.dump Then on any other machine (can be the same server, can be a my macOS laptop running 12.5), this command always fails (or it'd fail with --single-transaction): pg_restore --dbname=maphub_web --format=custom $BACKUP_FILE The error is always: pg_restore: while PROCESSING TOC: pg_restore: from TOC entry 2976; 0 63529 TABLE DATA map_versions app pg_restore: error: COPY failed for table "map_versions": ERROR: insert or update on table "map_versions" violates foreign key constraint "fk_map_versions_map_id_maps" DETAIL: Key (map_id)=(112664) is not present in table "maps". The target database is cleaned with dropdb + createdb before pg_restore. I compared the backups by md5, the file is not-corrupt. This is quite shocking in a way, meaning that right now I couldn't restore my prod server in case I'd need to.
Sorry this is not so general, I over-simplified the example. The key is --use-list (with possibly --jobs)
What I've double checked is that the following breaks it consistently:
pg_restore --schema-only --dbname=app --format=custom --single-transaction $BACKUP_FILE
pg_restore --list $BACKUP_FILE > dump_list.full
pg_restore --data-only --dbname=app --format=custom --use-list=dump_list.full --jobs=4 $BACKUP_FILE
I think it occured without --jobs=4 but I need to confirm that tomorrow.
Zsolt
On 20 Nov 2020 at 02:13:12, PG Bug reporting form <noreply@postgresql.org> wrote:
The following bug has been logged on the website:
Bug reference: 16732
Logged by: Zsolt Ero
Email address: zsolt.ero@gmail.com
PostgreSQL version: 12.4
Operating system: Ubuntu 16.04
Description:
On a production server (12.4) backups are created by the following
command:
pg_dump --dbname=app --format=custom --file=$BACKUP_DIR/app.dump
Then on any other machine (can be the same server, can be a my macOS laptop
running 12.5), this command always fails (or it'd fail with
--single-transaction):
pg_restore --dbname=maphub_web --format=custom $BACKUP_FILE
The error is always:
pg_restore: while PROCESSING TOC:
pg_restore: from TOC entry 2976; 0 63529 TABLE DATA map_versions app
pg_restore: error: COPY failed for table "map_versions": ERROR: insert or
update on table "map_versions" violates foreign key constraint
"fk_map_versions_map_id_maps"
DETAIL: Key (map_id)=(112664) is not present in table "maps".
The target database is cleaned with dropdb + createdb before pg_restore. I
compared the backups by md5, the file is not-corrupt.
This is quite shocking in a way, meaning that right now I couldn't restore
my prod server in case I'd need to.
Zsolt Ero <zsolt.ero@gmail.com> writes: > Sorry this is not so general, I over-simplified the example. The key is > --use-list (with possibly --jobs) > What I've double checked is that the following breaks it consistently: > pg_restore --schema-only --dbname=app --format=custom --single-transaction > $BACKUP_FILE > pg_restore --list $BACKUP_FILE > dump_list.full > pg_restore --data-only --dbname=app --format=custom > --use-list=dump_list.full --jobs=4 $BACKUP_FILE I suspect actually the problem has more to do with having split the restore into --schema-only and --data-only steps. That forces pg_restore to create the FK constraints before it's loaded the data, and in certain cases such as circular FK relationships, there will be no safe restore order for the data. However, that's all just speculation since you haven't shown us a reproducible case. regards, tom lane
It happens with 1 row in a 3 GB gzip compressed database dump. I'm thinking about how could I possibly give you a reproducible case. Do you know any way which doesn't require me to share the whole production database? (which is not an option)
I can send you a --schema-only sql dump, if that helps. There are no circular relationships that I know.
On 20 Nov 2020 at 17:28:33, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Zsolt Ero <zsolt.ero@gmail.com> writes:Sorry this is not so general, I over-simplified the example. The key is--use-list (with possibly --jobs)What I've double checked is that the following breaks it consistently:pg_restore --schema-only --dbname=app --format=custom --single-transaction$BACKUP_FILEpg_restore --list $BACKUP_FILE > dump_list.fullpg_restore --data-only --dbname=app --format=custom--use-list=dump_list.full --jobs=4 $BACKUP_FILE
I suspect actually the problem has more to do with having split the
restore into --schema-only and --data-only steps. That forces
pg_restore to create the FK constraints before it's loaded the
data, and in certain cases such as circular FK relationships,
there will be no safe restore order for the data.
However, that's all just speculation since you haven't shown us
a reproducible case.
regards, tom lane
Zsolt Ero <zsolt.ero@gmail.com> writes: > It happens with 1 row in a 3 GB gzip compressed database dump. I'm > thinking about how could I possibly give you a reproducible case. Do you > know any way which doesn't require me to share the whole production > database? (which is not an option) Of course not. Can you make it happen with a few rows of dummy data within the same schema? regards, tom lane
On Fri, Nov 20, 2020 at 6:03 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Zsolt Ero <zsolt.ero@gmail.com> writes:
> It happens with 1 row in a 3 GB gzip compressed database dump. I'm
> thinking about how could I possibly give you a reproducible case. Do you
> know any way which doesn't require me to share the whole production
> database? (which is not an option)
Of course not. Can you make it happen with a few rows of dummy data
within the same schema?
Before doing that, have you positively confirmed that map_id=112664 exists on the maps table in the live database? Your follow-on post is difficult to follow. The claim about "1 record" in a database would suggest corruption, not a structural problem - which should affect entire tables (though you'd only see the first error). In the dump file, is 112664 the first ID in the table data?
"David G. Johnston" <david.g.johnston@gmail.com> writes: > Before doing that, have you positively confirmed that map_id=112664 exists > on the maps table in the live database? Nah, there's no reason to think there's any corruption problem. After digging in the code a bit, I confirmed my theory that the problem is the --data-only flag. The way Zsolt is doing this, pg_restore is making no attempt at all to order the tables in a way that respects the FK relationships. There is code for that, but it's on the pg_dump side, and it's only executed if you said --data-only during the *dump* operation: * getTableDataFKConstraints - * add dump-order dependencies reflecting foreign key constraints * * This code is executed only in a data-only dump --- in schema+data dumps * we handle foreign key issues by not creating the FK constraints until * after the data is loaded. In a data-only dump, however, we want to * order the table data objects in such a way that a table's referenced * tables are restored first. (In the presence of circular references or * self-references this may be impossible; we'll detect and complain about * that during the dependency sorting step.) This is not terribly friendly (and certainly not documented at the user level). I wonder how hard it would be to improve matters. Offhand, it seems like it might be possible to run the same code in pg_restore, when it's told to do a data-only restore; however pg_restore does not do dependency sorting, so I'm not sure if it could figure out how to cope with circular FK dependencies. In the meantime, the options are (1) don't do it like that, or (2) use --disable-triggers and just trust that the FK constraints are satisfied. Or maybe (3) drop the FK constraints and then recreate/revalidate them after the data restore. regards, tom lane
I didn't state why I'm doing all this: basically all I'd like to do is exclude a table when restoring data.
The only way I managed to do this was to grep -v the list file before data only restore.
On 2020. Nov 21., Sat at 3:45, Tom Lane <tgl@sss.pgh.pa.us> wrote:
"David G. Johnston" <david.g.johnston@gmail.com> writes:
> Before doing that, have you positively confirmed that map_id=112664 exists
> on the maps table in the live database?
Nah, there's no reason to think there's any corruption problem. After
digging in the code a bit, I confirmed my theory that the problem is
the --data-only flag. The way Zsolt is doing this, pg_restore is
making no attempt at all to order the tables in a way that respects
the FK relationships. There is code for that, but it's on the pg_dump
side, and it's only executed if you said --data-only during the *dump*
operation:
* getTableDataFKConstraints -
* add dump-order dependencies reflecting foreign key constraints
*
* This code is executed only in a data-only dump --- in schema+data dumps
* we handle foreign key issues by not creating the FK constraints until
* after the data is loaded. In a data-only dump, however, we want to
* order the table data objects in such a way that a table's referenced
* tables are restored first. (In the presence of circular references or
* self-references this may be impossible; we'll detect and complain about
* that during the dependency sorting step.)
This is not terribly friendly (and certainly not documented at the
user level). I wonder how hard it would be to improve matters.
Offhand, it seems like it might be possible to run the same code
in pg_restore, when it's told to do a data-only restore; however
pg_restore does not do dependency sorting, so I'm not sure if it
could figure out how to cope with circular FK dependencies.
In the meantime, the options are (1) don't do it like that, or
(2) use --disable-triggers and just trust that the FK constraints
are satisfied. Or maybe (3) drop the FK constraints and then
recreate/revalidate them after the data restore.
regards, tom lane
On 2020-Nov-21, Zsolt Ero wrote: > I didn't state why I'm doing all this: basically all I'd like to do is > exclude a table when restoring data. > The only way I managed to do this was to grep -v the list file before data > only restore. Why do you do that? It seems much easier to produce a complete dump, then obtain the --list from it, do "grep -v" of the TABLE DATA element for that table, then give that file to pg_restore. It would restore everything in the right order, including that table's definition, but excluding that table's data.
Alvaro Herrera <alvherre@alvh.no-ip.org> writes: > Why do you do that? It seems much easier to produce a complete dump, > then obtain the --list from it, do "grep -v" of the TABLE DATA element > for that table, then give that file to pg_restore. It would restore > everything in the right order, including that table's definition, but > excluding that table's data. Yeah, it's clearly possible to dodge the issue by using a different dump/restore procedure. Still, this procedure is not obviously incorrect, so it'd be nice if pg_restore coped better. regards, tom lane
Why do you do that? It seems much easier to produce a complete dump,
then obtain the --list from it, do "grep -v" of the TABLE DATA element
for that table, then give that file to pg_restore. It would restore
everything in the right order, including that table's definition, but
excluding that table's data.
Can you help me how can I do this?
Here is what I was doing before, which triggered the bug:
pg_restore --schema-only --format=custom --single-transaction $BACKUP_FILE
pg_restore --list $BACKUP_FILE > dump_list.full
grep -v "public events_map" dump_list.full > dump_list.main
pg_restore --data-only --format=custom --use-list=dump_list.main --jobs=4 $BACKUP_FILE
If it's not clear, I'd like to restore the original database 1:1, but without the data included in the "events_map" table. I mean the results should be the same as if I'd run truncate after import.
Zsolt
On 2020-Nov-22, Zsolt Ero wrote: > Can you help me how can I do this? > > Here is what I was doing before, which triggered the bug: > > pg_restore --schema-only --format=custom --single-transaction $BACKUP_FILE > > pg_restore --list $BACKUP_FILE > dump_list.full > grep -v "public events_map" dump_list.full > dump_list.main > > pg_restore --data-only --format=custom --use-list=dump_list.main --jobs=4 > $BACKUP_FILE I'd do this: pg_restore --list $BACKUP_FILE | grep -v "TABLE DATA public events_map" > dump_list pg_restore -j4 --use_list=dump_list $BACKUP_FILE
I get it. But then I need a separate step for the missing table's schema, right?
On 2020. Nov 22., Sun at 15:45, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
On 2020-Nov-22, Zsolt Ero wrote:
> Can you help me how can I do this?
>
> Here is what I was doing before, which triggered the bug:
>
> pg_restore --schema-only --format=custom --single-transaction $BACKUP_FILE
>
> pg_restore --list $BACKUP_FILE > dump_list.full
> grep -v "public events_map" dump_list.full > dump_list.main
>
> pg_restore --data-only --format=custom --use-list=dump_list.main --jobs=4
> $BACKUP_FILE
I'd do this:
pg_restore --list $BACKUP_FILE | grep -v "TABLE DATA public events_map" > dump_list
pg_restore -j4 --use_list=dump_list $BACKUP_FILE
On 2020-Nov-22, Zsolt Ero wrote: > I get it. But then I need a separate step for the missing table's schema, > right? No, you don't. grep-v'ing for just TABLE DATA excludes the data, but not the definition. Please don't top-post.
No, you don't. grep-v'ing for just TABLE DATA excludes the data, but
not the definition.
Thanks, I understand it and it works perfectly like that!