Thread: pg_dump --data-only: is dump ordered to keep foreign-key-relations loadable?

pg_dump --data-only: is dump ordered to keep foreign-key-relations loadable?

From
Harald Armin Massa
Date:
I migrated one database from 8.0 to 8.1

That I used to do add "without oids" to all tables.

First step so:

pg_dump --schema-only  -U user database

the file was edited, all tables to "withoud oids"; and reloaded in 8.1

After that I

pg_dump --data-only -U user database

and tried to reload the data. But it fails on foreign keys: depending tables are being dumped before the tables they depend on.

I solved it by manually dumping the relevant tables and reloading them,

Now I cannot find documentation
- if pg_dump is supposed to produce a "ordered dump" so that not doing is a bug and I need to present a showcase
- or if it is simply not implemented and an enhancement request;
- or if it is even on a theoretical basis impossible to derive the correct order. [circular foreign keys came to my mind]


Harald

--
GHUM Harald Massa
persuasion python postgresql
Harald Armin Massa
Reinsburgstraße 202b
70197 Stuttgart
0173/9409607

Re: pg_dump --data-only: is dump ordered to keep foreign-key-relations

From
"Florian G. Pflug"
Date:
Harald Armin Massa wrote:
 > [snipped text]
> pg_dump --data-only -U user database
>
> and tried to reload the data. But it fails on foreign keys: depending tables
> are being dumped before the tables they depend on.
>
> I solved it by manually dumping the relevant tables and reloading them,
>
> Now I cannot find documentation
> - if pg_dump is supposed to produce a "ordered dump" so that not doing is a
> bug and I need to present a showcase
> - or if it is simply not implemented and an enhancement request;
> - or if it is even on a theoretical basis impossible to derive the correct
> order. [circular foreign keys came to my mind]

Hi

There are three possibilities to solve this
1) Use pg_dump (or pg_restore, if you are using custom-format dumps)
with the --disable-trigger option (check the man-page for the exact
syntax). This will disable all triggers, including those which check the
foreign-key contraints during the restore

2) If all your foreign keys are defined as "deferrable", you can
wrap the data-loading in a transaction, and do "set contraints all
deferred" before loading the data. This will defer the constraint-checks
until you issue commit.

3) If you have a schema and data-dump in seperate files, you could
manually split the schema dump into two files, one containing all
table definitions, the other containing the f-k definitions. You
can then first restore the schema, then your data, and finally your
foreign keys.

The ordering of the three options in terms of speed is 1 < 3 < 2, I
believe - but 2 and 3 give you additional security, because they check
the foreign keys during the import. 1) relies on the fact the the
dump doesn't containt foreign-key violations.

greetings, Florian Pflug

Attachment

Re: pg_dump --data-only: is dump ordered to keep foreign-key-relations

From
Peter Wilson
Date:
Harald Armin Massa wrote:
> I migrated one database from 8.0 to 8.1
>
> That I used to do add "without oids" to all tables.
>
> First step so:
>
> pg_dump --schema-only  -U user database
>
> the file was edited, all tables to "withoud oids"; and reloaded in 8.1
>
> After that I
>
> pg_dump --data-only -U user database
>
> and tried to reload the data. But it fails on foreign keys: depending
> tables are being dumped before the tables they depend on.
>
> I solved it by manually dumping the relevant tables and reloading them,
>
> Now I cannot find documentation
> - if pg_dump is supposed to produce a "ordered dump" so that not doing
> is a bug and I need to present a showcase
> - or if it is simply not implemented and an enhancement request;
> - or if it is even on a theoretical basis impossible to derive the
> correct order. [circular foreign keys came to my mind]

You can do this by creating a table of contents from your dump:
    pg_restore -l -a ... > toc_file ...

Edit this to re-order the contents such that they restore in a safe
order and then restore using the modified TOC:
    pg_restore -L toc_file ...

That works for me at least. It is a bit of a pain to have to do this.
Maybe there is an easier way?

Disabling triggers only affects triggers - not keys so doesn't help.

Hope that helps
Pete
--
Whitebeam Web Application Server
http://www.whitebeam.org
------

Re: pg_dump --data-only: is dump ordered to keep foreign-key-relations loadable?

From
Harald Armin Massa
Date:
Florian,

thank you very much for those solutions; I shortened them, resend them for "increased google presence" :)

1) Use [...] --disable-trigger option  [...]
2) foreign keys as "deferrable",  do "set contraints alldeferred" before loading the data.
3)  split the schema dump into two files, a) table definitions, b) f-k definitions. Restore a)- data-b)

The ordering of the three options in terms of speed is 1 < 3 < 2, I
believe - but 2 and 3 give you additional security, because they check
the foreign keys during the import. 1) relies on the fact the the
dump doesn't containt foreign-key violations.

That really helps me  to overcome my "dump the  affected tables  manually  and reload  them "  approach.
 
STILL ... to help on with the development:

- is pg_dump supposed to produce a "ordered dump" with --data-only?
(within documentation there is only a warning that --table will not dump the depending ones)
[than I have to create a extract to reproduce that "bug"]

- or if it is simply not implemented and an enhancement request;
(so I should file an enhancement-request and a "warning" -patch to the pg_dump docu, that pg_dump --dataonly MUST be used with --disable-trigger or similiar for backup-purposes)

- or if it is even on a theoretical basis impossible to derive the correct order. [circular foreign keys came to my mind] -> I also should submit a patch to documentation

Thanks for the solutions again, will really help in the upcoming 120 database reloads.

Harald

--
GHUM Harald Massa
persuasion python postgresql
Harald Armin Massa
Reinsburgstraße 202b
70197 Stuttgart
0173/9409607

Re: pg_dump --data-only: is dump ordered to keep foreign-key-relations

From
"Florian G. Pflug"
Date:
Harald Armin Massa wrote:
> - is pg_dump supposed to produce a "ordered dump" with --data-only?
> (within documentation there is only a warning that --table will not dump
> the depending ones)
> [than I have to create a extract to reproduce that "bug"]
>
> - or if it is simply not implemented and an enhancement request;
> (so I should file an enhancement-request and a "warning" -patch to the
> pg_dump docu, that pg_dump --dataonly MUST be used with
> --disable-trigger or similiar for backup-purposes)
>
> - or if it is even on a theoretical basis impossible to derive the
> correct order. [circular foreign keys came to my mind] -> I also should
> submit a patch to documentation

Even _if_ it's theoretically possible, it would probably be too slow,
i'd say. I'd be possible if you dependency-graph between your tables is
acyclic - then simpily reordering the dump-order would be sufficient.

But if you have some sort of cycle, you'd at least need to give up
table-wise restoration, and instead e.g. restore 5 rows from t1, then
2 rows from t2, now 3 rows from _t1_ _again_. And if a _record_ some
way depends on itself (So, the dependency graph between you _records_
even shows circles), then you'd sometimes need to insert a record, then
insert a record that depends on the first, and then update the first
record to now depend on the second. or something similar. And, with
not-null contraints in place, this might be impossible without violating
them :-(.

So, I'd say you should either create backups using --disable-triggers,
or use the custom-format - which let's you specify --disable-triggers,
and a few other options at restore, rather then at backup time.

greetings, Florian Pflug