Thread: BUG #2308: pg_dump -a does not respect referential dependencies
The following bug has been logged online: Bug reference: 2308 Logged by: Matthew George Email address: georgema@corp.earthlink.net PostgreSQL version: 8.0.3 Operating system: Darwin Kernel Version 8.5.0: Sun Jan 22 10:38:46 PST 2006; root:xnu-792.6.61.obj~1/RELEASE_PPC Power Macintosh powerpc Description: pg_dump -a does not respect referential dependencies Details: When using pg_dump to dump a database, the schema and data are arranged within the dump so that it may be imported without violating referential integrity. When using the -a option to get a data-only dump, the data is ordered in the dump alphabetically by table. If a new schema is loaded into a fresh database, the output from the data-only dump cannot be imported via \i in psql without manually editing the dump file and reordering the inserts / copies such that dependent tables have their data loaded before the tables that depend on them. This is inconvenient at best. Since the logic obviously exists within pg_dump already to arrange the data in the correct order of reference dependencies, can this be added to the code path for `pg_dump -a` as well? Or can another option be added that specifies alphabetic vs. dependency order? This would really cut down the time it takes to do schema upgrades on projects that I work with. Thanks
"Matthew George" <georgema@corp.earthlink.net> writes: > Since the logic obviously exists within pg_dump already to arrange the data > in the correct order of reference dependencies, can this be added to the > code path for `pg_dump -a` as well? No. In a data-only restore there may not be *any* ordering that works --- consider circular dependencies. The best practice is to do standard schema+data dumps, wherein the ordering problem can be handled properly by not creating the FK constraints until after the data is loaded. If you really want to do a data-only restore, I'd suggest dropping the FK constraints and re-adding them afterwards (which will be a lot faster than row-by-row retail checks would be, anyway). Another possibility is the --disable-triggers option, but I can't really recommend that, because if there are any referential problems in the data you load, that way will fail to catch it. regards, tom lane
On Mar 8, 2006, at 3:01 PM, Tom Lane wrote: > "Matthew George" <georgema@corp.earthlink.net> writes: > The best practice is to do standard schema+data dumps, wherein the > ordering problem can be handled properly by not creating the FK > constraints until after the data is loaded. If you really want to > do a data-only restore, I'd suggest dropping the FK constraints > and re-adding them afterwards (which will be a lot faster than > row-by-row retail checks would be, anyway). BTW, it would be really nice if we provided a better way to do this than manually dropping all the FK constraints and adding them back in later. Would it be difficult to allow deferring all constraints in the database during a specified transaction? That would allow for loading the data in a transaction and doing the constraint checking later... -- Jim C. Nasby, Database Architect decibel@decibel.org Give your computer some brain candy! www.distributed.net Team #1828 Windows: "Where do you want to go today?" Linux: "Where do you want to go tomorrow?" FreeBSD: "Are you guys coming, or what?"
Jim Nasby <decibel@decibel.org> writes: > BTW, it would be really nice if we provided a better way to do this > than manually dropping all the FK constraints and adding them back in > later. Would it be difficult to allow deferring all constraints in > the database during a specified transaction? That would allow for > loading the data in a transaction and doing the constraint checking > later... You can try SET CONSTRAINTS ALL DEFERRED, but that only works for constraints that are declared deferrable, which by default FK constraints are not (stupid but that's what the spec requires). In any case this would still have performance issues because the behavior is tuned for transactions that update relatively small numbers of rows. Drop/add constraint is a lot better choice in the context of a bulk load. I was toying just now with the idea of a pg_dump mode that would issue the drop and re-add constraint commands for you. This would only help for constraints that pg_dump knows of (ie were in the source database), not any random new FK constraints that might be in the DB you are loading into, but it'd sure beat doing it manually. regards, tom lane