Thread: pg_restore questions

pg_restore questions

From
William Garrison
Date:
I have been optimizing my pg_restores (postgres 8.2.9 on Windows) and I
am confused by some of the results I get when combining various
command-line options.

The -c option for "clean" does not do DROP IF EXISTS statements, it just
does DROP.  This results in an error if the object does not exist.  So
the -c option creates the requirement that the schema must already
exist.  Was that intentional?  This means that -c is incompatible with
-1 (single transaction) unless the existing matches the schema of the
database that was dumped since because it won't ignore errors if -1 is
specified.  Which means I lose my optimizations (a prior thread
concluded that -1 is necessary for the COPY optimization during restores)

The -C option for "create" does not work with -1 (single transaction),
since it results in an error stating that CREATE DATABASE commands
cannot be part of a transaction.  It seems to me that the pg_restore
command should know this, and create the database first, then start the
transaction.

Another problem with -C is that if I haven't created the database
already, it gives an error that it doesn't exist.  I thought that -C was
supposed to create the database for me.  It seems like it checks if the
database exists first.  Is that because I am using the -d option?
(Didn't try removing that, and my restore is now running...)  Maybe -d
checks for the database before -C can create it?  In that case, -C
should have complained when it tried to create a database that was
already there.  Either way, I seem to have to manually create the
database before running pg_restore.

Example:
Z:\Program Files\PostgreSQL\8.2\bin>pg_restore -d SpareFiles -v -C -s -U
postgres z:\teb01-bck01_sprfil_091808.backup
pg_restore: connecting to database for restore
pg_restore: [archiver (db)] connection to database "SpareFiles" failed:
FATAL:
database "SpareFiles" does not exist
pg_restore: *** aborted because of error

I realized that I need to do the restore in two steps: one to create the
schema, and another to restore the data.  This will allow me to create
the database from scratch, without relying on -c to drop things, then
after it creates the schema I can load the data using -1 for speed.  So
I manually created the database, and did a pg_restore with -C -s.  But
when I tried to do a data-only restore with -a, it complained about the
foreign key constraints: (I removed the table names and stuff since it
is under NDA)

Z:\Program Files\PostgreSQL\8.2\bin>pg_restore -d SpareFiles -v -1 -a -U
postgres z:\teb01-bck01_sprfil_091808.backup
pg_restore: connecting to database for restore
pg_restore: executing SEQUENCE SET <scrubbed_some_sequence>
pg_restore: executing SEQUENCE SET <scrubbed_some_sequence>
pg_restore: executing SEQUENCE SET <scrubbed_some_sequence>
pg_restore: executing SEQUENCE SET <scrubbed_some_sequence>
pg_restore: executing SEQUENCE SET <scrubbed_some_sequence>
pg_restore: restoring data for table "<scrubbed>"
pg_restore: restoring data for table "<scrubbed>"
pg_restore: restoring data for table "<scrubbed>"
pg_restore: restoring data for table "<scrubbed>"
pg_restore: restoring data for table "<scrubbed_final_table>"
pg_restore: [archiver (db)] Error while PROCESSING TOC:
pg_restore: [archiver (db)] Error from TOC entry 1811; 0 16640 TABLE
DATA <scrubbed_final_table> postgres
pg_restore: [archiver (db)] COPY failed: ERROR:  insert or update on
table "<scrubbed_final_table>" violates foreign key constraint
"<scrubbed_fkey_to_another_table>"
DETAIL:  Key (<scrubbed_column_name>)=(<some_value>) is not present in
table "<scrubbed_table_name>".pg_restore: *** aborted because of error

This dump was done on postgres 8.2.9, and old forum posts indicate that
8.0 and beyond order data so that foreign key constraints are not a
problem in data-only restores.  Is this a bug?  There are no circular
dependencies in this database.

I got around that with the --disable-triggers option.  That is very
useful, but it wasn't clear from the documentation that this option also
affected foreign key constraints.  The documentation does say
"referential integrity checks" but I assumed that meant checks that I
created through triggers, not checks done with normal foreign keys.

So right now, I have my restore going on.  But I thought it was a bit
more difficult than it should have been.  I think that:
1) The behavior of the tool could be improved so that these conflicting
options won't conflict
2) If there is no way around it, the tool should state what conflicting
options have been selected, and they should be documented
3) I suspect the data-only restore issue with the foreign-keys is a bug
in the tool.  Any thoughts on this?

Thanks for reading this verbose message.  Can anyone clarify any of this?
- Bill

Re: pg_restore questions

From
Richard Huxton
Date:
William Garrison wrote:
> I have been optimizing my pg_restores (postgres 8.2.9 on Windows) and I
> am confused by some of the results I get when combining various
> command-line options.
>
> The -c option for "clean" does not do DROP IF EXISTS statements, it just
> does DROP.  This results in an error if the object does not exist.  So
> the -c option creates the requirement that the schema must already
> exist.  Was that intentional?  This means that -c is incompatible with
> -1 (single transaction) unless the existing matches the schema of the
> database that was dumped since because it won't ignore errors if -1 is
> specified.  Which means I lose my optimizations (a prior thread
> concluded that -1 is necessary for the COPY optimization during restores)

Never tried combining the two. In fact, I'm not sure I've ever used the
"clean" option in anger. You could check if that's still the case in 8.3
and if so either (a) raise a bug report or (b) raise a bug report and
supply a patch :-)

> The -C option for "create" does not work with -1 (single transaction),
> since it results in an error stating that CREATE DATABASE commands
> cannot be part of a transaction.  It seems to me that the pg_restore
> command should know this, and create the database first, then start the
> transaction.

That's reasonable.

> Another problem with -C is that if I haven't created the database
> already, it gives an error that it doesn't exist.  I thought that -C was
> supposed to create the database for me.  It seems like it checks if the
> database exists first.  Is that because I am using the -d option?
> (Didn't try removing that, and my restore is now running...)  Maybe -d
> checks for the database before -C can create it?  In that case, -C
> should have complained when it tried to create a database that was
> already there.  Either way, I seem to have to manually create the
> database before running pg_restore.

The -d option is the database to connect to, so if you're using -C too
you need to do something like:
  pg_restore -U postgres -d existing_db -C new_db
It is covered in the manuals, but there are a lot of options, so it
would be easy to miss.

> I realized that I need to do the restore in two steps: one to create the
> schema, and another to restore the data.  This will allow me to create
> the database from scratch, without relying on -c to drop things, then
> after it creates the schema I can load the data using -1 for speed.  So
> I manually created the database, and did a pg_restore with -C -s.  But
> when I tried to do a data-only restore with -a, it complained about the
> foreign key constraints: (I removed the table names and stuff since it
> is under NDA)

You're better off just creating an empty db and doing the schema+data at
the same time. Doing the schema first will mean all your indexes,
triggers etc. are in place. Doing both together lets it build indexes in
one go after the data is in place.

Try the combination of -d/-C as described above, I think that's what
you're after.

--
  Richard Huxton
  Archonet Ltd