On Thu, 20 Mar 2025, Dimitrios Apostolou wrote:
> Rationale:
>
> When restoring a backup in an emergency situation, it's fine to run
> pg_restore as superuser and get an exact replica of the dumped db.
> AFAICT pg_restore (without --data-only) is optimised for such case.
>
> But pg_dump/restore can be used as a generic data-copying utility, and in
> those cases it makes often sense to get rid of the churn and create a clean
> database by running the SQL schema definition from version control, and then
> copy the data for only the tables created.
>
> For this case, I choose to run pg_restore --data-only, and run it as the user
> who owns the database (dbowner), not as a superuser, in order to avoid
> changes being introduced under the radar.
Another important reason I use --data-only: tablespaces. The new host has
different storage configuration and tablespaces and the schema has been
updated to reflect those. Therefore I must create the database using the
updated schema definitions on the new host.
>
> Things that made my life hard:
>
> * plenty of permission denials for both ALTER OWNER or SET SESSION
> AUTHORIZATION (depending on command line switches). Both of these
> require superuser privilege, but in my case this is not really needed.
> Dbowner has CREATEROLE and is the one who creates all the roles (WITH
> SET TRUE), and their private schemata in the specific database. Things
> would work if pg_restore did "SET ROLE" instead of "SET SESSION
> AUTHORIZATION" to switch user. Is this a straightforward change or there
> are issues I don't see?
>
> * After each failed attempt, I need to issue a TRUNCATE table1,table2,...
> before I try again. I wrote my own function for that. It would help if
> pg_restore would optionally truncate before COPY. I believe it would
> require superuser privilege for it, that could achieve using the
> --superuser=username option used today for disabling the triggers.
>
> Performance issues: (important as my db size is >5TB)
>
> * WAL writes: I didn't manage to avoid writing to the WAL, despite having
> setting wal_level=minimal. I even wrote my own function to ALTER all
> tables to UNLOGGED, but failed with "could not change table T to
> unlogged because it references logged table". I'm out of ideas on this
> one.
>
> * Indices: Could pg_restore have a switch to DROP indices before each
> COPY, and re-CREATE them after, exactly as they were? This would speed
> up the process quite a bit.
>
>
> Any feedback for improving my process? Should I put these ideas somewhere as
> ideas for improvement on pg_restore?
>
> Thank you in advance,
> Dimitris
>
>