Re: Experience and feedback on pg_restore --data-only - Mailing list pgsql-general

From Adrian Klaver
Subject Re: Experience and feedback on pg_restore --data-only
Date
Msg-id 9e8852ec-d8fa-4fb6-a2d3-cd188ce0744a@aklaver.com
Whole thread Raw
In response to Experience and feedback on pg_restore --data-only  (Dimitrios Apostolou <jimis@gmx.net>)
Responses Re: Experience and feedback on pg_restore --data-only
List pgsql-general
On 3/20/25 15:48, Dimitrios Apostolou wrote:
> Rationale:
> 
> When restoring a backup in an emergency situation, it's fine to run
> pg_restore as superuser and get an exact replica of the dumped db.
> AFAICT pg_restore (without --data-only) is optimised for such case.
> 
> But pg_dump/restore can be used as a generic data-copying utility, and in
> those cases it makes often sense to get rid of the churn and create a
> clean database by running the SQL schema definition from version control,
> and then copy the data for only the tables created.
> 
> For this case, I choose to run pg_restore --data-only, and run it as the
> user who owns the database (dbowner), not as a superuser, in order to
> avoid changes being introduced under the radar.
> 
> Things that made my life hard:
> 
> * plenty of permission denials for both ALTER OWNER or SET SESSION
>    AUTHORIZATION (depending on command line switches).  Both of these
>    require superuser privilege, but in my case this is not really needed.
>    Dbowner has CREATEROLE and is the one who creates all the roles (WITH
>    SET TRUE), and their private schemata in the specific database.  Things
>    would work if pg_restore did "SET ROLE" instead of "SET SESSION
>    AUTHORIZATION" to switch user. Is this a straightforward change or there
>    are issues I don't see?

If this is --data-only what are the ALTER OWNER and SET SESSION 
AUTHORIZATION for?

> 
> * After each failed attempt, I need to issue a TRUNCATE table1,table2,...
>    before I try again.  I wrote my own function for that. It would help if
>    pg_restore would optionally truncate before COPY.  I believe it would
>    require superuser privilege for it, that could achieve using the
>    --superuser=username option used today for disabling the triggers.

That is what --clean is for, though it needs to have the objects(tables) 
be in the restore e.g. not just --data-only.


> 
> Performance issues: (important as my db size is >5TB)
> 
> * WAL writes: I didn't manage to avoid writing to the WAL, despite having
>    setting wal_level=minimal. I even wrote my own function to ALTER all
>    tables to UNLOGGED, but failed with "could not change table T to
>    unlogged because it references logged table".  I'm out of ideas on this
>    one.
> 
> * Indices: Could pg_restore have a switch to DROP indices before each
>    COPY, and re-CREATE them after, exactly as they were?  This would speed
>    up the process quite a bit.
> 
> 
> Any feedback for improving my process? Should I put these ideas somewhere
> as ideas for improvement on pg_restore?
> 
> Thank you in advance,
> Dimitris
> 
> 
> 

-- 
Adrian Klaver
adrian.klaver@aklaver.com




pgsql-general by date:

Previous
From: "David G. Johnston"
Date:
Subject: Re: Need help understanding has_function_privilege
Next
From: Laurenz Albe
Date:
Subject: Re: Experience and feedback on pg_restore --data-only