Re: Experience and feedback on pg_restore --data-only - Mailing list pgsql-general
From | Dimitrios Apostolou |
---|---|
Subject | Re: Experience and feedback on pg_restore --data-only |
Date | |
Msg-id | 7e990eae-e55c-0d04-1be8-f49bb3251073@gmx.net Whole thread Raw |
In response to | Re: Experience and feedback on pg_restore --data-only (Adrian Klaver <adrian.klaver@aklaver.com>) |
Responses |
Re: Experience and feedback on pg_restore --data-only
Re: Experience and feedback on pg_restore --data-only Re: Experience and feedback on pg_restore --data-only |
List | pgsql-general |
On Mon, 24 Mar 2025, Adrian Klaver wrote: > On 3/24/25 07:24, Dimitrios Apostolou wrote: >> On Sun, 23 Mar 2025, Laurenz Albe wrote: >> >>> On Thu, 2025-03-20 at 23:48 +0100, Dimitrios Apostolou wrote: >>>> Performance issues: (important as my db size is >5TB) >>>> >>>> * WAL writes: I didn't manage to avoid writing to the WAL, despite >>>> having >>>> setting wal_level=minimal. I even wrote my own function to ALTER all >>>> tables to UNLOGGED, but failed with "could not change table T to >>>> unlogged because it references logged table". I'm out of ideas on >>>> this >>>> one. >>> >>> You'd have to create an load the table in the same transaction, that is, >>> you'd have to run pg_restore with --single-transaction. >> >> That would restore the schema from the dump, while I want to create the >> schema from the SQL code in version control. > > > I am not following, from your original post: > > " > ... create a > clean database by running the SQL schema definition from version control, and > then copy the data for only the tables created. > > For this case, I choose to run pg_restore --data-only, and run it as the user > who owns the database (dbowner), not as a superuser, in order to avoid > changes being introduced under the radar. > " > > You are running the process in two steps, where the first does not involve > pg_restore. Not sure why doing the pg_restore --data-only portion in single > transaction is not possible? Laurenz informed me that I could avoid writing to the WAL if I "create and load the table in a single transaction". I haven't tried, but here is what I would do to try --single-transaction: Transaction 1: manually issuing all of CREATE TABLE etc. Transaction 2: pg_restore --single-transaction --data-only The COPY command in transaction 2 would still need to write to WAL, since it's separate from the CREATE TABLE. Am I wrong somewhere? >> Something that might work, would be for pg_restore to issue a TRUNCATE >> before the COPY. I believe this would require superuser privelege though, >> that I would prefer to avoid. Currently I issue TRUNCATE for all tables >> manually before running pg_restore, but of course this is in a different >> transaction so it doesn't help. >> >> By the way do you see potential problems with using --single-transaction >> to restore billion-rows tables? > > COPY is all or none(version 17+ caveat(see > https://www.postgresql.org/docs/current/sql-copy.html ON_ERROR)), so if the > data dump fails in --single-transaction everything rolls back. So if I restore all tables, then an error about a "table not found" would not roll back already copied tables, since it's not part of a COPY? Thank you for the feedback, Dimitris
pgsql-general by date: