Re: Experience and feedback on pg_restore --data-only - Mailing list pgsql-general
From | Adrian Klaver |
---|---|
Subject | Re: Experience and feedback on pg_restore --data-only |
Date | |
Msg-id | d4c27299-11c8-4950-b0c1-bf24b1e4f353@aklaver.com Whole thread Raw |
In response to | Re: Experience and feedback on pg_restore --data-only (Dimitrios Apostolou <jimis@gmx.net>) |
List | pgsql-general |
On 3/24/25 08:51, Dimitrios Apostolou wrote: > On Mon, 24 Mar 2025, Adrian Klaver wrote: > >> On 3/24/25 07:24, Dimitrios Apostolou wrote: >>> On Sun, 23 Mar 2025, Laurenz Albe wrote: >>> >>>> On Thu, 2025-03-20 at 23:48 +0100, Dimitrios Apostolou wrote: >>>>> Performance issues: (important as my db size is >5TB) >>>>> >>>>> * WAL writes: I didn't manage to avoid writing to the WAL, despite >>>>> having >>>>> setting wal_level=minimal. I even wrote my own function to >>>>> ALTER all >>>>> tables to UNLOGGED, but failed with "could not change table T to >>>>> unlogged because it references logged table". I'm out of ideas on >>>>> this >>>>> one. >>>> >>>> You'd have to create an load the table in the same transaction, >>>> that is, >>>> you'd have to run pg_restore with --single-transaction. >>> >>> That would restore the schema from the dump, while I want to create the >>> schema from the SQL code in version control. >> >> >> I am not following, from your original post: >> >> " >> ... create a >> clean database by running the SQL schema definition from version >> control, and >> then copy the data for only the tables created. >> >> For this case, I choose to run pg_restore --data-only, and run it as >> the user >> who owns the database (dbowner), not as a superuser, in order to avoid >> changes being introduced under the radar. >> " >> >> You are running the process in two steps, where the first does not >> involve >> pg_restore. Not sure why doing the pg_restore --data-only portion in >> single >> transaction is not possible? > > Laurenz informed me that I could avoid writing to the WAL if I "create and > load the table in a single transaction". From here: https://www.postgresql.org/message-id/455d28421ae33c73b73a6f527d2f72816ca5dd29.camel%40cybertec.at What he said was: "You'd have to create an load the table in the same transaction, that is, you'd have to run pg_restore with --single-transaction." Where I assume he meant '... create and load ...'. That is not the same as what you are doing below. > I haven't tried, but here is what I would do to try --single-transaction: > > Transaction 1: manually issuing all of CREATE TABLE etc. > > Transaction 2: pg_restore --single-transaction --data-only > > The COPY command in transaction 2 would still need to write to WAL, since > it's separate from the CREATE TABLE. > > Am I wrong somewhere? > >> COPY is all or none(version 17+ caveat(see >> https://www.postgresql.org/docs/current/sql-copy.html ON_ERROR)), so >> if the >> data dump fails in --single-transaction everything rolls back. > > So if I restore all tables, then an error about a "table not found" would > not roll back already copied tables, since it's not part of a COPY? If you are following what you show above then the tables and other objects would be created manually from the version control outside of pg_restore and on successful completion and commit of that transaction they would persist until such time as you change them. The second step pg_restore --single-transaction --data-only is where you could 99% of the way through and have a failure that rolls back all the data entered in the tables. > > > Thank you for the feedback, > Dimitris > -- Adrian Klaver adrian.klaver@aklaver.com
pgsql-general by date: