Re: [PATCH] ALTER TABLE ADD FOREIGN KEY to partitioned table, is not parallelized - Mailing list pgsql-performance
From | Dimitrios Apostolou |
---|---|
Subject | Re: [PATCH] ALTER TABLE ADD FOREIGN KEY to partitioned table, is not parallelized |
Date | |
Msg-id | q479op22-n5pp-n434-n25n-44or777p5r80@tzk.arg Whole thread Raw |
In response to | Re: [PATCH] ALTER TABLE ADD FOREIGN KEY to partitioned table, is not parallelized (Stepan Neretin <slpmcf@gmail.com>) |
List | pgsql-performance |
Hello Stepan! I can see you tested my patch by itself, and with the following command: On Monday 2025-07-21 07:58, Stepan Neretin wrote: > pg_restore -d "$DB_NAME" -j "$JOBS" --clean --if-exists --freeze "$DUMP_FILE" > /dev/null On the other hand, I have tested combining --freeze with --data-only --clean, as implemented by another patch of mine at: https://www.postgresql.org/message-id/flat/413c1cd8-1d6d-90ba-ac7b-b226a4dad5ed%40gmx.net#44f06ce85a14a6238125a0ad8c9767db In fact I was considering that patch as a requirement to this one, but I see that you are using my patch independently to issue --clean --freeze without --data-only. As far as I understand, this will also issue a TRUNCATE before each COPY, and the benefits of FREEZE will be the same. Thank you for revealing this to me. >The main area of concern is the implementation method in `pg_backup_archiver.c`. The patch introduces a `do_copy` functionthat modifies the `COPY` statement string to inject the `WITH (FREEZE)` option. > >```c >if (cp_end - 10 > cp && > strncmp(cp_end - 10, "FROM stdin", 10) == 0 > ) > cp_is_well_formed = true; >``` > >This approach of string parsing is quite fragile. It makes a strong assumption that the `COPY` statement generated by `pg_dump`will always end with `... FROM stdin;` (preceded by optional whitespace). While this is true now, it creates a tightand brittle coupling between `pg_dump`'s output format and >`pg_restore`'s parsing logic. Any future changes to `pg_dump` that might add options or slightly alter the `COPY` syntaxcould break this feature silently or with a non-obvious warning. It troubled me too. At least I have a pg_log_warning() message in this case, and nothing detrimental will happen, it's just that the FREEZE option will not be used and the message will be logged. > >A more robust solution would be to avoid string manipulation entirely. `pg_restore` should assemble the `COPY` command fromits constituent parts (table name, column list, etc.) and then conditionally append the `WITH (FREEZE)` clause beforethe final `FROM stdin;`. This would decouple the feature from >the exact string representation in the dump archive. Totally agree, but this sounds impossible to implement. Unfortunately pg_dump generates SQL commands, and pg_restore edits them and executes them. I wouldn't know where to start to change this whole architecture. > >An alternative—and arguably cleaner—approach might be to shift this logic to pg_dump. Thinking it more thoroughly, pg_dump is not the place to compose the restoration SQL commands. The proper place is pg_restore. I guess it's being done in pg_dump for historical reasons, but I don't think pg_dump should complicate its commands further (e.g. by adding WITH FREEZE), as this is a choice to make during restore. I think I've seen more places in pg_restore that edit the commands before executing them. >One important consideration that needs to be highlighted in the documentation for this feature is the impact on WAL generation.`COPY ... WITH (FREEZE)` is known to generate significantly more WAL traffic because it must log the freezingof tuples, which can be a surprise for users. Maybe we can insert >already frozen pages? There should be no WAL traffic at all. If the transaction starts with a TRUNCATE TABLE then COPY skips the wal completely. This gives a big performance and stability gain when bulk-loading data, and this is what my 2 patches try to leverage in more cases. And as far as I can tell, if there is no TRUNCATE in the same transaction, then pg_restore will output error like the following: ERROR: cannot perform COPY FREEZE because the table was not created or truncated in the current subtransaction I hope such an erroris acceptable, since the sysadmin can just remove --freeze then and re-run the command. >Additionally, it should be noted that the freeze option only works correctly when performing a fresh load of data into emptytables. Do you think this patch has chances of going in by itself? If so then yes, I should definitely update the docs and submit it again individually. Thank you for the rebase and review. Dimitris
pgsql-performance by date:
Previous
From: Stepan NeretinDate:
Subject: Re: [PATCH] ALTER TABLE ADD FOREIGN KEY to partitioned table, is not parallelized