On Tue, Jul 4, 2023 at 9:45 AM Daniel Gustafsson <daniel@yesql.se> wrote:
As there is no new patch submitted I will go ahead and do that, please feel free to resubmit when there is renewed interest in working on this.
Recently I restored a database from a directory format backup and having this feature would have been quite useful. So, I would like to resume the work on this patch, unless OP or someone else is already on it.
Hearing no one advising me otherwise I'll pick up the work on the patch. First, I'll try to address all previous comments.
On Mon, Feb 7, 2022 at 5:26 AM Frédéric Yhuel <frederic.yhuel@dalibo.com> wrote: > I noticed that referential integrity checks aren't currently > parallelized. Is it on purpose?
It's not 100% clear to me that it is safe. But on the other hand, it's also not 100% clear to me that it is unsafe.
Generally, I only added CURSOR_OPT_PARALLEL_OK in places where I was confident that nothing bad would happen, and this wasn't one of those places. It's something of a nested-query environment -- your criterion #6. How do we know that the surrounding query isn't already parallel? Perhaps because it must be DML, but I think it must be more important to support parallel DML than to support this.
Currently I don't see a scenario where the validation might run inside another parallel query, but I could be missing something. Suggestions on some edge cases to test are more than welcome.
We already have a case where parallel operations can occur when adding a table constraint, and that's the index creation for a primary key. Take for example:
postgres=# SET max_parallel_maintenance_workers TO 4; postgres=# SET parallel_setup_cost TO 0; postgres=# SET parallel_tuple_cost TO 0; postgres=# SET parallel_leader_participation TO 0; postgres=# SET min_parallel_table_scan_size TO 0; postgres=# SET client_min_messages TO debug1; postgres=# CREATE TABLE parallel_pk_table (a int) WITH (autovacuum_enabled = off); postgres=# ALTER TABLE parallel_pk_table ADD PRIMARY KEY (a); DEBUG: ALTER TABLE / ADD PRIMARY KEY will create implicit index "parallel_pk_table_pkey" for table "parallel_pk_table" DEBUG: building index "parallel_pk_table_pkey" on table "parallel_pk_table" with request for 1 parallel workers ... DEBUG: index "parallel_pk_table_pkey" can safely use deduplication DEBUG: verifying table "parallel_pk_table"
This should be better documented. Maybe "parallel index build" [1] could have its own section, so it can be cited from other sections?
I'm not sure what the right thing to do here is, but I think it would be good if your patch included a test case.
I have included a basic test for parallel ADD PRIMARY KEY and ADD FOREIGN KEY.
On Sat, Mar 19, 2022 at 1:57 AM Imseih (AWS), Sami <simseih@amazon.com> wrote:
It would be valuable to add logging to ensure that the ActiveSnapshot and TransactionSnapshot is the same for the leader and the workers. This logging could be tested in the TAP test.
That machinery is used in any parallel query, I'm not sure this patch should be responsible for carrying that requirement.
Also, inside RI_Initial_Check you may want to set max_parallel_workers to max_parallel_maintenance_workers.
Currently the work_mem is set to maintenance_work_mem. This will also require a doc change to call out.