Re: Performance issues during pg_restore -j with big partitioned table - Mailing list pgsql-general
From | Dimitrios Apostolou |
---|---|
Subject | Re: Performance issues during pg_restore -j with big partitioned table |
Date | |
Msg-id | 15c7f5eb-bd50-ee07-24af-c209694ddca7@gmx.net Whole thread Raw |
In response to | Re: Performance issues during pg_restore -j with big partitioned table (Dimitrios Apostolou <jimis@gmx.net>) |
List | pgsql-general |
For the record, I haven't seen this deadlock again. I guess it was a bug on the master branch that got fixed, because I've been testing later versions a few times. Dimitris On Fri, 4 Apr 2025, Dimitrios Apostolou wrote: > On Wed, 2 Apr 2025, Dimitrios Apostolou wrote: > >> Hello list. >> >> My database includes one table with 1000 partitions, all of them rather > > I was not clear here: my database dump has all that, and the database is > brand new and empty. > >> sizeable. I run: >> >> pg_restore -j12 --no-tablespaces --disable-triggers --exit-on-error >> --no-owner --no-privileges -n public -d newdb custom_format_dump.pgdump >> >> Right now after 24h of restore, I notice weird behaviour, so I have >> several questions about it: >> >> + 11 postgres backend processes are sleeping as "TRUNCATE TABLE waiting". >> I see that they are waiting to issue a TRUNCATE for one of the >> partitions and then COPY data to it. Checking the log I see that >> several partitions have already been copied finished, but many more >> are left to start. >> >> Why is a TRUNCATE needed at the start of a partition's COPY phase? I >> didn't issue a --clean on the command line (I don't need it as my >> database is newly created), and I don't see a mention of related >> TRUNCATE in the pg_restore manual. >> >> + 1 postgres backend process is doing: >> >> ALTER TABLE the_master_partitioned_table >> ADD CONSTRAINT ... >> FOREIGN KEY (columnX) REFERENCES another_table(columnX) >> >> According to my logs this started right after COPY DATA for >> another_table was finished. And apparently it has a lock on >> the_master_partitioned_table that all other TRUNCATE have to wait for. >> >> Is this a bug in the dependency resolution? Wouldn't it make sense for >> this to wait until all 1000 partitions have finished their COPY DATA >> phase? > > Trying again, pg_restore exited with error after almost 24h: > > pg_restore: while PROCESSING TOC: > pg_restore: from TOC entry 8904; 2606 16529 CONSTRAINT test_runs_raw > test_runs_raw_partitioned_pkey > pg_restore: error: could not execute query: ERROR: deadlock detected > DETAIL: Process 465409 waits for AccessExclusiveLock on relation 44437 of > database 44090; blocked by process 465408. > Process 465408 waits for AccessShareLock on relation 44383 of database 44090; > blocked by process 465409. > HINT: See server log for query details. > > From the logs I see that: > > + Process 465409 waits for AccessExclusiveLock on relation 44437 of database > 44090; blocked by process 465408. > --> 44437 is test_runs_raw__part_max10120k (a single partition) > + Process 465408 waits for AccessShareLock on relation 44383 of database > 44090; blocked by process 465409. > --> 44383 is test_runs_raw (the master partitioned table) > > Process 465409: > ALTER TABLE ONLY public.test_runs_raw > ADD CONSTRAINT test_runs_raw_partitioned_pkey PRIMARY KEY (workitem_n, > run_n); > > Process 465408: > COPY public.test_runs_raw__part_max10120k(...) FROM stdin; > > > Bug? This happened on a postgres compiled from last week's master branch. The > dump I'm trying to restore is from postgres 17.4. > > > Thanks > Dimitris > >
pgsql-general by date: