Thread: Performance issues during pg_restore -j with big partitioned table
Hello list. My database includes one table with 1000 partitions, all of them rather sizeable. I run: pg_restore -j12 --no-tablespaces --disable-triggers --exit-on-error --no-owner --no-privileges -n public -d newdb custom_format_dump.pgdump Right now after 24h of restore, I notice weird behaviour, so I have several questions about it: + 11 postgres backend processes are sleeping as "TRUNCATE TABLE waiting". I see that they are waiting to issue a TRUNCATE for one of the partitions and then COPY data to it. Checking the log I see that several partitions have already been copied finished, but many more are left to start. Why is a TRUNCATE needed at the start of a partition's COPY phase? I didn't issue a --clean on the command line (I don't need it as my database is newly created), and I don't see a mention of related TRUNCATE in the pg_restore manual. + 1 postgres backend process is doing: ALTER TABLE the_master_partitioned_table ADD CONSTRAINT ... FOREIGN KEY (columnX) REFERENCES another_table(columnX) According to my logs this started right after COPY DATA for another_table was finished. And apparently it has a lock on the_master_partitioned_table that all other TRUNCATE have to wait for. Is this a bug in the dependency resolution? Wouldn't it make sense for this to wait until all 1000 partitions have finished their COPY DATA phase? + Investigating why the above ALTER TABLE takes so long, I notice that it is issuing a lot of writes to the WAL. Digging deeper shows a lot of time spent in SetHintBits(). Is there a way to avoid that in a clean pg_restore? Thanks in advance, Dimitris
On 4/2/25 10:32 AM, Dimitrios Apostolou wrote: > Hello list. > > My database includes one table with 1000 partitions, all of them rather > sizeable. I run: > > pg_restore -j12 --no-tablespaces --disable-triggers --exit-on-error > --no-owner --no-privileges -n public -d newdb custom_format_dump.pgdump > > Right now after 24h of restore, I notice weird behaviour, so I have > several questions about it: > > + 11 postgres backend processes are sleeping as "TRUNCATE TABLE waiting". > I see that they are waiting to issue a TRUNCATE for one of the > partitions and then COPY data to it. Checking the log I see that > several partitions have already been copied finished, but many more > are left to start. > > Why is a TRUNCATE needed at the start of a partition's COPY phase? I > didn't issue a --clean on the command line (I don't need it as my > database is newly created), and I don't see a mention of related > TRUNCATE in the pg_restore manual. --clean will drop the object entirely not TRUNCATE. I'm guessing that this is being done by you per: https://www.postgresql.org/message-id/53760c70-4a87-a453-9e02-57abc9cb2e54%40gmx.net "After each failed attempt, I need to issue a TRUNCATE table1,table2,... before I try again. " > > > Thanks in advance, > Dimitris > > -- Adrian Klaver adrian.klaver@aklaver.com
On 4/2/25 10:39 AM, Adrian Klaver wrote: > > --clean will drop the object entirely not TRUNCATE. > > I'm guessing that this is being done by you per: > > https://www.postgresql.org/message-id/53760c70-4a87-a453-9e02-57abc9cb2e54%40gmx.net > > "After each failed attempt, I need to issue a TRUNCATE table1,table2,... > before I try again. " Oops, forgot to engage brain. From pg_backup_archiver.c: * In parallel restore, if we created the table earlier in * this run (so that we know it is empty) and we are not * restoring a load-via-partition-root data item then we * wrap the COPY in a transaction and precede it with a * TRUNCATE. If wal_level is set to minimal this prevents * WAL-logging the COPY. This obtains a speedup similar * to that from using single_txn mode in non-parallel * restores. * * We mustn't do this for load-via-partition-root cases * because some data might get moved across partition * boundaries, risking deadlock and/or loss of previously * loaded data. (We assume that all partitions of a * partitioned table will be treated the same way.) > >> > >> >> Thanks in advance, >> Dimitris >> >> > -- Adrian Klaver adrian.klaver@aklaver.com
Re: Performance issues during pg_restore -j with big partitioned table
From
Dimitrios Apostolou
Date:
On Wed, 2 Apr 2025, Adrian Klaver wrote: > > > On 4/2/25 10:32 AM, Dimitrios Apostolou wrote: >> Hello list. >> >> My database includes one table with 1000 partitions, all of them rather >> sizeable. I run: >> >> pg_restore -j12 --no-tablespaces --disable-triggers --exit-on-error >> --no-owner --no-privileges -n public -d newdb custom_format_dump.pgdump >> >> Right now after 24h of restore, I notice weird behaviour, so I have >> several questions about it: >> >> + 11 postgres backend processes are sleeping as "TRUNCATE TABLE waiting". >> I see that they are waiting to issue a TRUNCATE for one of the >> partitions and then COPY data to it. Checking the log I see that >> several partitions have already been copied finished, but many more >> are left to start. >> >> Why is a TRUNCATE needed at the start of a partition's COPY phase? I >> didn't issue a --clean on the command line (I don't need it as my >> database is newly created), and I don't see a mention of related >> TRUNCATE in the pg_restore manual. > > --clean will drop the object entirely not TRUNCATE. > > I'm guessing that this is being done by you per: > > https://www.postgresql.org/message-id/53760c70-4a87-a453-9e02-57abc9cb2e54%40gmx.net > > "After each failed attempt, I need to issue a TRUNCATE table1,table2,... > before I try again. " Thanks Adrian. I'm now testing restore without --data-only. All I'm doing prior to the above pg_restore command is "createdb -T template0 newdb". It's possible though that I'm missing something here, the whole thing is way more complicated than I expected... Dimitris
Re: Performance issues during pg_restore -j with big partitioned table
From
Dimitrios Apostolou
Date:
On Wed, 2 Apr 2025, Adrian Klaver wrote: > > > On 4/2/25 10:39 AM, Adrian Klaver wrote: >> > >> --clean will drop the object entirely not TRUNCATE. >> >> I'm guessing that this is being done by you per: >> >> https://www.postgresql.org/message-id/53760c70-4a87-a453-9e02-57abc9cb2e54%40gmx.net >> >> "After each failed attempt, I need to issue a TRUNCATE table1,table2,... >> before I try again. " > > Oops, forgot to engage brain. > > From pg_backup_archiver.c: > > * In parallel restore, if we created the table earlier in > * this run (so that we know it is empty) and we are not > * restoring a load-via-partition-root data item then we > * wrap the COPY in a transaction and precede it with a > * TRUNCATE. If wal_level is set to minimal this prevents > * WAL-logging the COPY. This obtains a speedup similar > * to that from using single_txn mode in non-parallel > * restores. This makes sense. It creates the table earlier, and then truncates it just before copying data into it. I wonder if this really circumvents the WAL since I don't have --single-transaction (incompatible to -j). Dimitris
Re: Performance issues during pg_restore -j with big partitioned table
From
Dimitrios Apostolou
Date:
On Wed, 2 Apr 2025, Dimitrios Apostolou wrote: > Hello list. > > My database includes one table with 1000 partitions, all of them rather I was not clear here: my database dump has all that, and the database is brand new and empty. > sizeable. I run: > > pg_restore -j12 --no-tablespaces --disable-triggers --exit-on-error > --no-owner --no-privileges -n public -d newdb custom_format_dump.pgdump > > Right now after 24h of restore, I notice weird behaviour, so I have several > questions about it: > > + 11 postgres backend processes are sleeping as "TRUNCATE TABLE waiting". > I see that they are waiting to issue a TRUNCATE for one of the > partitions and then COPY data to it. Checking the log I see that > several partitions have already been copied finished, but many more > are left to start. > > Why is a TRUNCATE needed at the start of a partition's COPY phase? I > didn't issue a --clean on the command line (I don't need it as my > database is newly created), and I don't see a mention of related > TRUNCATE in the pg_restore manual. > > + 1 postgres backend process is doing: > > ALTER TABLE the_master_partitioned_table > ADD CONSTRAINT ... > FOREIGN KEY (columnX) REFERENCES another_table(columnX) > > According to my logs this started right after COPY DATA for > another_table was finished. And apparently it has a lock on > the_master_partitioned_table that all other TRUNCATE have to wait for. > > Is this a bug in the dependency resolution? Wouldn't it make sense for > this to wait until all 1000 partitions have finished their COPY DATA > phase? Trying again, pg_restore exited with error after almost 24h: pg_restore: while PROCESSING TOC: pg_restore: from TOC entry 8904; 2606 16529 CONSTRAINT test_runs_raw test_runs_raw_partitioned_pkey pg_restore: error: could not execute query: ERROR: deadlock detected DETAIL: Process 465409 waits for AccessExclusiveLock on relation 44437 of database 44090; blocked by process 465408. Process 465408 waits for AccessShareLock on relation 44383 of database 44090; blocked by process 465409. HINT: See server log for query details. From the logs I see that: + Process 465409 waits for AccessExclusiveLock on relation 44437 of database 44090; blocked by process 465408. --> 44437 is test_runs_raw__part_max10120k (a single partition) + Process 465408 waits for AccessShareLock on relation 44383 of database 44090; blocked by process 465409. --> 44383 is test_runs_raw (the master partitioned table) Process 465409: ALTER TABLE ONLY public.test_runs_raw ADD CONSTRAINT test_runs_raw_partitioned_pkey PRIMARY KEY (workitem_n, run_n); Process 465408: COPY public.test_runs_raw__part_max10120k(...) FROM stdin; Bug? This happened on a postgres compiled from last week's master branch. The dump I'm trying to restore is from postgres 17.4. Thanks Dimitris
On 4/4/25 06:13, Dimitrios Apostolou wrote: > On Wed, 2 Apr 2025, Dimitrios Apostolou wrote: > > Bug? This happened on a postgres compiled from last week's master branch. Are you talking about the dev version? > The dump I'm trying to restore is from postgres 17.4. > > > Thanks > Dimitris > > > -- Adrian Klaver adrian.klaver@aklaver.com
Re: Performance issues during pg_restore -j with big partitioned table
From
Dimitrios Apostolou
Date:
On Fri, 4 Apr 2025, Adrian Klaver wrote: > On 4/4/25 06:13, Dimitrios Apostolou wrote: >> On Wed, 2 Apr 2025, Dimitrios Apostolou wrote: >> > >> Bug? This happened on a postgres compiled from last week's master branch. > > Are you talking about the dev version? In this thread, yes. My mistake I didn't mention it from the start. I compiled it in order to see if the problems I mentioned in my previous thread were fixed. Dimitris
On 4/4/25 08:09, Dimitrios Apostolou wrote: > On Fri, 4 Apr 2025, Adrian Klaver wrote: > >> On 4/4/25 06:13, Dimitrios Apostolou wrote: >>> On Wed, 2 Apr 2025, Dimitrios Apostolou wrote: >>> >> >>> Bug? This happened on a postgres compiled from last week's master >>> branch. >> >> Are you talking about the dev version? > > In this thread, yes. My mistake I didn't mention it from the start. I > compiled it in order to see if the problems I mentioned in my previous > thread were fixed. Since it still has not established that an actual problem exists in previous versions, this is premature. Especially as dev has not even reached alpha AFAIK. To get at a solution more detailed information is needed. I would also suggest testing using something less the 1000 large partitions. Information needed for any given report: 1) Postgres version being dumped from. 2) The complete pg_dump command. 3) The Postgres version being restored to. 4) The complete pg_restore command. 5) The error messages. Also have you tried a schema only dump/restore? > > > Dimitris > -- Adrian Klaver adrian.klaver@aklaver.com
On Wed, Apr 2, 2025 at 1:32 PM Dimitrios Apostolou <jimis@gmx.net> wrote:
Hello list.
My database includes one table with 1000 partitions, all of them rather
sizeable. I run:
pg_restore -j12 --no-tablespaces --disable-triggers --exit-on-error --no-owner --no-privileges -n public -d newdb custom_format_dump.pgdump
Right now after 24h of restore, I notice weird behaviour, so I have
several questions about it:
+ 11 postgres backend processes are sleeping as "TRUNCATE TABLE waiting".
I see that they are waiting to issue a TRUNCATE for one of the
partitions and then COPY data to it. Checking the log I see that
several partitions have already been copied finished, but many more
are left to start.
Why is a TRUNCATE needed at the start of a partition's COPY phase? I
didn't issue a --clean on the command line (I don't need it as my
database is newly created), and I don't see a mention of related
TRUNCATE in the pg_restore manual.
TRUNCATE statements inside of "toc.dat" files? I'm skeptical.
Are you maybe doing something else in that database besides pg_restore?
Death to <Redacted>, and butter sauce.
Don't boil me, I'm still alive.
<Redacted> lobster!
On 4/4/25 9:45 AM, Ron Johnson wrote: > > TRUNCATE statements inside of "toc.dat" files? I'm skeptical. See my post here: https://www.postgresql.org/message-id/7be2dcc6-3ba4-4e3f-a154-8d13d816aa9b%40aklaver.com -- Adrian Klaver adrian.klaver@aklaver.com