Re: Adding REPACK [concurrently] - Mailing list pgsql-hackers

| From | Srinath Reddy Sadipiralla |
|---|---|
| Subject | Re: Adding REPACK [concurrently] |
| Msg-id | CAFC+b6o2yzA80YmfEhmMO9puN8qvGRvr-15BBLn3UmJxPfpr2w@mail.gmail.com |
| In response to | Re: Adding REPACK [concurrently] (Antonin Houska <ah@cybertec.at>) |
| Responses | Re: Adding REPACK [concurrently] |
| List | pgsql-hackers |
Hello,
I did stress testing on the v35 patches: a concurrency test using pgbench with
50 concurrent clients and 4 threads, running the pgbench script below
(dual_chaos.sql) against the table setup in setup.sql.
I ran pgbench multiple times, with 5M rows for 10 minutes and with 50M rows for
~45 minutes. REPACK (concurrently) ran successfully except once (see below).
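For reference, the run described above would look roughly like this; setup.sql and dual_chaos.sql are the attached scripts (their contents are not reproduced here), and the database name is only an example:

```shell
# Sketch of the stress-test run (5M-row case): the exact scripts are in
# the attachments; flags reconstructed from the numbers stated above.
psql -f setup.sql postgres                                  # create and populate tables
pgbench -n -c 50 -j 4 -T 600 -f dual_chaos.sql postgres     # 50 clients, 4 threads, 10 min
```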
I created a shadow/clone table to use for checking the correctness after doing
the concurrency test. I used four checks to verify that the data is intact and
that REPACK (concurrently) ran successfully:
1) Was the table's relation file (relfilenode) swapped?
2) Is the bloat gone? The victim relation's size should be less than the
shadow relation's size.
3) Using FULL JOIN logic (borrowed from repack.spec, with a small change)
against the shadow table, which undergoes the same concurrent operations as the
victim table, basically doing dual writes (see dual_chaos.sql), to verify
table data integrity.
4) Physical Index Integrity (amcheck) (borrowed from Mihail's tests)
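The four checks above can be sketched roughly as follows. This is only a sketch: the table names stress_shadow, the join column id, and the index name stress_victim_pkey are assumptions (the victim table name stress_victim comes from the backtrace below; the exact queries are in the attached scripts):

```sql
-- 1) relfilenode swap: compare against the value saved before REPACK ran
SELECT relfilenode FROM pg_class WHERE relname = 'stress_victim';

-- 2) bloat gone: the repacked victim should now be smaller than the
--    dual-written (still bloated) shadow table
SELECT pg_relation_size('stress_victim') < pg_relation_size('stress_shadow');

-- 3) data integrity: a FULL JOIN between victim and shadow must yield no
--    row that exists on only one side (logic borrowed from repack.spec)
SELECT count(*)
FROM stress_victim v
FULL JOIN stress_shadow s USING (id)
WHERE v.id IS NULL OR s.id IS NULL;   -- expect 0

-- 4) physical index integrity via amcheck
CREATE EXTENSION IF NOT EXISTS amcheck;
SELECT bt_index_parent_check('stress_victim_pkey', heapallindexed => true);
```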
The concurrency test failed once. I tried to reproduce the scenario below, but
with no luck. I think the assert failure happened because, after a speculative
insert, there might have been no spec CONFIRM or ABORT record. Thoughts?
TRAP: failed Assert("!specinsert"), File: "reorderbuffer.c", Line: 2610, PID: 3956168
postgres: REPACK decoding worker for relation "stress_victim" (ExceptionalCondition+0x98)[0xaaaab1251188]
postgres: REPACK decoding worker for relation "stress_victim" (+0x67b1cc)[0xaaaab0f4b1cc]
postgres: REPACK decoding worker for relation "stress_victim" (+0x67b86c)[0xaaaab0f4b86c]
postgres: REPACK decoding worker for relation "stress_victim" (ReorderBufferCommit+0x74)[0xaaaab0f4b8f0]
postgres: REPACK decoding worker for relation "stress_victim" (+0x66229c)[0xaaaab0f3229c]
postgres: REPACK decoding worker for relation "stress_victim" (xact_decode+0x1a0)[0xaaaab0f312bc]
postgres: REPACK decoding worker for relation "stress_victim" (LogicalDecodingProcessRecord+0xd4)[0xaaaab0f30e60]
postgres: REPACK decoding worker for relation "stress_victim" (+0x3372e4)[0xaaaab0c072e4]
postgres: REPACK decoding worker for relation "stress_victim" (+0x339634)[0xaaaab0c09634]
postgres: REPACK decoding worker for relation "stress_victim" (RepackWorkerMain+0x1ac)[0xaaaab0c094e8]
postgres: REPACK decoding worker for relation "stress_victim" (BackgroundWorkerMain+0x2b0)[0xaaaab0efc440]
postgres: REPACK decoding worker for relation "stress_victim" (postmaster_child_launch+0x1f0)[0xaaaab0f00398]
postgres: REPACK decoding worker for relation "stress_victim" (+0x639ca4)[0xaaaab0f09ca4]
postgres: REPACK decoding worker for relation "stress_victim" (+0x639f94)[0xaaaab0f09f94]
postgres: REPACK decoding worker for relation "stress_victim" (+0x638714)[0xaaaab0f08714]
postgres: REPACK decoding worker for relation "stress_victim" (+0x635978)[0xaaaab0f05978]
postgres: REPACK decoding worker for relation "stress_victim" (PostmasterMain+0x160c)[0xaaaab0f050c8]
postgres: REPACK decoding worker for relation "stress_victim" (main+0x3dc)[0xaaaab0d974d4]
/lib/aarch64-linux-gnu/libc.so.6(+0x284c4)[0xffff867584c4]
/lib/aarch64-linux-gnu/libc.so.6(__libc_start_main+0x98)[0xffff86758598]
postgres: REPACK decoding worker for relation "stress_victim" (_start+0x30)[0xaaaab09bc1f0]
2026-02-19 18:20:56.088 IST [3905812] LOG: checkpoint starting: wal
2026-02-19 18:21:10.683 IST [3905808] LOG: background worker "REPACK decoding worker" (PID 3956168) was terminated by signal 6: Aborted
Crash Test:
I did a crash test using a debugger, with a breakpoint inside apply_concurrent_changes
to simulate a crash while concurrent changes are being applied. After a few concurrent
changes were applied, I crashed the server using "pg_ctl -m immediate stop" and then
restarted it. I observed that REPACK (concurrently) did not complete (expected): the
files were not swapped, and the data in the victim table is intact (checked using a
FULL JOIN with the shadow table). However, there are some leftovers from the transient
table used by REPACK (concurrently), such as
1) the transient table's relation files - these consume extra space. I think this was
the case with VACUUM FULL previously, so they have to be removed manually, but
I think this time we have a "leverage" which we can use to reclaim the extra space.
2) the transient table's WAL - this is generated by the concurrent changes that are
applied, via logical decoding, to the new transient table. I think this won't be a
problem as long as those WAL segments only get recycled, but if they get archived they
are of no use; instead they consume extra space and time during the archival process.
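The crash test above can be sketched as below. This is only an illustration: the REPACK invocation syntax, the way the worker PID is found, and the assumption that apply_concurrent_changes (from the v35 patches) runs in that process are all hypothetical:

```shell
# In one session, start the repack (hypothetical invocation):
psql -c "REPACK (CONCURRENTLY) stress_victim;" postgres &

# Attach a debugger and stop inside the change-application loop:
gdb -p "$(pgrep -f 'REPACK.*stress_victim')" \
    -ex 'break apply_concurrent_changes' -ex 'continue'

# After letting a few concurrent changes apply, simulate a hard crash:
pg_ctl -D "$PGDATA" stop -m immediate
pg_ctl -D "$PGDATA" start   # crash recovery runs; transient-table leftovers remain
```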
"Leverage" idea:
I think we can reuse the transient table's relation files and WAL during crash
recovery, so that the user doesn't have to re-run REPACK (concurrently) after the
server has recovered. For this we might need to write a WAL record for
REPACK (concurrently) to let the startup process know that a REPACK (concurrently)
was in progress, which sets a flag. By the end of the startup process, all the WAL
for the transient table has already been applied, so the transient table is complete;
at that point, after checking the flag, we can do the swap (finish_heap_swap).
These are my initial thoughts on reusing the "residue" files of the transient table.
I could be totally wrong :) Please correct me if I am.
I think we need to update this statement in repack.sgml regarding wal_level:
<listitem>
<para>
The <link linkend="guc-wal-level"><varname>wal_level</varname></link>
configuration parameter is less than <literal>logical</literal>.
</para>
</listitem>
because of this commit: "POC: enable logical decoding when wal_level = 'replica' without a server restart" (67c2097).