Re: Adding REPACK [concurrently] - Mailing list pgsql-hackers

From Srinath Reddy Sadipiralla
Subject Re: Adding REPACK [concurrently]
Msg-id CAFC+b6qk3-DQTi43QMqvVLP+sudPV4vsLQm5iHfcCeObrNaVyA@mail.gmail.com
In response to Re: Adding REPACK [concurrently]  (Antonin Houska <ah@cybertec.at>)
List pgsql-hackers
Hello,

While running concurrency tests on the v41 patches, I hit this crash caused by an assertion failure:

TRAP: failed Assert("RelationGetRelid(relation) == ((RepackDecodingState *) ctx->output_writer_private)->relid"), File: "pgoutput_repack.c", Line: 97, PID: 397007
postgres: REPACK decoding worker for relation "stress_victim" (ExceptionalCondition+0x98)[0xaaaad9361698]
/home/srinath/Desktop/pgbuild/lib/postgresql/pgoutput_repack.so(+0xfe8)[0xffff90e00fe8]
postgres: REPACK decoding worker for relation "stress_victim" (+0x679e14)[0xaaaad9049e14]
postgres: REPACK decoding worker for relation "stress_victim" (+0x689cd0)[0xaaaad9059cd0]
postgres: REPACK decoding worker for relation "stress_victim" (+0x68a65c)[0xaaaad905a65c]
postgres: REPACK decoding worker for relation "stress_victim" (+0x68b2f0)[0xaaaad905b2f0]
postgres: REPACK decoding worker for relation "stress_victim" (ReorderBufferCommit+0x74)[0xaaaad905b374]
postgres: REPACK decoding worker for relation "stress_victim" (+0x671ec4)[0xaaaad9041ec4]
postgres: REPACK decoding worker for relation "stress_victim" (xact_decode+0x1a0)[0xaaaad9040edc]
postgres: REPACK decoding worker for relation "stress_victim" (LogicalDecodingProcessRecord+0xd4)[0xaaaad9040a80]
postgres: REPACK decoding worker for relation "stress_victim" (+0x33f558)[0xaaaad8d0f558]
postgres: REPACK decoding worker for relation "stress_victim" (+0x341ccc)[0xaaaad8d11ccc]
postgres: REPACK decoding worker for relation "stress_victim" (RepackWorkerMain+0x1ac)[0xaaaad8d11bd4]
postgres: REPACK decoding worker for relation "stress_victim" (BackgroundWorkerMain+0x2b0)[0xaaaad900d21c]
postgres: REPACK decoding worker for relation "stress_victim" (postmaster_child_launch+0x1f0)[0xaaaad9012070]
postgres: REPACK decoding worker for relation "stress_victim" (+0x64b974)[0xaaaad901b974]
postgres: REPACK decoding worker for relation "stress_victim" (+0x64bc64)[0xaaaad901bc64]
postgres: REPACK decoding worker for relation "stress_victim" (+0x64a3e4)[0xaaaad901a3e4]
postgres: REPACK decoding worker for relation "stress_victim" (+0x647648)[0xaaaad9017648]
postgres: REPACK decoding worker for relation "stress_victim" (PostmasterMain+0x160c)[0xaaaad9016d98]
postgres: REPACK decoding worker for relation "stress_victim" (main+0x3dc)[0xaaaad8ea7a38]
/lib/aarch64-linux-gnu/libc.so.6(+0x284c4)[0xffff9c5c84c4]
/lib/aarch64-linux-gnu/libc.so.6(__libc_start_main+0x98)[0xffff9c5c8598]
postgres: REPACK decoding worker for relation "stress_victim" (_start+0x30)[0xaaaad8abc970]
2026-03-16 19:40:21.622 IST [393820] LOG:  background worker "REPACK decoding worker" (PID 397007) was terminated by signal 6: Aborted
2026-03-16 19:40:21.622 IST [393820] LOG:  terminating any other active server processes
2026-03-16 19:40:21.632 IST [397036] FATAL:  the database system is in recovery mode

This crash happens when REPACK (concurrently) runs on a table while a heavy
pgbench workload is concurrently executing multi-table (setup.sql) transactions (dual_chaos.sql).
It triggers after a few back-to-back REPACK (concurrently) runs.

I think I found the cause of this crash. Some changes slipped past the
change_useless_for_repack filter, so changes unrelated to the relation currently
being processed by REPACK (concurrently) were decoded and added to the reorder buffer queue.

The reason is that repacked_rel_locator.relNumber defaults to InvalidOid. It is set to the
target relation in setup_logical_decoding, but only after DecodingContextFindStartpoint has run.
Inside DecodingContextFindStartpoint, the call path rm_decode -> change_useless_for_repack -> am_decoding_for_repack
still sees repacked_rel_locator.relNumber as InvalidOid, so filtering is skipped even for changes
that do not belong to the target relation. Those changes end up in the reorder buffer queue, and when
the reorder buffer is processed later, plugin_change is called for them and the assertion fails.

I have attached a diff patch to solve this.
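To make the ordering problem concrete, here is a small standalone toy model in C. It is not PostgreSQL code: every toy_* name is hypothetical, the relation numbers are made up, and only the control flow follows the description above (the filter is inactive while the target is still InvalidOid). Setting the target before scanning is just one possible ordering used for illustration and is not necessarily what the attached patch does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t Oid;
#define InvalidOid ((Oid) 0)

/* Toy stand-in for repacked_rel_locator.relNumber (hypothetical). */
static Oid toy_repacked_relnumber = InvalidOid;

/* Models the described check: the filter is active only once a target is set. */
static bool
toy_am_decoding_for_repack(void)
{
    return toy_repacked_relnumber != InvalidOid;
}

/* Returns true when a change may be dropped before it is queued. */
static bool
toy_change_useless_for_repack(Oid change_relnumber)
{
    if (!toy_am_decoding_for_repack())
        return false;           /* filter inactive: nothing is dropped */
    return change_relnumber != toy_repacked_relnumber;
}

/*
 * Scans n changes and counts how many changes of other relations would
 * be queued. With set_target_first == false the target is still
 * InvalidOid during the scan, as in the start-point search described above.
 */
int
toy_count_foreign_queued(const Oid *changes, int n, Oid target,
                         bool set_target_first)
{
    int foreign = 0;

    assert(n >= 0 && target != InvalidOid);
    toy_repacked_relnumber = set_target_first ? target : InvalidOid;

    for (int i = 0; i < n; i++)
    {
        if (toy_change_useless_for_repack(changes[i]))
            continue;           /* filtered out, never queued */
        if (changes[i] != target)
            foreign++;          /* would later trip the relid assertion */
    }
    return foreign;
}
```

In this model, calling toy_count_foreign_queued with set_target_first = false lets every change of another relation through, while set_target_first = true lets none through.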

thoughts?

--
Thanks,
Srinath Reddy Sadipiralla
EDB: https://www.enterprisedb.com/