Re: BUG #18354: Aborted transaction aborted during cleanup when temp_file_limit exceeded - Mailing list pgsql-bugs

From Alex Masterov
Subject Re: BUG #18354: Aborted transaction aborted during cleanup when temp_file_limit exceeded
Date
Msg-id CA+8z=zu-mxXuBtnTVNcOzaGJhgeKSTyeAsK+AdRMr5_XZM1SDw@mail.gmail.com
In response to Re: BUG #18354: Aborted transaction aborted during cleanup when temp_file_limit exceeded  (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
List pgsql-bugs

Hi,

 While running tests with Neon, we discovered an assertion failure that can occur during re-entrant AbortTransaction() calls.

 The issue arises when an error is raised during AbortTransaction() after ProcArrayEndTransaction() has already cleared MyProc->xid (e.g., in AtEOXact_Inval()). The PostgresMain error handler then invokes AbortCurrentTransaction() again. The second AbortTransaction() call still sees a valid s->transactionId (CleanupTransaction() hasn't run yet) and passes it to ProcArrayEndTransaction(), which then hits:

   Assert(TransactionIdIsValid(proc->xid))

 because MyProc->xid was already cleared by the first call.

 The attached patch fixes this by checking MyProc->xid validity before calling RecordTransactionAbort() and only passing a valid latestXid when appropriate.
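
 To make the sequence above easier to follow, here is a small standalone toy model in C. It is only a sketch, not PostgreSQL code and not the attached patch: ToyProc, ToyXactState, and the toy_* functions are hypothetical stand-ins for MyProc, the transaction state "s", ProcArrayEndTransaction(), and AbortTransaction(). The model returns false at the point where the real code would trip the assertion, and the "guarded" flag imitates the idea of checking MyProc->xid before passing a latestXid.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: every name here is hypothetical, not PostgreSQL's. */
typedef uint32_t Xid;
#define INVALID_XID ((Xid) 0)

typedef struct { Xid xid; } ToyProc;                 /* stands in for MyProc */
typedef struct { Xid transactionId; } ToyXactState;  /* stands in for "s" */

/*
 * Stand-in for ProcArrayEndTransaction(). Returns false where the real
 * code would fail an assertion about proc->xid, instead of crashing.
 */
static bool
toy_proc_array_end_transaction(ToyProc *proc, Xid latestXid)
{
    if (latestXid != INVALID_XID)
    {
        if (proc->xid == INVALID_XID)
            return false;       /* Assert(TransactionIdIsValid(proc->xid)) */
        proc->xid = INVALID_XID; /* the first abort pass clears the xid */
    }
    else if (proc->xid != INVALID_XID)
        return false;           /* the opposite mismatch is also rejected */
    return true;
}

/*
 * One pass through abort processing. The transaction state is left
 * untouched on purpose: in the scenario described, cleanup has not run
 * yet, so a re-entrant call still sees a valid s->transactionId.
 */
static bool
toy_abort_transaction(ToyXactState *s, ToyProc *proc, bool guarded)
{
    Xid latestXid = s->transactionId;

    /* Assumed shape of the fix: trust the xid only if proc still has one. */
    if (guarded && proc->xid == INVALID_XID)
        latestXid = INVALID_XID;

    return toy_proc_array_end_transaction(proc, latestXid);
}
```

 Calling toy_abort_transaction() twice on the same state with guarded = false succeeds the first time and reports the assertion condition the second time; with guarded = true both calls succeed.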

 **Reproduction:** This can be reproduced reliably using the injection_points extension:

 1. Attach the injection point:
      SELECT injection_points_attach('transaction-end-process-inval', 'error');
 2. Create invalidation messages:
      CREATE TABLE test(id int);
 3. Trigger abort:
      ROLLBACK;

 Without the fix: assertion failure in ProcArrayEndTransaction().
 With the fix applied: the script panics with "ERRORDATA_STACK_SIZE exceeded" due to re-entrant error handling, which shows that the assertion is no longer hit.

 I've included a reproduction script and the fix that clearly shows both behaviors.

 **Files attached:**
 - 0001-xact-Prevent-assertion-failure-in-re-entrant-Abort.patch
 - repro_minimal_panic_if_fixed.sh

 Thoughts?

 Best regards, Alexey





