Re: Prepared transaction releasing locks before deregistering its GID - Mailing list pgsql-hackers

From Michael Paquier
Subject Re: Prepared transaction releasing locks before deregistering its GID
Date
Msg-id 20190221055431.GO15532@paquier.xyz
Whole thread Raw
In response to Re: Prepared transaction releasing locks before deregistering its GID  (Oleksii Kliukin <alexk@hintbits.com>)
Responses Re: Prepared transaction releasing locks before deregistering its GID  (Oleksii Kliukin <alexk@hintbits.com>)
List pgsql-hackers
On Wed, Feb 20, 2019 at 11:50:42AM +0100, Oleksii Kliukin wrote:
> RecoverPreparedTransactions calls ProcessRecords with
> twophase_recover_callbacks right after releasing the TwoPhaseStateLock, so I
> think lock_held should be false here (matching the similar call of
> TwoPhaseGetDummyProc at lock_twophase_recover).

Indeed.  That would be a bad idea.  We don't actually stress this
routine in 009_twophase.pl or the assertion I added would have blown
up immediately.  So I propose to close the gap, and the updated patch
attached adds dedicated tests causing the problem you spotted to be
triggered (soft and hard restarts with 2PC transactions including
DDLs).  The previous version of the patch and those tests cause the
assertion to blow up, failing the tests.

> Since the patch touches TwoPhaseGetDummyBackendId, let’s fix the comment
> header to mention it instead of TwoPhaseGetDummyProc

Not sure I understand the issue you are pointing out here.  The patch
adds an extra argument to both TwoPhaseGetDummyProc() and
TwoPhaseGetDummyBackendId(), and the headers of both functions
document the new argument.

One extra trick I have been using for testing this patch is the
following:
--- a/src/backend/access/transam/twophase.c
+++ b/src/backend/access/transam/twophase.c
@@ -816,13 +816,6 @@ TwoPhaseGetGXact(TransactionId xid, bool lock_held)

    Assert(!lock_held || LWLockHeldByMe(TwoPhaseStateLock));

-   /*
-    * During a recovery, COMMIT PREPARED, or ABORT PREPARED, we'll be called
-    * repeatedly for the same XID.  We can save work with a simple cache.
-    */
-   if (xid == cached_xid)
-       return cached_gxact;

This has the advantage to check more aggressively for lock conflicts,
causing the tests to deadlock if the flag is not correctly set in the
worst case scenario.
--
Michael

Attachment

pgsql-hackers by date:

Previous
From: "Ideriha, Takeshi"
Date:
Subject: RE: Protect syscache from bloating with negative cache entries
Next
From: "Tsunakawa, Takayuki"
Date:
Subject: RE: Timeout parameters