Re: Race condition between PREPARE TRANSACTION and COMMIT PREPARED (was Re: Problem with txid_snapshot_in/out() functionality) - Mailing list pgsql-hackers
From | Heikki Linnakangas |
---|---|
Subject | Re: Race condition between PREPARE TRANSACTION and COMMIT PREPARED (was Re: Problem with txid_snapshot_in/out() functionality) |
Date | |
Msg-id | 5374CD68.5040304@vmware.com Whole thread Raw |
In response to | Re: Race condition between PREPARE TRANSACTION and COMMIT PREPARED (was Re: Problem with txid_snapshot_in/out() functionality) (Andres Freund <andres@2ndquadrant.com>) |
Responses |
Re: Race condition between PREPARE TRANSACTION and COMMIT PREPARED
(was Re: Problem with txid_snapshot_in/out() functionality)
|
List | pgsql-hackers |
On 05/06/2014 02:44 PM, Andres Freund wrote: > On 2014-05-05 13:41:00 +0300, Heikki Linnakangas wrote: >> +/* >> + * Exit hook to unlock the global transaction entry we're working on. >> + */ >> +static void >> +AtProcExit_Twophase(int code, Datum arg) >> +{ >> + /* same logic as abort */ >> + AtAbort_Twophase(); >> +} >> + >> +/* >> + * Abort hook to unlock the global transaction entry we're working on. >> + */ >> +void >> +AtAbort_Twophase(void) >> +{ >> + if (MyLockedGxact == NULL) >> + return; >> + >> + /* >> + * If we were in process of preparing the transaction, but haven't >> + * written the WAL record yet, remove the global transaction entry. >> + * Same if we are in the process of finishing an already-prepared >> + * transaction, and fail after having already written the WAL 2nd >> + * phase commit or rollback record. >> + * >> + * After that it's too late to abort, so just unlock the GlobalTransaction >> + * entry. We might not have transfered all locks and other state to the >> + * prepared transaction yet, so this is a bit bogus, but it's the best we >> + * can do. >> + */ >> + if (!MyLockedGxact->valid) >> + { >> + RemoveGXact(MyLockedGxact); >> + } >> + else >> + { >> + LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE); >> + >> + MyLockedGxact->locking_backend = InvalidBackendId; >> + >> + LWLockRelease(TwoPhaseStateLock); >> + } >> + MyLockedGxact = NULL; >> +} > > Is it guaranteed that all paths have called LWLockReleaseAll() > before calling the proc exit hooks? Otherwise we might end up waiting > for ourselves... Hmm. AbortTransaction() will release locks before we get here, but the before_shmem_exit() callpath will not. So an elog(FATAL), while holding TwoPhaseStateLock would cause us to deadlock with ourself. But there are no such elogs. I copied this design from async.c, which is quite similar, so if there's a problem that ought to be fixed too. And there are other more complicated before_shmem callbacks that worry me more, like createdb_failure_callback(). But I think they're all all right. >> /* >> * MarkAsPreparing >> @@ -261,29 +329,15 @@ MarkAsPreparing(TransactionId xid, const char *gid, >> errmsg("prepared transactions are disabled"), >> errhint("Set max_prepared_transactions to a nonzero value."))); >> >> - LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE); >> - >> - /* >> - * First, find and recycle any gxacts that failed during prepare. We do >> - * this partly to ensure we don't mistakenly say their GIDs are still >> - * reserved, and partly so we don't fail on out-of-slots unnecessarily. >> - */ >> - for (i = 0; i < TwoPhaseState->numPrepXacts; i++) >> + /* on first call, register the exit hook */ >> + if (!twophaseExitRegistered) >> { >> - gxact = TwoPhaseState->prepXacts[i]; >> - if (!gxact->valid && !TransactionIdIsActive(gxact->locking_xid)) >> - { >> - /* It's dead Jim ... remove from the active array */ >> - TwoPhaseState->numPrepXacts--; >> - TwoPhaseState->prepXacts[i] = TwoPhaseState->prepXacts[TwoPhaseState->numPrepXacts]; >> - /* and put it back in the freelist */ >> - gxact->next = TwoPhaseState->freeGXacts; >> - TwoPhaseState->freeGXacts = gxact; >> - /* Back up index count too, so we don't miss scanning one */ >> - i--; >> - } >> + before_shmem_exit(AtProcExit_Twophase, 0); >> + twophaseExitRegistered = true; >> } > > It's not particularly nice to register shmem exit hooks in the middle of > normal processing because it makes it impossible to use > cancel_before_shmem_exit() previously registered hooks. I think this > should be registered at startup, if max_prepared_xacts > 0. <shrug>. async.c and namespace.c does the same, and it hasn't been a problem. I committed this now, but please let me know if you see a concrete problem with the locks. - Heikki
pgsql-hackers by date: