Re: Race condition between PREPARE TRANSACTION and COMMIT PREPARED (was Re: Problem with txid_snapshot_in/out() functionality) - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: Race condition between PREPARE TRANSACTION and COMMIT PREPARED (was Re: Problem with txid_snapshot_in/out() functionality)
Date
Msg-id 5374CD68.5040304@vmware.com
Whole thread Raw
In response to Re: Race condition between PREPARE TRANSACTION and COMMIT PREPARED (was Re: Problem with txid_snapshot_in/out() functionality)  (Andres Freund <andres@2ndquadrant.com>)
Responses Re: Race condition between PREPARE TRANSACTION and COMMIT PREPARED (was Re: Problem with txid_snapshot_in/out() functionality)
List pgsql-hackers
On 05/06/2014 02:44 PM, Andres Freund wrote:
> On 2014-05-05 13:41:00 +0300, Heikki Linnakangas wrote:
>> +/*
>> + * Exit hook to unlock the global transaction entry we're working on.
>> + */
>> +static void
>> +AtProcExit_Twophase(int code, Datum arg)
>> +{
>> +    /* same logic as abort */
>> +    AtAbort_Twophase();
>> +}
>> +
>> +/*
>> + * Abort hook to unlock the global transaction entry we're working on.
>> + */
>> +void
>> +AtAbort_Twophase(void)
>> +{
>> +    if (MyLockedGxact == NULL)
>> +        return;
>> +
>> +    /*
>> +     * If we were in process of preparing the transaction, but haven't
>> +     * written the WAL record yet, remove the global transaction entry.
>> +     * Same if we are in the process of finishing an already-prepared
>> +     * transaction, and fail after having already written the WAL 2nd
>> +     * phase commit or rollback record.
>> +     *
>> +     * After that it's too late to abort, so just unlock the GlobalTransaction
>> +     * entry.  We might not have transfered all locks and other state to the
>> +     * prepared transaction yet, so this is a bit bogus, but it's the best we
>> +     * can do.
>> +     */
>> +    if (!MyLockedGxact->valid)
>> +    {
>> +        RemoveGXact(MyLockedGxact);
>> +    }
>> +    else
>> +    {
>> +        LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
>> +
>> +        MyLockedGxact->locking_backend = InvalidBackendId;
>> +
>> +        LWLockRelease(TwoPhaseStateLock);
>> +    }
>> +    MyLockedGxact = NULL;
>> +}
>
> Is it guaranteed that all paths have called LWLockReleaseAll()
> before calling the proc exit hooks? Otherwise we might end up waiting
> for ourselves...

Hmm. AbortTransaction() will release locks before we get here, but the 
before_shmem_exit() callpath will not. So an elog(FATAL), while holding 
TwoPhaseStateLock would cause us to deadlock with ourself. But there are 
no such elogs.

I copied this design from async.c, which is quite similar, so if there's 
a problem that ought to be fixed too. And there are other more 
complicated before_shmem callbacks that worry me more, like 
createdb_failure_callback(). But I think they're all all right.

>>   /*
>>    * MarkAsPreparing
>> @@ -261,29 +329,15 @@ MarkAsPreparing(TransactionId xid, const char *gid,
>>                    errmsg("prepared transactions are disabled"),
>>                 errhint("Set max_prepared_transactions to a nonzero value.")));
>>
>> -    LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
>> -
>> -    /*
>> -     * First, find and recycle any gxacts that failed during prepare. We do
>> -     * this partly to ensure we don't mistakenly say their GIDs are still
>> -     * reserved, and partly so we don't fail on out-of-slots unnecessarily.
>> -     */
>> -    for (i = 0; i < TwoPhaseState->numPrepXacts; i++)
>> +    /* on first call, register the exit hook */
>> +    if (!twophaseExitRegistered)
>>       {
>> -        gxact = TwoPhaseState->prepXacts[i];
>> -        if (!gxact->valid && !TransactionIdIsActive(gxact->locking_xid))
>> -        {
>> -            /* It's dead Jim ... remove from the active array */
>> -            TwoPhaseState->numPrepXacts--;
>> -            TwoPhaseState->prepXacts[i] = TwoPhaseState->prepXacts[TwoPhaseState->numPrepXacts];
>> -            /* and put it back in the freelist */
>> -            gxact->next = TwoPhaseState->freeGXacts;
>> -            TwoPhaseState->freeGXacts = gxact;
>> -            /* Back up index count too, so we don't miss scanning one */
>> -            i--;
>> -        }
>> +        before_shmem_exit(AtProcExit_Twophase, 0);
>> +        twophaseExitRegistered = true;
>>       }
>
> It's not particularly nice to register shmem exit hooks in the middle of
> normal processing because it makes it impossible to use
> cancel_before_shmem_exit() previously registered hooks. I think this
> should be registered at startup, if max_prepared_xacts > 0.

<shrug>. async.c and namespace.c does the same, and it hasn't been a 
problem.

I committed this now, but please let me know if you see a concrete 
problem with the locks.

- Heikki



pgsql-hackers by date:

Previous
From: Heikki Linnakangas
Date:
Subject: Re: popen and pclose redefinitions causing many warning in Windows build
Next
From: Bruce Momjian
Date:
Subject: Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers