Re: Problem with txid_snapshot_in/out() functionality - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: Problem with txid_snapshot_in/out() functionality
Date
Msg-id 534BA732.20203@vmware.com
Whole thread Raw
In response to Re: Problem with txid_snapshot_in/out() functionality  (Andres Freund <andres@2ndquadrant.com>)
Responses Re: Problem with txid_snapshot_in/out() functionality  (Andres Freund <andres@2ndquadrant.com>)
List pgsql-hackers
On 04/12/2014 05:03 PM, Andres Freund wrote:
> On 2014-04-12 09:47:24 -0400, Tom Lane wrote:
>> Heikki Linnakangas <hlinnakangas@vmware.com> writes:
>>> On 04/12/2014 12:07 AM, Jan Wieck wrote:
>>>> the Slony team has been getting seldom reports of a problem with the
>>>> txid_snapshot data type.
>>>> The symptom is that a txid_snapshot on output lists the same txid
>>>> multiple times in the xip list part of the external representation.
>>
>>> It's two-phase commit. When preparing a transaction, the state of the
>>> transaction is first transfered to a dummy PGXACT entry, and then the
>>> PGXACT entry of the backend is cleared. There is a transient state when
>>> both PGXACT entries have the same xid.
>>
>> Hm, yeah, but why is that intermediate state visible to anyone else?
>> Don't we have exclusive lock on the PGPROC array while we're doing this?
>
> It's done outside the remit of ProcArray lock :(. And documented to lead
> to duplicate xids in PGXACT.
> EndPrepare():
>     /*
>      * Mark the prepared transaction as valid.    As soon as xact.c marks
>      * MyPgXact as not running our XID (which it will do immediately after
>      * this function returns), others can commit/rollback the xact.
>      *
>      * NB: a side effect of this is to make a dummy ProcArray entry for the
>      * prepared XID.  This must happen before we clear the XID from MyPgXact,
>      * else there is a window where the XID is not running according to
>      * TransactionIdIsInProgress, and onlookers would be entitled to assume
>      * the xact crashed.  Instead we have a window where the same XID appears
>      * twice in ProcArray, which is OK.
>      */
>     MarkAsPrepared(gxact);
>
> It doesn't sound too hard to essentially move PrepareTransaction()'s
> ProcArrayClearTransaction() into MarkAsPrepared() and rejigger the
> locking to remove the intermediate state. But I think it'll lead to
> bigger changes than we'd be comfortable backpatching.

Hmm. There's a field in GlobalTransactionData called locking_xid, which 
is used to mark the XID of the transaction that's currently operating on 
the prepared transaction. At prepare, that ensures that the transaction 
cannot be committed or rolled back by another backend until the original 
backend has cleared its PGPROC entry. At COMMIT/ROLLBACK PREPARED, it 
ensures that only one backend can commit/rollback the transaction.

I wonder why we don't use a VirtualTransactionId there. AFAICS there is 
no reason for COMMIT/ROLLBACK PREPARED to be assigned an XID of its own. 
And if we used a VirtualTransactionId there, prepare could clear the xid 
field of the PGPROC entry earlier.

- Heikki



pgsql-hackers by date:

Previous
From: Etsuro Fujita
Date:
Subject: Minor improvements in create_foreign_table.sgml
Next
From: Andres Freund
Date:
Subject: Re: Problem with txid_snapshot_in/out() functionality