Hot standby, race condition between recovery snapshot and commit - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Hot standby, race condition between recovery snapshot and commit
Msg-id 4AFEA9A4.5060808@enterprisedb.com
List pgsql-hackers
There's a race condition between transaction commit and
GetRunningTransactionData(). If GetRunningTransactionData() runs between
the RecordTransactionCommit() and ProcArrayEndTransaction() calls in
CommitTransaction():

>     /*
>      * Here is where we really truly commit.
>      */
>     latestXid = RecordTransactionCommit(false);
> 
>     TRACE_POSTGRESQL_TRANSACTION_COMMIT(MyProc->lxid);
> 
>     /*
>      * Let others know about no transaction in progress by me. Note that this
>      * must be done _before_ releasing locks we hold and _after_
>      * RecordTransactionCommit.
>      */
>     ProcArrayEndTransaction(MyProc, latestXid);

The running-xacts snapshot will include the transaction that's just
committing, but the commit record will be before the running-xacts WAL
record. If the standby initializes transaction tracking from that
running-xacts record, it will consider the just-committed transaction
as still in-progress until the next running-xacts record (at the next
checkpoint).
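
To make the ordering problem concrete, here is a toy model of the standby
side. It is only a sketch and not PostgreSQL code: WalRecord,
StandbyTracking, standby_replay() and standby_thinks_running() are
hypothetical names invented for the illustration, and the WAL is reduced
to two record types.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t TransactionId;

#define MAX_XIDS 4

/* Toy WAL: only commit records and running-xacts records (assumption). */
typedef enum { REC_COMMIT, REC_RUNNING_XACTS } RecType;

typedef struct {
    RecType       type;
    TransactionId xid;                /* used by REC_COMMIT */
    TransactionId running[MAX_XIDS];  /* used by REC_RUNNING_XACTS */
    size_t        nrunning;
} WalRecord;

/* Hypothetical standby-side set of transactions believed to be running. */
typedef struct {
    int           initialized;
    TransactionId xids[MAX_XIDS];
    size_t        n;
} StandbyTracking;

/* Replay one record, in WAL order. */
void standby_replay(StandbyTracking *st, const WalRecord *rec)
{
    if (rec->type == REC_RUNNING_XACTS) {
        if (!st->initialized) {
            /* Tracking starts from whatever the running-xacts record lists. */
            assert(rec->nrunning <= MAX_XIDS);
            for (size_t i = 0; i < rec->nrunning; i++)
                st->xids[i] = rec->running[i];
            st->n = rec->nrunning;
            st->initialized = 1;
        }
    } else if (st->initialized) {
        /* A commit replayed after initialization removes the xid. */
        size_t kept = 0;
        for (size_t i = 0; i < st->n; i++)
            if (st->xids[i] != rec->xid)
                st->xids[kept++] = st->xids[i];
        st->n = kept;
    }
    /* A commit replayed before initialization has no tracking to update. */
}

int standby_thinks_running(const StandbyTracking *st, TransactionId xid)
{
    for (size_t i = 0; i < st->n; i++)
        if (st->xids[i] == xid)
            return 1;
    return 0;
}
```

Replaying "commit 101", then "running-xacts {100, 101}", then "commit 100"
through this model leaves 101 in the tracked set: its commit record came
before tracking was initialized, so nothing ever removes it.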

I can't see any obvious way around that. We could have transaction
commit acquire the new RecoveryInfoLock across those two calls, but I'd
like to avoid putting any extra overhead into such a critical path.

Hmm, actually ProcArrayApplyRecoveryInfo() could check every xid in the
running-xacts record against clog. If it's marked as finished in clog
already (because we already saw the commit/abort record before the
running-xacts record), we know it's not running after all.
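
A minimal sketch of that clog check, under loud assumptions: the clog is
modeled as a small in-memory array, and toy_clog_set(), toy_xid_finished()
and prune_finished_xids() are hypothetical helpers, not existing
PostgreSQL functions. The real ProcArrayApplyRecoveryInfo() would consult
the actual clog instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t TransactionId;

/* Simplified stand-in for a clog status (assumption, not the real layout). */
typedef enum { XACT_IN_PROGRESS = 0, XACT_COMMITTED, XACT_ABORTED } XactStatus;

/* Toy clog indexed directly by xid; zero-initialized, i.e. all in progress. */
#define TOY_CLOG_SIZE 1024
static XactStatus toy_clog[TOY_CLOG_SIZE];

/* Record the outcome of a commit/abort record that was already replayed. */
void toy_clog_set(TransactionId xid, XactStatus status)
{
    assert(xid < TOY_CLOG_SIZE);
    toy_clog[xid] = status;
}

/* True if the toy clog already marks xid as committed or aborted. */
int toy_xid_finished(TransactionId xid)
{
    assert(xid < TOY_CLOG_SIZE);
    return toy_clog[xid] != XACT_IN_PROGRESS;
}

/*
 * Compact the xid list from a running-xacts record in place, dropping
 * every xid the clog already marks as finished. Returns the new count.
 */
size_t prune_finished_xids(TransactionId *xids, size_t nxids)
{
    size_t kept = 0;

    for (size_t i = 0; i < nxids; i++)
        if (!toy_xid_finished(xids[i]))
            xids[kept++] = xids[i];
    return kept;
}
```

With xid 101 already marked committed, pruning the list {100, 101, 102}
leaves {100, 102}, so the standby would not start out tracking the
just-committed transaction.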

Because of the order in which commit removes the entry from the procarray
and releases locks, it also seems possible for GetRunningTransactionData()
to acquire a snapshot that contains an AccessExclusiveLock for a
transaction whose XID is not listed as running in the XID list. That
sounds like trouble too.

--  Heikki Linnakangas EnterpriseDB   http://www.enterprisedb.com

