Thread: pgsql: Fix race condition in snapshot caching when 2PC is used.

pgsql: Fix race condition in snapshot caching when 2PC is used.

From
Andres Freund
Date:
Fix race condition in snapshot caching when 2PC is used.

When preparing a transaction xactCompletionCount needs to be
incremented, even though the transaction has not committed
yet. Otherwise the snapshot used within the transaction otherwise can
get reused outside of the prepared transaction. As GetSnapshotData()
does not include the current xid when building a snapshot, reuse would
not be correct.

Somewhat surprisingly the regression tests only rarely show incorrect
results without the fix. The reason for that is that often the
snapshot's xmax will be >= the backend xid, yielding a snapshot that
is correct, despite the bug.

I'm working on a reliable test for the bug, but it seems worth seeing
whether this fixes all the BF failures while I do.

Author: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/E1k7tGP-0005V0-5k@gemulon.postgresql.org

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/07f32fcd23ac81898ed47f88beb569c631a2f223

Modified Files
--------------
src/backend/storage/ipc/procarray.c | 9 +++++++++
1 file changed, 9 insertions(+)


Re: pgsql: Fix race condition in snapshot caching when 2PC is used.

From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> Fix race condition in snapshot caching when 2PC is used.

Uh, is it really okay to modify that variable with only shared
ProcArrayLock?

            regards, tom lane



Re: pgsql: Fix race condition in snapshot caching when 2PC is used.

From
Andres Freund
Date:
Hi,

On 2020-08-18 19:43:24 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > Fix race condition in snapshot caching when 2PC is used.
> 
> Uh, is it really okay to modify that variable with only shared
> ProcArrayLock?

Uh, you're right.  I assume it'll fix the buildfarm visible issue
regardless, but we surely don't want to rely on that.

I'm inclined to just make ClearTransaction take an exclusive lock - the
rest of the 2PC operations are so heavyweight that I can't imagine
making a difference.  When I tested the locking changes in
ProcArrayAdd()/Remove() the more heavyweight locking wasn't at all
visible.

Greetings,

Andres Freund



Re: pgsql: Fix race condition in snapshot caching when 2PC is used.

From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> On 2020-08-18 19:43:24 -0400, Tom Lane wrote:
>> Uh, is it really okay to modify that variable with only shared
>> ProcArrayLock?

> Uh, you're right.  I assume it'll fix the buildfarm visible issue
> regardless, but we surely don't want to rely on that.

Yeah, having a failure show on the buildfarm would take an order of
magnitude (at least) tighter timing than what we're seeing now.
But I think it could possibly be a problem on big production iron.

> I'm inclined to just make ClearTransaction take an exclusive lock - the
> rest of the 2PC operations are so heavyweight that I can't imagine
> making a difference.  When I tested the locking changes in
> ProcArrayAdd()/Remove() the more heavyweight locking wasn't at all
> visible.

I was wondering if it'd be sensible to convert that counter into an
atomic variable.  That's not real clear, but worth thinking about.

            regards, tom lane



Re: pgsql: Fix race condition in snapshot caching when 2PC is used.

From
Andres Freund
Date:
Hi,

On 2020-08-18 19:55:50 -0400, Tom Lane wrote:
> > I'm inclined to just make ClearTransaction take an exclusive lock - the
> > rest of the 2PC operations are so heavyweight that I can't imagine
> > making a difference.  When I tested the locking changes in
> > ProcArrayAdd()/Remove() the more heavyweight locking wasn't at all
> > visible.
> 
> I was wondering if it'd be sensible to convert that counter into an
> atomic variable.  That's not real clear, but worth thinking about.

I had it that way originally, but because xactCompletionCount otherwise
needs to be modified with ProcArrayLock exclusively held, I moved it to
a plain variable. It'd be a bad deal to actually modify it atomically
unless we got something out of that ([1]), because that'd increase the
number of atomic operations for commits, something we surely don't want.

We could make the locking rules be that modifications with a shared
ProcArrayLock have to be done with an atomic op, and operations with
ProcArrayLock held exclusively may do it without an atomic op. But that
seems a bit weird.

A slightly bigger hammer that we could use would be to remove the
somewhat weird split of 2PC commit between ProcArrayClearTransaction()
and ProcArrayAdd(). As far as I can tell we could have
EndPrepare()/MarkAsPrepared() do both operations with one exclusive
ProcArrayLock acquisition.

On balance I'd change it to an exclusive lock with a comment mentioning
that it'd not be too hard to downgrade to a shared lock, should it ever
become a performance issue.

Greetings,

Andres Freund


[1] e.g. not needing to hold ProcArrayLock to check whether a cached
snapshot can be used. The complexity there is that I think it adds a
"race condition" where the global xmin horizon can go backward. It'd be
detected and "reverted", but that still could make the horizon appear to
go backwar for a concurrent horizon determination.




Re: pgsql: Fix race condition in snapshot caching when 2PC is used.

From
David Rowley
Date:
On Wed, 19 Aug 2020 at 12:37, Andres Freund <andres@anarazel.de> wrote:
>
> Hi,
>
> On 2020-08-18 19:55:50 -0400, Tom Lane wrote:
> > > I'm inclined to just make ClearTransaction take an exclusive lock - the
> > > rest of the 2PC operations are so heavyweight that I can't imagine
> > > making a difference.  When I tested the locking changes in
> > > ProcArrayAdd()/Remove() the more heavyweight locking wasn't at all
> > > visible.
> >
> > I was wondering if it'd be sensible to convert that counter into an
> > atomic variable.  That's not real clear, but worth thinking about.
>
> I had it that way originally, but because xactCompletionCount otherwise
> needs to be modified with ProcArrayLock exclusively held, I moved it to
> a plain variable. It'd be a bad deal to actually modify it atomically
> unless we got something out of that ([1]), because that'd increase the
> number of atomic operations for commits, something we surely don't want.
>
> We could make the locking rules be that modifications with a shared
> ProcArrayLock have to be done with an atomic op, and operations with
> ProcArrayLock held exclusively may do it without an atomic op. But that
> seems a bit weird.

Couldn't it be done by creating two inline functions, one to call to
atomically increment and the other to just increment?  Can backup that
the correct version of the function is being called with an
Assert(LWLockHeldByMeInMode(ProcArrayLock, ...));

The weirdness of doing var++ on an atomic variable can be removed by
mentioning why it's ok in a comment in the inline function.

David



Re: pgsql: Fix race condition in snapshot caching when 2PC is used.

From
Tom Lane
Date:
David Rowley <dgrowleyml@gmail.com> writes:
>> On 2020-08-18 19:55:50 -0400, Tom Lane wrote:
>>> On Wed, 19 Aug 2020 at 12:37, Andres Freund <andres@anarazel.de> wrote:
>>>> I'm inclined to just make ClearTransaction take an exclusive lock - the
>>>> rest of the 2PC operations are so heavyweight that I can't imagine
>>>> making a difference.  When I tested the locking changes in
>>>> ProcArrayAdd()/Remove() the more heavyweight locking wasn't at all
>>>> visible.

>>> I was wondering if it'd be sensible to convert that counter into an
>>> atomic variable.  That's not real clear, but worth thinking about.

> Couldn't it be done by creating two inline functions, one to call to
> atomically increment and the other to just increment?  Can backup that
> the correct version of the function is being called with an
> Assert(LWLockHeldByMeInMode(ProcArrayLock, ...));

On reflection I agree with Andres' thought that just taking the lock
exclusively in ProcArrayClearTransaction is the right solution.
It's silly to imagine that a 2PC commit (plus all the other stuff that
needs to happen around that) is fast enough that that'll be a serious
performance hit.  Keeping things simple for the other code paths is
a more useful goal.

            regards, tom lane