Re: pgsql: Fix race condition in snapshot caching when 2PC is used. - Mailing list pgsql-committers

From Andres Freund
Subject Re: pgsql: Fix race condition in snapshot caching when 2PC is used.
Date
Msg-id 20200819003713.cm2dg7o7knqnxdny@alap3.anarazel.de
Whole thread Raw
In response to Re: pgsql: Fix race condition in snapshot caching when 2PC is used.  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: pgsql: Fix race condition in snapshot caching when 2PC is used.  (David Rowley <dgrowleyml@gmail.com>)
List pgsql-committers
Hi,

On 2020-08-18 19:55:50 -0400, Tom Lane wrote:
> > I'm inclined to just make ClearTransaction take an exclusive lock - the
> > rest of the 2PC operations are so heavyweight that I can't imagine
> > making a difference.  When I tested the locking changes in
> > ProcArrayAdd()/Remove() the more heavyweight locking wasn't at all
> > visible.
> 
> I was wondering if it'd be sensible to convert that counter into an
> atomic variable.  That's not real clear, but worth thinking about.

I had it that way originally, but because xactCompletionCount otherwise
needs to be modified with ProcArrayLock exclusively held, I moved it to
a plain variable. It'd be a bad deal to actually modify it atomically
unless we got something out of that ([1]), because that'd increase the
number of atomic operations for commits, something we surely don't want.

We could make the locking rules be that modifications with a shared
ProcArrayLock have to be done with an atomic op, and operations with
ProcArrayLock held exclusively may do it without an atomic op. But that
seems a bit weird.

A slightly bigger hammer that we could use would be to remove the
somewhat weird split of 2PC commit between ProcArrayClearTransaction()
and ProcArrayAdd(). As far as I can tell we could have
EndPrepare()/MarkAsPrepared() do both operations with one exclusive
ProcArrayLock acquisition.

On balance I'd change it to an exclusive lock with a comment mentioning
that it'd not be too hard to downgrade to a shared lock, should it ever
become a performance issue.

Greetings,

Andres Freund


[1] e.g. not needing to hold ProcArrayLock to check whether a cached
snapshot can be used. The complexity there is that I think it adds a
"race condition" where the global xmin horizon can go backward. It'd be
detected and "reverted", but that still could make the horizon appear to
go backwar for a concurrent horizon determination.




pgsql-committers by date:

Previous
From: Tom Lane
Date:
Subject: Re: pgsql: Fix race condition in snapshot caching when 2PC is used.
Next
From: David Rowley
Date:
Subject: Re: pgsql: Fix race condition in snapshot caching when 2PC is used.