Re: remove spurious CREATE INDEX CONCURRENTLY wait - Mailing list pgsql-hackers

From Andres Freund
Subject Re: remove spurious CREATE INDEX CONCURRENTLY wait
Date
Msg-id 20201124004056.pabj4vzibbx5bwmp@alap3.anarazel.de
Whole thread Raw
In response to Re: remove spurious CREATE INDEX CONCURRENTLY wait  (Alvaro Herrera <alvherre@alvh.no-ip.org>)
Responses Re: remove spurious CREATE INDEX CONCURRENTLY wait  (Alvaro Herrera <alvherre@alvh.no-ip.org>)
Re: remove spurious CREATE INDEX CONCURRENTLY wait  (Alvaro Herrera <alvherre@alvh.no-ip.org>)
List pgsql-hackers
Hi,

On 2020-11-23 12:30:05 -0300, Alvaro Herrera wrote:
> ProcSleep:  no bug here.
> The flags are checked to see if we should kill() the process (used when
> autovac blocks some other process).  The lock here appears to be
> inconsequential, since we release it before we do kill(); so strictly
> speaking, there is still a race condition where the autovac process
> could reset the flag after we read it and before we get a chance to
> signal it.  The lock level autovac uses to set the flag is not relevant
> either.

Yea. Even before the recent changes building the lock message under the
lock didn't make sense. Now that the flags are mirrored in pgproc, we
probably should just make this use READ_ONCE(autovac->statusFlags),
without any other use of ProcArrayLock.  As you say the race condition
is between the flag test, the kill, and the signal being processed,
which wasn't previously protected either.


> PROC_IN_LOGICAL_DECODING:
> Oddly enough, I think the reset of PROC_IN_LOGICAL_DECODING in
> ReplicationSlotRelease might be the most problematic one of the lot.
> That's because a proc's xmin that had been ignored all along by
> ComputeXidHorizons, will now be included in the computation.  Adding
> asserts that proc->xmin and proc->xid are InvalidXid by the time we
> reset the flag, I got hits in pg_basebackup, test_decoding and
> subscription tests.  I think it's OK for ComputeXidHorizons (since it
> just means that a vacuum that reads a later will remove less rows.)  But
> in GetSnapshotData it is just not correct to have the Xmin go backwards.

I don't think there's a problem. PROC_IN_LOGICAL_DECODING can only be
set when outside a transaction block, i.e. walsender. In which case
there shouldn't be an xid/xmin, I think? Or did you gate your assert on
PROC_IN_LOGICAL_DECODING being set?


> GetCurrentVirtualXIDs, ComputeXidHorizons, GetSnapshotData:
> 
> In these cases, what we need is that the code computes some xmin (or
> equivalent computation) based on a set of transactions that exclude
> those marked with the flags.  The behavior we want is that if some
> transaction is marked as vacuum, we ignore the Xid/Xmin *if there is
> one*. In other words, if there's no Xid/Xmin, then the flag is not
> important.  So if we can ensure that the flag is set first, and the
> Xid/xmin is installed later, that's sufficient correctness and we don't
> need to hold exclusive lock.

That'd require at least memory barriers in GetSnapshotData()'s loop,
which I'd really like to avoid. Otherwise the order in which memory gets
written in one process doesn't guarantee the order of visibility in
another process...



> In other words, my conclusion is that there definitely is a bug here and
> I am going to restore the use of exclusive lock for setting the
> statusFlags.

Cool.


> GetSnapshotData has an additional difficulty -- we do the
> UINT32_ACCESS_ONCE(ProcGlobal->xid[i]) read *before* testing the bit. 
> So it's not valid to set the bit without locking out GSD, regardless of
> any barriers we had; if we want this to be safe, we'd have to change
> this so that the flag is read first, and we read the xid only
> afterwards, with a read barrier.

Which we don't want, because that'd mean slowing down the *extremely*
common case of the majority of backends neither having an xid assigned
nor doing logical decoding / vacuum. We'd be reading two cachelines
instead of one.

Greetings,

Andres Freund



pgsql-hackers by date:

Previous
From: David Rowley
Date:
Subject: Re: Keep elog(ERROR) and ereport(ERROR) calls in the cold path
Next
From: Greg Nancarrow
Date:
Subject: Re: Parallel plans and "union all" subquery