Re: LISTEN/NOTIFY bug: VACUUM sets frozenxid past a xid in async queue - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: LISTEN/NOTIFY bug: VACUUM sets frozenxid past a xid in async queue
Date
Msg-id 6ae4acd3-6c57-4271-bc24-634fd1b11d67@iki.fi
Whole thread Raw
In response to Re: LISTEN/NOTIFY bug: VACUUM sets frozenxid past a xid in async queue  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: LISTEN/NOTIFY bug: VACUUM sets frozenxid past a xid in async queue
List pgsql-hackers
On 29/10/2025 18:48, Tom Lane wrote:
> However ... that won't actually work, the reason being that
> asyncQueueProcessPageEntries() doesn't work directly from an SLRU page
> but from a local copy.  Even if it were to modify the state of that
> copy, no other backend would see the effects.

Ah, I missed that local copy.

> The reason it's like that is stated in the comments:
> 
>   * The current page must have been fetched into page_buffer from shared
>   * memory.  (We could access the page right in shared memory, but that
>   * would imply holding the SLRU bank lock throughout this routine.)
> 
> The patch proposed here likewise appears to involve holding an SLRU
> bank lock throughout what could be a significant number of
> TransactionIdDidCommit tests.  That seems like it could result in a
> pretty bad "burp" in NOTIFY throughput.  That problem is ameliorated
> by only doing it when VACUUM is trying to advance datfrozenxid, but
> still I wonder if we can't find a less concurrency-unfriendly answer.

Hmm. I thought the original comment was more worried about sending the 
message to the client being slow. But I guess the 
TransactionIdDidCommit() calls are not free either.

> The local-copy behavior also means that this patch isn't quite a 100%
> fix.  We could have a race condition like so:
> 
> 1. Backend A grabs a copy of some SLRU page and begins running
> asyncQueueProcessPageEntries(), but then loses the CPU.
> 
> 2. Backend B completes a VACUUM, finds it can advance datfrozenxid,
> marks some relevant XIDs as frozen in the notify SLRU, and truncates
> clog.
> 
> 3. Backend A gets control back, tries to discover the state of
> some XID that's still present in its local copy of the XID page,
> and fails.
> 
> Step 2 will take long enough that this isn't very plausible
> timing-wise, but it's still theoretically a hole.

It would be better than what we have now, but sure would be nice to fix 
the problem fully..

A straightforward fix would be to introduce another lock, 
NotifyQueueVacuumLock, that's held in shared mode across 
asyncQueueReadAllNotifications(), and acquired in exclusive mode by 
freezing. That way freezing would wait out the concurrent readers that 
might have local copies. Or some other similar mechanism for waiting 
them out with e.g. condition variables.

> All of this is a problem mainly because of the presumption that
> holding an SLRU bank lock for a long time is bad.  I wonder how
> dangerous that really is.

It was worse in older versions before the SLRU banks were introduced.

Holding the lock while we send the notification to the client is not 
acceptable. But perhaps we could split asyncQueueProcessPageEntries() 
into two parts:

1. Do the TransactionIdDidCommit() checks on each entry and copy the 
visible entries to local memory.
2. Release lock and send the notifications to the backend.

Joel, since you've been working on some optimizations in this area too, 
would you happen to have some suitable performance test scripts for this?

- Heikki




pgsql-hackers by date:

Previous
From: Daniil Davydov
Date:
Subject: Re: Problem while updating a foreign table pointing to a partitioned table on foreign server
Next
From: Joe Conway
Date:
Subject: Re: contrib/sepgsql regression tests have been broken for months