Re: LISTEN/NOTIFY bug: VACUUM sets frozenxid past a xid in async queue - Mailing list pgsql-hackers

From Arseniy Mukhin
Subject Re: LISTEN/NOTIFY bug: VACUUM sets frozenxid past a xid in async queue
Date
Msg-id CAE7r3M+LfTsGDYmUqFWq1DwWyqGC4syWFHT5BZjJA2+w2bEtgw@mail.gmail.com
Whole thread Raw
In response to Re: LISTEN/NOTIFY bug: VACUUM sets frozenxid past a xid in async queue  (Arseniy Mukhin <arseniy.mukhin.dev@gmail.com>)
Responses Re: LISTEN/NOTIFY bug: VACUUM sets frozenxid past a xid in async queue
List pgsql-hackers
On Wed, Oct 22, 2025 at 3:02 AM Joel Jacobson <joel@compiler.org> wrote:
>
> On Wed, Oct 22, 2025, at 02:16, Matheus Alcantara wrote:
> >> Regarding the v8 patch, it introduces a fundamentally new way of
> >> managing notification entries (adding entries with 'committed' state
> >> and marking them 'aborted' in abort paths). This affects all use
> >> cases, not just those involving very old unconsumed notifications, and
> >> could introduce more serious bugs like PANIC or SEGV. For
> >> backpatching, I prefer targeting just the problematic behavior while
> >> leaving unrelated parts unchanged. Though Álvaro might have a
> >> different perspective on this.
> >>
> > Thanks very much for this explanation and for what you've previously
> > wrote on [1]. It's clear to me now that the v8 architecture is not a
> > good way to go.
>
> How about doing some more work in vac_update_datfrozenxid()?
>
> Pseudo-code sketch:
>
> ```
> void
> vac_update_datfrozenxid(void)
> {
>
>     /* After computing newFrozenXid from all known sources... */
>
>     TransactionId oldestNotifyXid = GetOldestQueuedNotifyXid();
>
>     if (TransactionIdIsValid(oldestNotifyXid) &&
>         TransactionIdPrecedes(oldestNotifyXid, newFrozenXid))
>     {
>         /*
>          * The async queue has XIDs older than our proposed freeze point.
>          * Attempt cleanup, then back off and let the next VACUUM benefit.
>          */
>
>         if (asyncQueueHasListeners())
>         {
>             /*
>              * Wake all listening backends across *all* databases
>              * that are not already at QUEUE_HEAD.
>              * They'll hopefully process notifications and advance
>              * their pointers, allowing the next VACUUM to freeze further.
>              */
>             asyncQueueWakeAllListeners();
>         }
>         else
>         {
>             /*
>              * No listeners exist - discard all unread notifications.
>              * The next VACUUM should succeed in advancing datfrozenxid.
>              * asyncQueueAdvanceTailNoListeners() would take exclusive lock
>              * on NotifyQueueLock before checking
>              * QUEUE_FIRST_LISTENER == INVALID_PROC_NUMBER
>              */
>             asyncQueueAdvanceTailNoListeners();
>         }
>
>         /*
>          * Back off datfrozenxid to protect the old XIDs.
>          * The cleanup we just performed should allow the next VACUUM
>          * to freeze further.
>          */
>         newFrozenXid = oldestNotifyXid;
>     }
> }
> ```
>
> Maybe it wouldn't solve all problematic situations, but to me it seems
> like these measures could help many of them, or am I missing some
> crucial insight here?

I agree we need to add something like this. Looks like with v10 it's
possible for the listen/notify queue to block datfrozenxid advancing
even without extreme circumstances (without hanging listeners etc).

I see two thing we should take care of in v10:

1) Currently asyncQueueAdvanceTail is called regularly only if we have
a constant flow of notifications. We try to advance the tail every
time when the head reaches every 4th page. So if we don't have new
notifications, the tail will stay where it is forever. It means that
if we have at least 1 notification in the queue without constant flow
of notifications then GetOldestQueuedNotifyXid will constantly return
the same result which will block the advancement of datfrozenxid. So
it looks like we need something that will advance the tail in this
case.

2) Currently we wake up all listeners regularly only if we have a
constant flow of notifications (again). Even if we have a listener and
we never wake up such a listener because of the new notifications (for
example there are no new notifications in the listener's database), we
still signal such a listener sometimes as its position lags too much
from the head. But again, if we don't have new notifications, it's
possible that such a listener will never process some notification
(from another database) and tail advancement will be blocked. As a
result, datfrozenxid advancement also will be blocked. So probably we
need something that will wake up lagging listeners if they block tail
advancement.


Best regards,
Arseniy Mukhin



pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: Making pg_rewind faster
Next
From: shveta malik
Date:
Subject: Re: Improve pg_sync_replication_slots() to wait for primary to advance