Re: LISTEN/NOTIFY bug: VACUUM sets frozenxid past a xid in async queue - Mailing list pgsql-hackers

From Arseniy Mukhin
Subject Re: LISTEN/NOTIFY bug: VACUUM sets frozenxid past a xid in async queue
Date
Msg-id CAE7r3MKaoTeNuSdzJCMpgEC9iVHnFXhswyDW3PtbYt5sKjXBOA@mail.gmail.com
Whole thread Raw
In response to Re: LISTEN/NOTIFY bug: VACUUM sets frozenxid past a xid in async queue  ("Matheus Alcantara" <matheusssilv97@gmail.com>)
Responses Re: LISTEN/NOTIFY bug: VACUUM sets frozenxid past a xid in async queue
List pgsql-hackers
On Mon, Sep 22, 2025 at 4:09 PM Matheus Alcantara
<matheusssilv97@gmail.com> wrote:
>
> On Fri Sep 19, 2025 at 12:13 PM -03, Arseniy Mukhin wrote:
> > I think it's impossible: before pushing anything to the queue we
> > acquire global lock in PreCommit_Notify():
> >
> > LockSharedObject(DatabaseRelationId, InvalidOid, 0, AccessExclusiveLock);
> >
> > While we are holding the lock, no writers can add anything to the
> > queue. Then we save head position and add pending notifications to the
> > queue. The moment we get in AtAbort_Notify(), we still hold the global
> > lock (maybe it is worth adding Assert about it if we start relying on
> > it), so we can be sure there are no notifications in the queue after
> > the saved head position except ours. So it seems safe but maybe I
> > missed something.
> >
> Thanks for the explanation! I'm just not sure if I understand why do we
> need the LWLockAcquire(NotifyQueueLock, LW_EXCLUSIVE) on
> PreCommit_Notify() if we already have the
> LockSharedObject(DatabaseRelationId, InvalidOid, 0, AccessExclusiveLock);
>

Good question. It seems that it would be enough to hold
NotifyQueueLock only during the head update, so I don't understand it
either.

> See the attached patch that is based on your the previous comment of
> resetting the QUEUE_HEAD at AtAbort_Notify()

Thank you for the new version!

The fix looks exactly how I thought about it. But while thinking about
why we need to hold NotifyQueueLock in PreCommit_Notify I realized
there is a concurrency bug in the 'head resetting' approach. I thought
that listeners hold NotifyQueueLock lock while reading notifications
from the queue, but now I see that they hold it only while reading the
current head position. So we can end up in the next situation:

There are 2 backends:  listener and writer (backend with notifications)


        listener                                   writer
  ----------------------------------------------------------------------
                                         writer wants to commit, so it
                                         adds notifications to the queue
                                         in PreCommit_Notify() and advances
                                         queue head.

  ----------------------------------------------------------------------
listener wants to read notifications. It
gets into asyncQueueReadAllNotifications()
and reads the current head (it already
contains writer's notifications):


asyncQueueReadAllNotifications()
          ...
    LWLockAcquire(NotifyQueueLock, LW_SHARED);
    head = QUEUE_HEAD;
    LWLockRelease(NotifyQueueLock);
          ...

  ----------------------------------------------------------------------
                                         writer failed to commit, so it
                                         resets queue head in AtAbort_Notify()
                                         and completes the transaction.

  ----------------------------------------------------------------------
listener gets a snapshot where the writer
is not in progress.

        ...
    snapshot = RegisterSnapshot(GetLatestSnapshot());
        ...


This way the listener reads the head that includes all writer's
notifications and a snapshot where the writer is not in progress, so
nothing stops the listener from sending these notifications and it's
even possible to have the listener's position that is after the queue
head, so yes, it's bad :( Sorry about that.

Probably we can fix it (swap GetLatestSnapshot() with the head
position reading for example) but now I think that 'committed' field
approach is a more robust way to fix the bug. What do you think?

BTW one thing about 'committed' field fix version [0]. It seems that
instead of remembering all notification positions, we can remember the
head position after we acquire global lock and before we write
anything to the queue and in case of abort we can just start from the
saved position and mark all notifications until the end of the queue
as 'committed = false'. The reason is the same as with resetting the
head approach - as we hold global lock, no writers can add
notifications to the queue, so when we in AtAbort_Notify() all
notifications after the saved position are ours.


[0] https://www.postgresql.org/message-id/CAFY6G8fui7omKfUXF%2BJJ82B34ExrwK6J7vKe61S-DhEmr_jBhA%40mail.gmail.com


Best regards,
Arseniy Mukhin



pgsql-hackers by date:

Previous
From: Alvaro Herrera
Date:
Subject: Re: Adding REPACK [concurrently]
Next
From: Nathan Bossart
Date:
Subject: Re: Fix overflow of nbatch