Re: LISTEN/NOTIFY bug: VACUUM sets frozenxid past a xid in async queue - Mailing list pgsql-hackers

From Arseniy Mukhin
Subject Re: LISTEN/NOTIFY bug: VACUUM sets frozenxid past a xid in async queue
Date
Msg-id CAE7r3M+Q4xjpU-gK-QtVOG32Nk=0-tT71fv7O-XorUXkhC7CAQ@mail.gmail.com
Whole thread Raw
In response to Re: LISTEN/NOTIFY bug: VACUUM sets frozenxid past a xid in async queue  ("Matheus Alcantara" <matheusssilv97@gmail.com>)
List pgsql-hackers
On Wed, Sep 24, 2025 at 1:40 AM Matheus Alcantara
<matheusssilv97@gmail.com> wrote:
>
> On Tue Sep 23, 2025 at 1:11 PM -03, Arseniy Mukhin wrote:
> >> Thanks for the explanation! I'm just not sure if I understand why do we
> >> need the LWLockAcquire(NotifyQueueLock, LW_EXCLUSIVE) on
> >> PreCommit_Notify() if we already have the
> >> LockSharedObject(DatabaseRelationId, InvalidOid, 0, AccessExclusiveLock);
> >>
> >
> > Good question. It seems that it would be enough to hold
> > NotifyQueueLock only during the head update, so I don't understand it
> > either.
> >
> IIUC correctly we acquire the LockSharedObject(DatabaseRelationId,
> InvalidOid, 0, AccessExclusiveLock) to make other COMMIT's wait and the
> LWLockAcquire(NotifyQueueLock, LW_EXCLUSIVE) to make listener backends
> wait while we are adding the entries on the queue.

Yeah, but I don't get why we want to block them with NotifyQueueLock
while we are writing to the queue. There is a SLRU bank lock and they
can't read anything after the head. Maybe it's just not a big deal?

> ...
> > This way the listener reads the head that includes all writer's
> > notifications and a snapshot where the writer is not in progress, so
> > nothing stops the listener from sending these notifications and it's
> > even possible to have the listener's position that is after the queue
> > head, so yes, it's bad :( Sorry about that.
> >
> Yeah, this is bad. I'm wondering if we could reproduce such race
> conditions scenarios with some TAP tests.
>

I agree it would be great to have more tests for such cases. As for
the 'committed field' patch, I think we can add a TAP test that shows
that listeners postpone processing of notifications until
notifications were marked as 'committed=false' in case of aborted
transactions. I tried to write one, but have not succeeded yet. Hope
to finish it soon.

> > Probably we can fix it (swap GetLatestSnapshot() with the head
> > position reading for example) but now I think that 'committed' field
> > approach is a more robust way to fix the bug. What do you think?
> >
> I also agree that the committed field seems a more safe approach.
>
> > BTW one thing about 'committed' field fix version [0]. It seems that
> > instead of remembering all notification positions, we can remember the
> > head position after we acquire global lock and before we write
> > anything to the queue and in case of abort we can just start from the
> > saved position and mark all notifications until the end of the queue
> > as 'committed = false'. The reason is the same as with resetting the
> > head approach - as we hold global lock, no writers can add
> > notifications to the queue, so when we in AtAbort_Notify() all
> > notifications after the saved position are ours.
> >
> See the new attached version which implements this idea of using the
> committed field approach. I was just a bit concenerd about a race
> condition situation where the QUEUE_HEAD is changed by another publisher
> process and just iterating over all entries from the saved previous
> QUEUE_HEAD position until the current QUEUE_HEAD position we could mark
> successfully committed notifications as not committed by mistake, so in
> this new patch version I save the QUEUE_HEAD position before we add the
> entries on the shared queue and also the QUEUE_HEAD position after these
> entries are added so we ensure that we only process the entries of this
> range although we have the global lock
> LockSharedObject(DatabaseRelationId) that may prevent this situation.

Thank you! Speaking of the scenario, my understanding is that it's
impossible as we hold the global lock, so QueuePosition head should
always be equal to QUEUE_HEAD when we get into At_AbortNotify(), but
maybe I'm wrong.

Patch looks great. Some minor points:

I have a warning when using git am with the patch:
     warning: 1 line adds whitespace errors.

There is a comment in the head of the async.c file about some
listen/notify internals. Maybe it's worth adding a comment about how
aborted transactions do clean up.

What do you think about a Assert in asyncQueueRollbackNotifications()
that other backends still see us as 'in progress'? So we can be sure
that they can't process our notifications before we mark notifications
as 'committed=false'. Not sure how to do it correctly, maybe

          Assert(TransactionIdIsValid(MyProc->xid));

will work? The TAP test that I tried to write also should test it.

There are several comments where the word "crash" is used. What do you
think about using "abort" instead? "Crash" sounds more like PANIC
situation where we don't care about notifications because they don't
survive restart.

Thank you!


Best regards,
Arseniy Mukhin



pgsql-hackers by date:

Previous
From: Daniel Gustafsson
Date:
Subject: Re: "openssl" should not be optional
Next
From: Tom Lane
Date:
Subject: Re: GNU/Hurd portability patches