Re: LISTEN/NOTIFY bug: VACUUM sets frozenxid past a xid in async queue - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: LISTEN/NOTIFY bug: VACUUM sets frozenxid past a xid in async queue
Date
Msg-id 769bbdba-f86f-44ad-abad-f6ea4f74e3f2@iki.fi
Whole thread Raw
In response to Re: LISTEN/NOTIFY bug: VACUUM sets frozenxid past a xid in async queue  ("Joel Jacobson" <joel@compiler.org>)
Responses Re: LISTEN/NOTIFY bug: VACUUM sets frozenxid past a xid in async queue
List pgsql-hackers
On 25/10/2025 21:01, Joel Jacobson wrote:
> On Sat, Oct 25, 2025, at 15:08, Arseniy Mukhin wrote:
>> I reimplemented the test in 0002 as an isolation test and added the
>> commit message. PFA the new version.
> 
> I've rebased the patches so they can be applied on top of v12:
> - v12-vacuum_notify_queue_cleanup-without-code-dup.txt
> - v12-vacuum_notify_queue_cleanup-with-code-dup.txt
> 
> I also want to share a new idea that Heikki Linnakangas came up with
> during PgConf25 in Riga.
> 
> The idea is basically to scan the notification queue from tail to head,
> and update xids that precedes newFrozenXid, to either
> FrozenTransactionId if committed, or InvalidTransactionId if not.
> 
> I've implemented this by adding a new extern function
> AsyncNotifyFreezeXids to async.h, called from vac_update_datfrozenxid.
> 
> This way, we don't need to hold back the advancement of newFrozenXid in
> vac_update_datfrozenxid.
> 
> Attempt of implementing Heikki's idea:
> - async_notify_freeze_xids.txt

Thanks!

> This idea has the benefit of never holding back newFrozenXid,
> however, I see some problems with it:
> 
> - If there is a bug in the code, we could accidentally overwrite
>    xid values we didn't intend to overwrite, and maybe we would never
>    find out that we did.
> 
> - We wouldn't know for sure that we've understood the cause of
>    this bug.
> 
> With v12-vacuum_notify_queue_cleanup, we instead have the downside of
> possibly holding back newFrozenXid, but with the advantage of giving us
> higher confidence in that we've correctly identified the cause of the
> bug.

Meh. Robustness is good and all, and in heap tuples we don't overwrite 
the xmin/xmax but just set a FROZEN flag, precisely so that we preserve 
evidence if something goes wrong. But I can't get too worked up about 
that for the async notification queue.

That said, we could add COMMITTED/INVALID hint bits to AsyncQueueEntry, 
similar to heap tuples, and set the hint bit instead of replacing the 
original xid. That might be good for performance too: If the first 
backend to call TransactionIdDidCommit() on an AsyncQueueEntry would set 
the hint bit, that would save the effort for other listening backends.

- Heikki



pgsql-hackers by date:

Previous
From: Álvaro Herrera
Date:
Subject: Re: Bug in pg_stat_statements
Next
From: Ranier Vilela
Date:
Subject: Re: Avoid overwiriting cache entry (src/backend/utils/cache/relcache.c)