Re: LISTEN/NOTIFY bug: VACUUM sets frozenxid past a xid in async queue - Mailing list pgsql-hackers
From | Masahiko Sawada |
---|---|
Subject | Re: LISTEN/NOTIFY bug: VACUUM sets frozenxid past a xid in async queue |
Date | |
Msg-id | CAD21AoCFZxXCBy+5DoarfG9LC9VdNwWRDpDHE5sdTh5Ym0EcqQ@mail.gmail.com Whole thread Raw |
In response to | Re: LISTEN/NOTIFY bug: VACUUM sets frozenxid past a xid in async queue ("Matheus Alcantara" <matheusssilv97@gmail.com>) |
Responses |
Re: LISTEN/NOTIFY bug: VACUUM sets frozenxid past a xid in async queue
|
List | pgsql-hackers |
On Mon, Oct 20, 2025 at 5:37 AM Matheus Alcantara <matheusssilv97@gmail.com> wrote: > > On Sat Oct 18, 2025 at 2:43 AM -03, Joel Jacobson wrote: > > On Fri, Oct 17, 2025, at 22:50, Arseniy Mukhin wrote: > > What a funny coincidence that the approach in this patch, > > has one similarity with the "Direct advancement" approach > > in the patch in the "Optimize LISTEN/NOTIFY" [1] thread, > > namely that we're both interested in QUEUE_HEAD before/after > > we push the notifications into the queue, in PreCommit_Notify(). > > > > It looks to me like our new data structures are Interchangeable, > > so I guess we probably want both patches to eventually settle > > on one and the same? > > > > The differences I note between our queue head before/after code are: > > > > - In this patch, you are palloc'ing a struct with two fields. > > In [1], we're using two separate static QueuePosition variables. > > > > - In this patch, you are taking/releasing a shared lock before/after > > the loop to read QUEUE_HEAD and set previousHead/head. > > In [1], we avoid the need of the shared lock, by doing the reads > > within the existing exclusive lock inside the loop, but instead > > therefore need a firstIteration bool, to know which is the first > > iteration, and need to overwrite the after-var in each iteration. > > > > I don't think the noted differences above matter, both seems fine. > > > Yeah, I also think that both approach seems fine. I keep the v8 version > with the palloc, if someone has any concern about this I'm open to > switch to another approach. > > > Another thing I noticed in your patch that made me wonder, > > is the naming of the new AsyncQueueEntry bool field, > > which is given the name "committed". > > > > I think this name is not entirely faithful, since when set to true, > > the entry has not been committed yet. > > > > How about negating the meaning of this boolean field? > > To instead indicate when the entry has been rollbacked. > > Then, it would clearly communicate just that. > > > > Maybe naming it something like "rollbacked" or "aborted"? > > > Good point. I've renamed this field on the attached v8 version. I've reviewed the v8 patch and I'm not sure it's a bullet-proof approach. The basic idea of the v8 patch is to add async entries with rollbacked=false and we set rollbacked=true in asyncQueueRollbackNotifications called from AtAbort_Notify() if an error happens between PreCommit_Notify() and AtCommit_Notify(). The asyncQueueRollbackNotifications() reads SLRU pages to mark entries 'rollbacked' but reading SLRU pages could fail for some reason. In this case, we would end up with a PANIC for recursive errors, or even if we skip marking entries we would end up leaving entries as committed. Also, if there are a lot of notifications across multiple SLRU pages we need to mark as 'rollbacked', asyncQueueRollbackNotifications() could take time to complete but it's not interruptible. I don't think it's a good idea to introduce such an operation in abort paths. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
pgsql-hackers by date: