Re: Optimize LISTEN/NOTIFY - Mailing list pgsql-hackers
From | Joel Jacobson |
---|---|
Subject | Re: Optimize LISTEN/NOTIFY |
Date | |
Msg-id | 82ad6bc2-2520-4ff9-92b9-1f5e2f4ae78e@app.fastmail.com Whole thread Raw |
In response to | Re: Optimize LISTEN/NOTIFY (Arseniy Mukhin <arseniy.mukhin.dev@gmail.com>) |
List | pgsql-hackers |
On Wed, Oct 15, 2025, at 13:19, Arseniy Mukhin wrote: > I tried the patch and it seems listeners sometimes don't receive > notifications. To reproduce it you can try to listen to the channel in > one psql session and send notifications from another psql session. But > all tests are fine, so I tried to write a TAP test to reproduce it. It > passes on master and fails with the patch, so looks like it's real. > Please find the repro in attachments. I added the TAP test to amcheck > module just for simplicity. Indeed a good catch! Thanks for the TAP test. I've migrated it to async-notify.spec, included in 0001-optimize_listen_notify-v18.patch: * Check that notifications sent from a backend that has not done LISTEN are properly delivered to a listener in another backend To fix this, we now do an initChannelHash call at the beginning of SignalBackends, since the problem was that if no LISTEN had been done in the session which did a NOTIFY, the channel would not have been initiated. Added a point about this in the 0002-optimize_listen_notify-v18.patch header: * SignalBackends attaches to the channel hash at the start, ensuring that backends performing NOTIFY without having done LISTEN can still find listeners in the shared hash table. On Wed, Oct 15, 2025, at 16:16, Tom Lane wrote: > I think we can perhaps salvage the idea if we invent a separate > "advisory" queue position field, which tells its backend "hey, > you could skip as far as here if you want", but is not used for > purposes of SLRU truncation. Alternatively, split the queue pos > into "this is where to read next" and "this is as much as I'm > definitively done with", where the second field gets advanced at > the end of asyncQueueReadAllNotifications. Not sure which > view would be less confusing (in the end I guess they're nearly > the same thing, differently explained). > > A different line of thought could be to get rid of > asyncQueueReadAllNotifications's optimization of moving the > queue pos only once, per > > * (We could alternatively retake NotifyQueueLock and move the position > * before handling each individual message, but that seems like too much > * lock traffic.) > > Since we only need shared lock to advance our own queue pos, > maybe that wouldn't be too awful. Not sure. These all sounds like promising ideas. I went ahead and tried the "split the queue pos" idea, implemented in 0002-optimize_listen_notify-v18.patch: Position tracking for truncation safety ---------------------------------------- To prevent race conditions during queue truncation when using direct advancement, backend positions are now tracked using two fields: * pos: The next position to read from. This can be advanced by other backends via direct advancement to skip over uninteresting notifications. * donePos: What the backend has definitively processed and no longer needs. This is used for determining safe truncation points. Without this separation, a backend could be advanced by another backend while it's reading notifications, then write back its stale local position that points to an already-truncated page. By using donePos for truncation decisions and taking the maximum of local and shared pos when updating, we ensure that truncation waits for backends to finish reading, while still allowing position advancement for optimization. On Wed, Oct 15, 2025, at 21:53, Arseniy Mukhin wrote: > Advisory queue position field sounds good IMHO. Listeners are still > solely responsible for advancing their positions so they still need to > wake up to do it, but they will only do so if there are relevant > notifications, or if they are too far behind. In any case they will be > able to jump over all irrelevant stuff. I read your message too late, otherwise I would have tried that approach first. I will try to implement that one too, and perhaps also the third one, and then we can evaluate them to see which one we prefer. /Joel
Attachment
pgsql-hackers by date: