Re: Optimize LISTEN/NOTIFY - Mailing list pgsql-hackers

From Joel Jacobson
Subject Re: Optimize LISTEN/NOTIFY
Date
Msg-id 7556f0d4-03fd-451a-bd34-5f62b424319a@app.fastmail.com
Whole thread Raw
In response to Re: Optimize LISTEN/NOTIFY  (Chao Li <li.evan.chao@gmail.com>)
Responses Re: Optimize LISTEN/NOTIFY
List pgsql-hackers
On Wed, Oct 29, 2025, at 08:05, Chao Li wrote:
>> On Oct 29, 2025, at 05:45, Joel Jacobson <joel@compiler.org> wrote:
>> I found a concurrency bug in v21 that could cause missed wakeup when a
>> backend would UNLISTEN on the last channel, which called
>> asyncQueueUnregister, and if wakeupPending was at that time already set,
>> then it wouldn't get reset, since in ProcessIncomingNotify we return
>> early if (listenChannels == NIL), so we would never clear wakeupPending
>> which happens in asyncQueueReadAllNotifications.
>>
>> Fixed by clearing wakeupPending in asyncQueueUnregister:
>>
>> @@ -1597,6 +1597,7 @@ asyncQueueUnregister(void)
>>    /* Mark our entry as invalid */
>>    QUEUE_BACKEND_PID(MyProcNumber) = InvalidPid;
>>    QUEUE_BACKEND_DBOID(MyProcNumber) = InvalidOid;
>> +   QUEUE_BACKEND_WAKEUP_PENDING(MyProcNumber) = false;
>>    /* and remove it from the list */
>>    if (QUEUE_FIRST_LISTENER == MyProcNumber)
>>        QUEUE_FIRST_LISTENER = QUEUE_NEXT_LISTENER(MyProcNumber);
>>
>> /Joel<0001-optimize_listen_notify-v22.patch><0002-optimize_listen_notify-v22.patch>
>
> I think the current implementation still has a race problem.
>
> Let’s say notifier N1 notifies listener’s L1 to read message.
> L1 starts to read: it acquires the look, gets reading range, then
> releases the lock, start performs reading without holding the lock.
> Notifier N2 comes, N2 doesn’t have anything L1 is interested in. N2 now
> holds the look, when it checks "if (QUEUE_POS_EQUAL(pos,
> queueHeadBeforeWrite))”, here comes the race. Because the lock is in
> N2’s hand, L1 cannot get the lock to update its pos, so "if
> (QUEUE_POS_EQUAL(pos, queueHeadBeforeWrite))” will not be satisfied, so
> direct advancement won’t happen.

I'm not sure I agree that qualifies as a race "problem" per se, since I
think that just sounds like a case where we would do an unnecessary
wakeup, right?

Without more sophisticated data structures (e.g. skip ranges) and
increased code complexity, there will always be cases where we will by
do unnecessary wakeups, which IMO need not be a design goal to
completely avoid, until we have benchmark data that indicates otherwise.

I think we should iterate by first trying to reason about correctness of
the code, trying to prove/disprove if a notifications could ever end up
not being delivered. The bug I fixed in v22 is an example of such a
case, that would cause a listening backend to never be awaken, since
notifiers would not signal it due to the pending wake that was not
cleared.

I wonder if there could be more such serious bugs in the current code. I
will focus my efforts now trying to answer that question. Would be
really nice if we could find a way to reason formally about this. I've
been looking into the P programming language, which seems suitable for
modeling and verifying these kind of asynchronous concurrency protocols,
I will give it a try.

/Joel




pgsql-hackers by date:

Previous
From: "Zhijie Hou (Fujitsu)"
Date:
Subject: RE: Logical Replication of sequences
Next
From: Arne Roland
Date:
Subject: Re: apply_scanjoin_target_to_paths and partitionwise join