Thread: NOTIFY queue is at 66% and climbing...

NOTIFY queue is at 66% and climbing...

From
Jeff Ross
Date:
Hi all,

On 10.15 I'm getting the following on a logically replicated server.

From the CSV logs:

2021-10-13 18:49:39.792 EDT,,,213601,,6143c257.34261,64243,,2021-09-16 18:16:55 EDT,4/3914851,60709901,WARNING,01000,"NOTIFY queue is 66% full",,,,,,,,,""
2021-10-13 18:49:46.058 EDT,,,213601,,6143c257.34261,64244,,2021-09-16 18:16:55 EDT,4/3914855,60709905,WARNING,01000,"NOTIFY queue is 66% full",,,,,,,,,""
2021-10-13 18:49:51.934 EDT,,,213601,,6143c257.34261,64245,,2021-09-16 18:16:55 EDT,4/3914862,60709915,WARNING,01000,"NOTIFY queue is 66% full",,,,,,,,,""
2021-10-13 18:50:00.516 EDT,,,213601,,6143c257.34261,64246,,2021-09-16 18:16:55 EDT,4/3914864,60709917,WARNING,01000,"NOTIFY queue is 66% full",,,,,,,,,""
2021-10-13 18:50:08.003 EDT,,,213601,,6143c257.34261,64247,,2021-09-16 18:16:55 EDT,4/3914871,60709926,WARNING,01000,"NOTIFY queue is 66% full",,,,,,,,,""

The very fine docs for 10 say
 
"There is a queue that holds notifications that have been sent but not yet processed by all listening sessions. If this queue becomes full, transactions calling NOTIFY will fail at commit. The queue is quite large (8GB in a standard installation) and should be sufficiently sized for almost every use case. However, no cleanup can take place if a session executes LISTEN and then enters a transaction for a very long time. Once the queue is half full you will see warnings in the log file pointing you to the session that is preventing cleanup. In this case you should make sure that this session ends its current transaction so that cleanup can proceed."

We do have a client that has access to this logically replicated server and I have no idea what they have done regards LISTEN/NOTIFY, and my query to them about this has not yielded a response.

In the CSV logs above what part points to "the session that is preventing cleanup" so that I can kill it?  pg_stat_activity yields no clues.

Thanks,

Jeff

Re: NOTIFY queue is at 66% and climbing...

From
Tom Lane
Date:
Jeff Ross <jross@openvistas.net> writes:
> On 10.15 I'm getting the following on a logically replicated server.

> 2021-10-13 18:49:39.792 EDT,,,213601,,6143c257.34261,64243,,2021-09-16 
> 18:16:55 EDT,4/3914851,60709901,WARNING,01000,"NOTIFY queue is 66% 
> full",,,,,,,,,""

> In the CSV logs above what part points to "the session that is 
> preventing cleanup" so that I can kill it?

Normally there's a DETAIL entry citing the session's PID.  Looking
at the code, the reason for the lack of any such entry must be that
there is no session whose current notify queue position exactly
matches the supposed global minimum position.  This corresponds to
a known bug that was fixed in 10.16, so I'd suggest upgrading.
As a temporary workaround you could restart that server, but
likely the problem would recur after awhile.

            regards, tom lane



Re: NOTIFY queue is at 66% and climbing...

From
Jeff Ross
Date:

On 10/13/21 5:50 PM, Tom Lane wrote:
> Jeff Ross <jross@openvistas.net> writes:
>> On 10.15 I'm getting the following on a logically replicated server.
>> 2021-10-13 18:49:39.792 EDT,,,213601,,6143c257.34261,64243,,2021-09-16
>> 18:16:55 EDT,4/3914851,60709901,WARNING,01000,"NOTIFY queue is 66%
>> full",,,,,,,,,""
>> In the CSV logs above what part points to "the session that is
>> preventing cleanup" so that I can kill it?
> Normally there's a DETAIL entry citing the session's PID.  Looking
> at the code, the reason for the lack of any such entry must be that
> there is no session whose current notify queue position exactly
> matches the supposed global minimum position.  This corresponds to
> a known bug that was fixed in 10.16, so I'd suggest upgrading.
> As a temporary workaround you could restart that server, but
> likely the problem would recur after awhile.
>
>             regards, tom lane
>
Thanks as always, Tom.  I'll schedule the upgrade to 10.18 and restart 
for this weekend.

Jeff