Thread: docs: LISTEN/NOTIFY performance considerations

From: Nikolay Samokhvalov
Date:
Greetings!

LISTEN/NOTIFY has known performance issues that aren't documented but regularly surprise users in production. My customers and I have run into some of them multiple times.

They have also been discussed in the past, e.g.:
- 2008: https://www.postgresql.org/message-id/5215.1204048454@sss.pgh.pa.us
- 2013: https://www.postgresql.org/message-id/3598.1363354686@sss.pgh.pa.us

Recently, Recall.ai had production outages from hitting these exact issues:
https://www.recall.ai/blog/postgres-listen-notify-does-not-scale (currently at the no. 1 position on HN: https://news.ycombinator.com/item?id=44490510).

It's probably a good time to consider improving this area, but until that happens, I propose documenting the risks to help users avoid incidents (and backpatching the change to all supported versions).

The proposed patch to the LISTEN/NOTIFY docs covers:
1. Global lock during commit affecting all databases in the cluster
2. O(N²) duplicate checking performance
3. How to diagnose (log_lock_waits example)
4. Alternatives (logical decoding)
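
To make item 2 concrete, here is a toy Python model of why duplicate checking can become quadratic. It is only a sketch, not PostgreSQL source code: it assumes each new notification is compared against every notification already queued, and the name queue_notifications is made up for illustration.

```python
# Toy model only: NOT PostgreSQL source code.
# Assumption: each new notification is checked for duplicates by a linear
# scan over the notifications already pending.


def queue_notifications(payloads):
    """Queue payloads, skipping duplicates found by a linear scan.

    Returns a tuple (pending, comparisons), where comparisons counts how
    many payload-to-payload equality checks were performed.
    """
    pending = []
    comparisons = 0
    for payload in payloads:
        duplicate = False
        for existing in pending:
            comparisons += 1
            if existing == payload:
                duplicate = True
                break
        if not duplicate:
            pending.append(payload)
    return pending, comparisons


if __name__ == "__main__":
    # With all-distinct payloads, every scan runs to the end of the list,
    # so the total is n * (n - 1) / 2 comparisons.
    for n in (1_000, 2_000, 4_000):
        _, comparisons = queue_notifications(str(i) for i in range(n))
        print(n, comparisons)
```

In this model, doubling the number of distinct payloads roughly quadruples the comparison count, which is the growth pattern that item 2 warns about.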


Thoughts?

Nik
Attachment