Ben Hoskings <ben@hoskings.net> writes:
> We have a postgres 11.9 instance serving 7-8k queries/sec. In general
> it's humming along with no issues (p50 patency of ~1ms; p95 ~30ms).
> Lately though, we've seen occasional locking that causes all commits
> to the main database to be briefly blocked and the APIs that it backs
> to stall.
> ...
> * We make heavy use of listen/notify via que with ~80 notify events
> per second across 16 listen channels. (https://github.com/que-rb/que)
Possibly you'd benefit from updating to v13, which has the listen/notify
performance improvements Martijn referred to in the other thread.
It's also possible that the hangup is unrelated to that, being somewhere
later in commit processing than where the notify traffic gets dumped out.
If you could identify which process has the "database 0" lock during one
of these delays, and capture a stack trace from it, that would be
super-informative. Not sure about a good way to do that though.
Maybe you could automate tailing the log for "DETAIL: Process holding the
lock: nnn." and gdb'ing that process.
regards, tom lane