Scott Frazer wrote:
> Hi, we have a Postgres 9.6 setup using replication that has recently started seeing a lot of processes stuck in
> "SubtransControlLock" as a wait_event on the read-replicas. Like this, only usually about 300-800 of them:
>
>
> 179706 | LWLockNamed | SubtransControlLock
> 186602 | LWLockNamed | SubtransControlLock
> 186606 | LWLockNamed | SubtransControlLock
> 180947 | LWLockNamed | SubtransControlLock
> 186621 | LWLockNamed | SubtransControlLock
>
> The server then begins to crawl, with some queries just never finishing until I finally shut the server down.
>
> Searching for that particular combo of wait_event_type and wait_event only seems to turn up the page
> about statistics collection, but no helpful information on troubleshooting this lock.
>
> Restarting the replica server clears the locks and allows us to start working again, but it's happened
> twice now in 12 hours and I'm worried it will happen again.
>
> Does anyone have any advice on where to start looking?
I don't think there is any connection to statistics collection.
This lock is used when subtransactions (SAVEPOINTs in SQL or EXCEPTION blocks
in PL/pgSQL) are created, read or removed.
This sounds like a PostgreSQL bug.
What is the exact PostgreSQL version you are running?
It would be cool if you could get a stack trace from the backend that is holding the lock.
Yours,
Laurenz Albe