Looking at the postgresql code, the LW lock SerializableFinishedList appears to be acquired and then released, usually in the duration of a call to a procedure. Base on that (admittedly maybe faulty) view of the code, I am surprised to see the lock held open for over and hour and a half.
Hi Alec,
Clarification Lock_timeout and Statement_timeout work the same way, they just differ in what triggers them. Lock timeout is triggered when a lock can not be acquired in the specified time throwing an error. Statement_Timeout is triggered when any specific command/query can not be executed in the specific time.
the error message
could not truncate directory "pg_serial": apparent wraparound
This is concerning: do you have any replicas?? There is a known issue when hot_standby_feedback = on and replica has long running transaction running, SLRU will overflow causing PostgreSQL to go to a crawl. Only after the replica releases whatever locks it has does performance return to normal. This has been mitigated in PostgreSQL 17 with these configurable settings https://www.postgresql.org/docs/devel/runtime-config-resource.html#GUC-MULTIXACT-MEMBER-BUFFERS
I have no idea if SLRU directly affects serialized transactions, I would suspect that it does as SLRU affects all transactions and if it overflows bad things start happening. The next time this happen look at the replica to see if there are any long running transactions
All the conditions I can think of that can cause this is a transaction sitting out there holding onto an XID causing the serialized transactions to wait