Hi,
I ran into a problem with one of our schema monitoring queries today, which I ended up concluding to likely be caused by a combination of querying pg_locks on a recurring basis (once every 10 minutes), and a customer workload that used excessive predicate locks (> 100MB of total pg_locks data).
Our motivation for querying pg_locks in the first place is *not* to see all locks, but rather to check whether any relations are exclusively locked, so we can avoid calling pg_get_indexdef or pg_get_expr on catalog entries that relate to the exclusively locked table, and having our schema monitoring query time out because of a single table that's exclusively locked.
Whilst digging through the lock manager code for anything faster than pg_locks, I found GetRunningTransactionLocks, currently used for the XLOG_STANDBY_LOCK WAL record, which is exactly the information I was looking for.
The main performance benefit being that it skips over the predicate locks information (as well as fast-path locks), and returns no other data except for exclusive locks.
The attached patched repurposes that function (and renames it for clarity), and exposes it as a view called "pg_exclusive_locks":
postgres=# SELECT * FROM pg_exclusive_locks;
-[ RECORD 1 ]-+------
database | 5
relation | 16384
transactionid | 755
pid | 33030
This can then be used in schema information gathering queries like this:
WITH locked_relids AS (SELECT relation FROM pg_exclusive_locks)
SELECT pg_catalog.pg_get_indexdef(i.indexrelid, 0, FALSE)
FROM pg_catalog.pg_class c
JOIN pg_catalog.pg_index i ON (c.oid = i.indrelid)
AND c.oid NOT IN (SELECT relation FROM locked_relids)
Worth noting that particular use of it has a race condition where pg_get_indexdef could still run into a lock taken after the lock manager was queried, but in practice this is infrequent enough that it still helps (based on our experience doing this with pg_locks for a while).
In a different use case this could also be used to understand better what happened (i.e. which locks were held) when the XLOG_STANDBY_LOCK record was emitted and queries on standbys ran into unexpected locking issues.
Thoughts?
Thanks,
Lukas
--