Thread: Help understanding SIReadLock growing without bound on completedtransaction
mode: SIReadLock
- It's my understanding that these locks should be cleared when there are no conflicting transactions. These locks had existed for > 1 week and we have no transactions that last more than a few seconds (the oldest transaction in pg_stat_activity is always < 1minute old).
- Why would a transaction that is finished continue accumulating locks over time?
- There were a total of six pids in pg_locks that didn't exist in pg_stat_activity. They held a variety of SIReadLocks, but they weren't increasing in number over time. I'm not sure how long they were present; I only know that the problematic pid existed for a week due to its continual growth reflecting in our internal lock monitoring system.
- I tried finding overlapping SIReadLocks (see query below), but none were returned (I realize that the SSI conflict resolution algo is much more involved than this simple query)
- PG version: 9.6.17
SELECT
waiting.locktype AS w_locktype,
LEFT(waiting.relation::regclass::text,25) AS waiting_table,
COALESCE(waiting_stm.query,'?') AS w_query,
waiting.page AS w_page,
waiting.tuple AS w_tuple,
waiting.pid AS w_pid,
other.locktype AS o_locktype,
LEFT(other.relation::regclass::text,15) AS other_table,
LEFT(COALESCE(other_stm.query, '?'), 50) AS other_query,
other.page AS o_page,
other.tuple AS o_tuple,
other.pid AS other_pid,
other.GRANTED AS o_granted
FROM
pg_catalog.pg_locks AS waiting
LEFT JOIN
pg_catalog.pg_stat_activity AS waiting_stm
ON waiting_stm.pid = waiting.pid
JOIN
pg_catalog.pg_locks AS other
ON (
(
waiting."database" = other."database"
AND waiting.relation = other.relation
and waiting.locktype = other.locktype
AND ( CASE WHEN other.locktype = 'page' THEN waiting.page IS NOT DISTINCT FROM other.page
WHEN other.locktype = 'tuple' THEN waiting.page IS NOT DISTINCT FROM other.page and waiting.tuple IS NOT DISTINCT FROM other.tuple
ELSE true END
)
)
OR waiting.transactionid = other.transactionid
) AND waiting.pid <> other.pid
LEFT JOIN
pg_catalog.pg_stat_activity AS other_stm
ON other_stm.pid = other.pid
WHERE waiting.pid IN (2263461, 2263276, 2263283, 2263284, 2263459, 2263527 )
Re: Help understanding SIReadLock growing without bound on completed transaction
On Fri, May 22, 2020 at 7:48 AM Mike Klaas <mike@superhuman.com> wrote: > locktype: page > relation::regclass::text: <table>_pkey > virtualtransaction: 36/296299968 > granted:t > pid:2263461 That's an unusually high looking pid. Is that expected, for example did you crank Linux's pid_max right up, or is this AIX, or something?
Re: Help understanding SIReadLock growing without bound oncompleted transaction
On Fri, May 22, 2020 at 7:48 AM Mike Klaas <mike@
superhuman. com> wrote: pid:2263461
That's an unusually high looking pid. Is that expected, for example did you crank Linux's pid_max right up, or is this AIX, or something?
PostgreSQL 9.6.16 on x86_64-pc-linux-gnu, compiled by clang version 7.0.0-3~ubuntu0.18.04.1 (tags/RELEASE_700/final), 64-bit
Re: Help understanding SIReadLock growing without bound on completed transaction
On Fri, May 22, 2020 at 7:48 AM Mike Klaas <mike@superhuman.com> wrote: > It's my understanding that these locks should be cleared when there are no conflicting transactions. These locks had existedfor > 1 week and we have no transactions that last more than a few seconds (the oldest transaction in pg_stat_activityis always < 1minute old). > Why would a transaction that is finished continue accumulating locks over time? Predicate locks are released by ClearOldPredicateLocks(), which releases SERIALIZABLEXACTs once they are no longer interesting. It has a conservative idea of what is no longer interesting: it waits until the lowest xmin across active serializable snapshots is >= the transaction's finishedBefore xid, which was the system's next xid (an xid that hasn't been used yet*) at the time the SERIALIZABLEXACT committed. One implication of this scheme is that SERIALIZABLEXACTs are cleaned up in commit order. If you somehow got into a state where a few of them were being kept around for a long time, but others committed later were being cleaned up (which I suppose must be the case or your system would be complaining about running out of SERIALIZABLEXACTs), that might imply that there is a rare leak somewhere in this scheme. In the past I have wondered if there might be a problem with wraparound in the xid tracking for finished transactions, but I haven't worked out the details (transaction ID wraparound is both figuratively and literally the Ground Hog Day of PostgreSQL bug surfaces). *Interestingly, it takes an unlocked view of that value, but that doesn't seem relevant here; it could see a value that's too low, not too high.
Re: Help understanding SIReadLock growing without bound oncompleted transaction
Predicate locks are released by ClearOldPredicateLocks(), which releases SERIALIZABLEXACTs once they are no longer interesting. It has a conservative idea of what is no longer interesting: it waits until the lowest xmin across active serializable snapshots is >= the transaction's finishedBefore xid, which was the system's next xid (an xid that hasn't been used yet*) at the time the SERIALIZABLEXACT committed. One implication of this scheme is that SERIALIZABLEXACTs are cleaned up in commit order. If you somehow got into a state where a few of them were being kept around for a long time, but others committed later were being cleaned up (which I suppose must be the case or your system would be complaining about running out of SERIALIZABLEXACTs), that might imply that there is a rare leak somewhere in this scheme. In the past I have wondered if there might be a problem with wraparound in the xid tracking for finished transactions, but I haven't worked out the details (transaction ID wraparound is both figuratively and literally the Ground Hog Day of PostgreSQL bug surfaces).
datname | datfrozenxid | age
---------------+--------------+-----------
cloudsqladmin | 4173950091 | 169089900
template0 | 4266855294 | 76184697
postgres | 4173951306 | 169088685
template1 | 4266855860 | 76184131
superhuman | 4145766807 | 197273184
Re: Help understanding SIReadLock growing without bound oncompleted transaction
select max(backend_xid::text), min(backend_xmin::text) from pg_stat_activity where state='active';
max | min
----------+----------
50350294 | 50350065
On Fri, May 22, 2020 at 3:15 PM, Thomas Munro <thomas.munro@ gmail. com> wrote: Predicate locks are released by ClearOldPredicateLocks(), which releases SERIALIZABLEXACTs once they are no longer interesting. It has a conservative idea of what is no longer interesting: it waits until the lowest xmin across active serializable snapshots is >= the transaction's finishedBefore xid, which was the system's next xid (an xid that hasn't been used yet*) at the time the SERIALIZABLEXACT committed. One implication of this scheme is that SERIALIZABLEXACTs are cleaned up in commit order. If you somehow got into a state where a few of them were being kept around for a long time, but others committed later were being cleaned up (which I suppose must be the case or your system would be complaining about running out of SERIALIZABLEXACTs), that might imply that there is a rare leak somewhere in this scheme. In the past I have wondered if there might be a problem with wraparound in the xid tracking for finished transactions, but I haven't worked out the details (transaction ID wraparound is both figuratively and literally the Ground Hog Day of PostgreSQL bug surfaces).
Thanks for the detailed reply, Thomas. Is SERIALIZABLEXACT transaction ID wraparound the same as global xid wraparound? The max transaction age in the db is ~197M [1] so I don't think we've gotten close to global wraparound lately.Would it be helpful to cross-post this thread to pgsql-bugs or further investigate on my end-Mike[1] superhuman@production=> select datname, datfrozenxid, age(datfrozenxid) from pg_catalog.pg_database;datname | datfrozenxid | age
---------------+--------------+-----------
cloudsqladmin | 4173950091 | 169089900
template0 | 4266855294 | 76184697
postgres | 4173951306 | 169088685
template1 | 4266855860 | 76184131
superhuman | 4145766807 | 197273184