Thread: Help understanding SIReadLock growing without bound on completedtransaction

We recently experienced an issue where a transaction that was finished and no longer existed kept slowly accumulating SIReadLocks over the period of a week.  The only way we could remove the locks was by restarting postgresql.

The entries in pg_locks resembled:

mode: SIReadLock              

locktype: page         
relation::regclass::text: <table>_pkey                                      
virtualtransaction: 36/296299968
granted:t      
pid:2263461
count(1):  5559 (when grouped)

Note that this pid did not exist in pg_stat_activity.  I understand that it is normal for SSI locks to persist after a transaction is finished.  There are however two aspects to this that I don't understand:
  1. It's my understanding that these locks should be cleared when there are no conflicting transactions.  These locks had existed for > 1 week and we have no transactions that last more than a few seconds (the oldest transaction in pg_stat_activity is always < 1minute old).
  2. Why would a transaction that is finished continue accumulating locks over time?  
If it helps, here is some more information about the state of the system:
  1. There were a total of six pids in pg_locks that didn't exist in pg_stat_activity.  They held a variety of SIReadLocks, but they weren't increasing in number over time.  I'm not sure how long they were present; I only know that the problematic pid existed for a week due to its continual growth reflecting in our internal lock monitoring system.
  2. I tried finding overlapping SIReadLocks (see query below), but none were returned (I realize that the SSI conflict resolution algo is much more involved than this simple query)
  3. PG version: 9.6.17
I would appreciate any hints of what I could've done to investigate this further or how I could've resolved the issue without restarting the db (and thus experiencing downtime).

thank you in advance,
-Mike

SELECT 

    waiting.locktype           AS w_locktype,

    LEFT(waiting.relation::regclass::text,25) AS waiting_table,

    COALESCE(waiting_stm.query,'?')          AS w_query,

    waiting.page AS w_page,

    waiting.tuple AS w_tuple,

    waiting.pid                AS w_pid,

    other.locktype             AS o_locktype,

    LEFT(other.relation::regclass::text,15)   AS other_table,

    LEFT(COALESCE(other_stm.query, '?'), 50)            AS other_query,

    other.page AS o_page,

    other.tuple AS o_tuple, 

    other.pid                  AS other_pid,

    other.GRANTED              AS o_granted

FROM

    pg_catalog.pg_locks AS waiting

LEFT JOIN

    pg_catalog.pg_stat_activity AS waiting_stm

    ON        waiting_stm.pid = waiting.pid

JOIN

    pg_catalog.pg_locks AS other

    ON (

        (

            waiting."database" = other."database"

        AND waiting.relation  = other.relation

        and waiting.locktype = other.locktype

        AND ( CASE WHEN other.locktype = 'page' THEN waiting.page IS NOT DISTINCT FROM other.page

           WHEN other.locktype = 'tuple' THEN waiting.page IS NOT DISTINCT FROM other.page and waiting.tuple IS NOT DISTINCT FROM other.tuple

           ELSE true END

        )

        )

        OR waiting.transactionid = other.transactionid

    ) AND waiting.pid <> other.pid

LEFT JOIN

    pg_catalog.pg_stat_activity AS other_stm

    ON        other_stm.pid = other.pid

WHERE waiting.pid IN (2263461, 2263276, 2263283, 2263284, 2263459, 2263527 )



On Fri, May 22, 2020 at 7:48 AM Mike Klaas <mike@superhuman.com> wrote:
> locktype: page
> relation::regclass::text: <table>_pkey
> virtualtransaction: 36/296299968
> granted:t
> pid:2263461

That's an unusually high looking pid.  Is that expected, for example
did you crank Linux's pid_max right up, or is this AIX, or something?




On Thu, May 21, 2020 at 5:19 PM, Thomas Munro <thomas.munro@gmail.com> wrote:

On Fri, May 22, 2020 at 7:48 AM Mike Klaas <mike@superhuman.com> wrote:

pid:2263461

That's an unusually high looking pid. Is that expected, for example did you crank Linux's pid_max right up, or is this AIX, or something?


Unfortunately I'm not sure exactly what it's running on as it's a cloud-provided database instance running on google cloud:

=> select version();

PostgreSQL 9.6.16 on x86_64-pc-linux-gnu, compiled by clang version 7.0.0-3~ubuntu0.18.04.1 (tags/RELEASE_700/final), 64-bit


-Mike

On Fri, May 22, 2020 at 7:48 AM Mike Klaas <mike@superhuman.com> wrote:
> It's my understanding that these locks should be cleared when there are no conflicting transactions.  These locks had
existedfor > 1 week and we have no transactions that last more than a few seconds (the oldest transaction in
pg_stat_activityis always < 1minute old). 
> Why would a transaction that is finished continue accumulating locks over time?

Predicate locks are released by ClearOldPredicateLocks(), which
releases SERIALIZABLEXACTs once they are no longer interesting.  It
has a  conservative idea of what is no longer interesting: it waits
until the lowest xmin across active serializable snapshots is >= the
transaction's finishedBefore xid, which was the system's next xid (an
xid that hasn't been used yet*) at the time the SERIALIZABLEXACT
committed.  One implication of this scheme is that SERIALIZABLEXACTs
are cleaned up in commit order.  If you somehow got into a state where
a few of them were being kept around for a long time, but others
committed later were being cleaned up (which I suppose must be the
case or your system would be complaining about running out of
SERIALIZABLEXACTs), that might imply that there is a rare leak
somewhere in this scheme.  In the past I have wondered if there might
be a problem with wraparound in the xid tracking for finished
transactions, but I haven't worked out the details (transaction ID
wraparound is both figuratively and literally the Ground Hog Day of
PostgreSQL bug surfaces).

*Interestingly, it takes an unlocked view of that value, but that
doesn't seem relevant here; it could see a value that's too low, not
too high.





On Fri, May 22, 2020 at 3:15 PM, Thomas Munro <thomas.munro@gmail.com> wrote:

Predicate locks are released by ClearOldPredicateLocks(), which releases SERIALIZABLEXACTs once they are no longer interesting. It has a conservative idea of what is no longer interesting: it waits until the lowest xmin across active serializable snapshots is >= the transaction's finishedBefore xid, which was the system's next xid (an xid that hasn't been used yet*) at the time the SERIALIZABLEXACT committed. One implication of this scheme is that SERIALIZABLEXACTs are cleaned up in commit order. If you somehow got into a state where a few of them were being kept around for a long time, but others committed later were being cleaned up (which I suppose must be the case or your system would be complaining about running out of SERIALIZABLEXACTs), that might imply that there is a rare leak somewhere in this scheme. In the past I have wondered if there might be a problem with wraparound in the xid tracking for finished transactions, but I haven't worked out the details (transaction ID wraparound is both figuratively and literally the Ground Hog Day of PostgreSQL bug surfaces).


Thanks for the detailed reply, Thomas.  Is SERIALIZABLEXACT transaction ID wraparound the same as global xid wraparound?  The max transaction age in the db is ~197M [1] so I don't think we've gotten close to global wraparound lately.

Would it be helpful to cross-post this thread to pgsql-bugs or further investigate on my end

-Mike

[1] superhuman@production=> select datname, datfrozenxid, age(datfrozenxid) from pg_catalog.pg_database;

    datname    | datfrozenxid |    age    

---------------+--------------+-----------

 cloudsqladmin |   4173950091 | 169089900

 template0     |   4266855294 |  76184697

 postgres      |   4173951306 | 169088685

 template1     |   4266855860 |  76184131

 superhuman    |   4145766807 | 197273184



On second look, it does seems the xid crossed the 2^32 mark recently, since most tables have a frozenxid close to 4b and the current xid is ~50m:

SELECT relname, age(relfrozenxid), relfrozenxid FROM pg_class WHERE relkind = 'r' and relname not like 'pg%' order by relname;
          relname          |    age    | relfrozenxid
---------------------------+-----------+--------------
<table name>              | 107232506 |   4237961815
<table name>              |  93692362 |   4251501959
<table name>              | 183484103 |   4161710218
<table name>              |  50760536 |   4294433785
<table name>              |  58821410 |   4286372911
<table name>              | 117427283 |   4227767038
<table name>              |  94541111 |   4250653210


select max(backend_xid::text), min(backend_xmin::text) from pg_stat_activity where state='active';

   max    |   min    

----------+----------

 50350294 | 50350065


-Mike


On Tue, May 26, 2020 at 8:42 AM, Mike Klaas <mike@superhuman.com> wrote:
On Fri, May 22, 2020 at 3:15 PM, Thomas Munro <thomas.munro@gmail.com> wrote:

Predicate locks are released by ClearOldPredicateLocks(), which releases SERIALIZABLEXACTs once they are no longer interesting. It has a conservative idea of what is no longer interesting: it waits until the lowest xmin across active serializable snapshots is >= the transaction's finishedBefore xid, which was the system's next xid (an xid that hasn't been used yet*) at the time the SERIALIZABLEXACT committed. One implication of this scheme is that SERIALIZABLEXACTs are cleaned up in commit order. If you somehow got into a state where a few of them were being kept around for a long time, but others committed later were being cleaned up (which I suppose must be the case or your system would be complaining about running out of SERIALIZABLEXACTs), that might imply that there is a rare leak somewhere in this scheme. In the past I have wondered if there might be a problem with wraparound in the xid tracking for finished transactions, but I haven't worked out the details (transaction ID wraparound is both figuratively and literally the Ground Hog Day of PostgreSQL bug surfaces).


Thanks for the detailed reply, Thomas.  Is SERIALIZABLEXACT transaction ID wraparound the same as global xid wraparound?  The max transaction age in the db is ~197M [1] so I don't think we've gotten close to global wraparound lately.

Would it be helpful to cross-post this thread to pgsql-bugs or further investigate on my end

-Mike

[1] superhuman@production=> select datname, datfrozenxid, age(datfrozenxid) from pg_catalog.pg_database;

    datname    | datfrozenxid |    age    

---------------+--------------+-----------

 cloudsqladmin |   4173950091 | 169089900

 template0     |   4266855294 |  76184697

 postgres      |   4173951306 | 169088685

 template1     |   4266855860 |  76184131

 superhuman    |   4145766807 | 197273184