Let's invent a function to report lock-wait-blocking PIDs - Mailing list pgsql-hackers

From Tom Lane
Subject Let's invent a function to report lock-wait-blocking PIDs
Date
Msg-id 7419.1363802552@sss.pgh.pa.us
Whole thread Raw
Responses Re: Let's invent a function to report lock-wait-blocking PIDs  (Robert Haas <robertmhaas@gmail.com>)
Re: Let's invent a function to report lock-wait-blocking PIDs  (Greg Stark <stark@mit.edu>)
Re: Let's invent a function to report lock-wait-blocking PIDs  (Simon Riggs <simon@2ndQuadrant.com>)
Re: Let's invent a function to report lock-wait-blocking PIDs  (Noah Misch <noah@leadboat.com>)
Re: Let's invent a function to report lock-wait-blocking PIDs  (Greg Smith <greg@2ndQuadrant.com>)
List pgsql-hackers
I was just looking into why the -DCLOBBER_CACHE_ALWAYS buildfarm
critters aren't managing to run the new "timeouts" isolation test
successfully, despite very generous timeouts.  The answer is that
2 seconds isn't quite enough time to parse+plan+execute the query
that isolationtester uses to see if the current test session is
blocked on a lock, if CLOBBER_CACHE_ALWAYS is on.  Now, that query
is totally horrible:
   appendPQExpBufferStr(&wait_query,                        "SELECT 1 FROM pg_locks holder, pg_locks waiter "
            "WHERE NOT waiter.granted AND waiter.pid = $1 "                        "AND holder.granted "
       "AND holder.pid <> $1 AND holder.pid IN (");   /* The spec syntax requires at least one session; assume that
here.*/   appendPQExpBuffer(&wait_query, "%s", backend_pids[1]);   for (i = 2; i < nconns; i++)
appendPQExpBuffer(&wait_query,", %s", backend_pids[i]);   appendPQExpBufferStr(&wait_query,                        ")
"
                        "AND holder.mode = ANY (CASE waiter.mode "                        "WHEN 'AccessShareLock' THEN
ARRAY["                       "'AccessExclusiveLock'] "                        "WHEN 'RowShareLock' THEN ARRAY["
               "'ExclusiveLock',"                        "'AccessExclusiveLock'] "                        "WHEN
'RowExclusiveLock'THEN ARRAY["                        "'ShareLock',"                        "'ShareRowExclusiveLock',"
                     "'ExclusiveLock',"                        "'AccessExclusiveLock'] "                        "WHEN
'ShareUpdateExclusiveLock'THEN ARRAY["                        "'ShareUpdateExclusiveLock',"
"'ShareLock',"                       "'ShareRowExclusiveLock',"                        "'ExclusiveLock',"
        "'AccessExclusiveLock'] "                        "WHEN 'ShareLock' THEN ARRAY["
"'RowExclusiveLock',"                       "'ShareUpdateExclusiveLock',"
"'ShareRowExclusiveLock',"                       "'ExclusiveLock',"                        "'AccessExclusiveLock'] "
                   "WHEN 'ShareRowExclusiveLock' THEN ARRAY["                        "'RowExclusiveLock',"
         "'ShareUpdateExclusiveLock',"                        "'ShareLock',"
"'ShareRowExclusiveLock',"                       "'ExclusiveLock',"                        "'AccessExclusiveLock'] "
                   "WHEN 'ExclusiveLock' THEN ARRAY["                        "'RowShareLock',"
"'RowExclusiveLock',"                       "'ShareUpdateExclusiveLock',"                        "'ShareLock',"
              "'ShareRowExclusiveLock',"                        "'ExclusiveLock',"
"'AccessExclusiveLock']"                        "WHEN 'AccessExclusiveLock' THEN ARRAY["
"'AccessShareLock',"                       "'RowShareLock',"                        "'RowExclusiveLock',"
        "'ShareUpdateExclusiveLock',"                        "'ShareLock',"
"'ShareRowExclusiveLock',"                       "'ExclusiveLock',"                        "'AccessExclusiveLock'] END)
"
                 "AND holder.locktype IS NOT DISTINCT FROM waiter.locktype "                 "AND holder.database IS
NOTDISTINCT FROM waiter.database "                 "AND holder.relation IS NOT DISTINCT FROM waiter.relation "
             "AND holder.page IS NOT DISTINCT FROM waiter.page "                        "AND holder.tuple IS NOT
DISTINCTFROM waiter.tuple "             "AND holder.virtualxid IS NOT DISTINCT FROM waiter.virtualxid "       "AND
holder.transactionidIS NOT DISTINCT FROM waiter.transactionid "                   "AND holder.classid IS NOT DISTINCT
FROMwaiter.classid "                        "AND holder.objid IS NOT DISTINCT FROM waiter.objid "               "AND
holder.objsubidIS NOT DISTINCT FROM waiter.objsubid ");
 

This is way more knowledge than we (should) want a client to embed about
which lock types block which others.  What's worse, it's still wrong.
The query will find cases where one of the test sessions *directly*
blocks another one, but not cases where the blockage is indirect.
For example, consider that A holds AccessShareLock, B is waiting for
AccessExclusiveLock on the same object, and C is queued up behind B
for another AccessShareLock.  This query will not think that C is
blocked, not even if B is part of the set of sessions of interest
(because B will show the lock as not granted); but especially so if
B is not part of the set.

I think that such situations may not arise in the specific context that
isolationtester says it's worried about, which is to disregard waits for
locks held by autovacuum.  But in general, you can't reliably tell who's
blocking whom with a query like this.

If isolationtester were the only market for this type of information,
maybe it wouldn't be worth worrying about.  But I'm pretty sure that
there are a *lot* of monitoring applications out there that are trying
to extract who-blocks-whom information from pg_locks.  I hadn't realized
before quite how painful it is to do that, even incorrectly.

I propose that we should add a backend function that simplifies this
type of query.  The API that comes to mind is (name subject to
bikeshedding)
pg_blocking_pids(pid int) returns int[]

defined to return NULL if the argument isn't the PID of any backend or
that backend isn't waiting for a lock, and otherwise an array of the
PIDs of the backends that are blocking it from getting the lock.
I would compute the array as
PIDs of backends already holding conflicting locks,plus PIDs of backends requesting conflicting locks that areahead of
thisone in the lock's wait queue,plus PIDs of backends that block the latter group of PIDs(ie, are holding locks
conflictingwith their requests,or are awaiting such locks and are ahead of them in the queue)
 

There would be some cases where this definition would be too expansive,
ie we'd release the waiter after only some of the listed sessions had
released their lock or request.  (That could happen for instance if we
concluded we had to move up the waiter's request to escape a deadlock.)
But I think that it's better to err in that direction than to
underestimate the set of relevant PIDs.

In the isolationtester use-case, we'd get the right answer by testing
whether this function's result has any overlap with the set of PIDs of
test sessions, ie
select pg_blocking_pids($1) && array[pid1, pid2, pid3, ...]

Thoughts?
        regards, tom lane



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: pg_upgrade segfaults when given an invalid PGSERVICE value
Next
From: Bruce Momjian
Date:
Subject: Re: pg_upgrade segfaults when given an invalid PGSERVICE value