Re: Let's invent a function to report lock-wait-blocking PIDs - Mailing list pgsql-hackers

From Greg Smith
Subject Re: Let's invent a function to report lock-wait-blocking PIDs
Date
Msg-id 514F0D04.5040404@2ndQuadrant.com
Whole thread Raw
In response to Let's invent a function to report lock-wait-blocking PIDs  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Let's invent a function to report lock-wait-blocking PIDs  (Alvaro Herrera <alvherre@2ndquadrant.com>)
List pgsql-hackers
On 3/20/13 2:02 PM, Tom Lane wrote:
> If isolationtester were the only market for this type of information,
> maybe it wouldn't be worth worrying about.  But I'm pretty sure that
> there are a *lot* of monitoring applications out there that are trying
> to extract who-blocks-whom information from pg_locks.  I hadn't realized
> before quite how painful it is to do that, even incorrectly.

As a FYI, the one Marco wrote here is over 100 lines of code, and while 
he did a great job I'd still never suggest we release it--because it's 
misleading in just enough cases to be dangerous.  We can run it 
usefully, but I'd never hand this over to a customer and expect them to 
do something with it.

> I propose that we should add a backend function that simplifies this
> type of query.  The API that comes to mind is (name subject to
> bikeshedding)
>
>     pg_blocking_pids(pid int) returns int[]

I think there's a whole family of functions like this needed.  This is 
one of them, so if it helps the isolation tester I'd be happy to see it 
added as a first one, whether or not more come along one day.

I'd rather get the data back as a SRF because I'd usually be joining it 
to pg_locks and/or pg_stat_activity to figure out what the blocking pids 
own or are doing.  You can obviously convert the array form to/from the 
SRF form.  The exposed function API that is easier for users to join 
with is my preference.  If the isolation tester is easier to write 
against the array form, it can play the appropriate nesting game to do 
so.  I see that as the unusual case though, and it is also the one being 
coded by people who know how to handle the conversion.

The longer list of views/functions I keep wanting includes things like:

-What processes are blocking P from running?  [This new function]

-What processes hold locks and are running usefully--they have some 
locks but all are granted?  [Easy to extract from pg_locks]

-For each running process, which processes are waiting on them? 
[Requires a long WITH RECURSIVE query that doesn't get trapped by 
circular locks]

-If I try to grab lock type L on object O, what existing locks will that 
conflict with?

One really magic thing I'd like in this area is EXPLAIN (ANALYZE ON, 
LOCKS ON) which pops out a list of all the locks acquired when running 
that statement.  We're never going to get fully correct documentation of 
what locks a given statement needs.  If I can figure that out in a test 
environment by running the statement there and seeing what locks it 
grabbed along the way, that would eliminate most of the need for 
documenting things.

Note that an EXPLAIN based approach doesn't solve all the problems in 
this area, because the trickiest ones I run into are ALTER TABLE 
changes--which you can't EXPLAIN.  Some API that dumps the locks an 
arbitrary statement acquired just before it exits would be ideal.  When 
a user can ask "what locks did an ALTER TABLE adding a foreign key take 
and what order were they grabbed in?", that would solve the hardest of 
the questions I see in the field.

-- 
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Limiting setting of hint bits by read-only queries; vacuum_delay
Next
From: Andrew Dunstan
Date:
Subject: parallel pg_dump