Re: adding wait_start column to pg_locks - Mailing list pgsql-hackers

From Ian Lawrence Barwick
Subject Re: adding wait_start column to pg_locks
Date
Msg-id CAB8KJ=hN=sz2+u3cQ+5jhZT9+TP-N7OCBLTp68L4fkVoVTqWjw@mail.gmail.com
Whole thread Raw
In response to Re: adding wait_start column to pg_locks  (torikoshia <torikoshia@oss.nttdata.com>)
Responses Re: adding wait_start column to pg_locks  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers

Hi

2021年1月4日(月) 15:04 torikoshia <torikoshia@oss.nttdata.com>:
>
> On 2021-01-02 06:49, Justin Pryzby wrote:
> > On Tue, Dec 15, 2020 at 11:47:23AM +0900, torikoshia wrote:
> >> So I'm now thinking about adding a new column in pg_locks which
> >> keeps the time at which locks started waiting.
> >>
> >> Attached a patch.
> >
> > This is failing make check-world, would you send an updated patch ?
> >
> > I added you as an author so it shows up here.
> > http://cfbot.cputube.org/atsushi-torikoshi.html
>
> Thanks!
>
> Attached an updated patch.

I took a look at this patch as it seems useful (and I have an item on my bucket
list labelled "look at the locking code", which I am not at all familiar with).

I tested the patch by doing the following:

Session 1:

    postgres=# CREATE TABLE foo (id int);
    CREATE TABLE

    postgres=# BEGIN ;
    BEGIN

    postgres=*# INSERT INTO foo VALUES (1);
    INSERT 0 1

Session 2:

    postgres=# BEGIN ;
    BEGIN

    postgres=*# LOCK TABLE foo;

Session 3:

    postgres=# SELECT locktype, relation, pid, mode, granted, fastpath, wait_start
                 FROM pg_locks
                WHERE relation = 'foo'::regclass AND NOT granted\x\g\x

    -[ RECORD 1 ]-----------------------------
    locktype   | relation
    relation   | 16452
    pid        | 3513935
    mode       | AccessExclusiveLock
    granted    | f
    fastpath   | f
    wait_start | 2021-01-14 12:03:06.683053+09

So far so good, but checking *all* the locks on this relation:

    postgres=# SELECT locktype, relation, pid, mode, granted, fastpath, wait_start
                 FROM pg_locks
                WHERE relation = 'foo'::regclass\x\g\x

    -[ RECORD 1 ]-----------------------------
    locktype   | relation
    relation   | 16452
    pid        | 3513824
    mode       | RowExclusiveLock
    granted    | t
    fastpath   | f
    wait_start | 2021-01-14 12:03:06.683053+09
    -[ RECORD 2 ]-----------------------------
    locktype   | relation
    relation   | 16452
    pid        | 3513935
    mode       | AccessExclusiveLock
    granted    | f
    fastpath   | f
    wait_start | 2021-01-14 12:03:06.683053+09

shows the RowExclusiveLock granted in session 1 as apparently waiting from
exactly the same time as session 2's attempt to acquire the lock, which is clearly
not right.

Also, if a further session attempts to acquire a lock, we get:

    postgres=# SELECT locktype, relation, pid, mode, granted, fastpath, wait_start
                 FROM pg_locks
                WHERE relation = 'foo'::regclass\x\g\x

    -[ RECORD 1 ]-----------------------------
    locktype   | relation
    relation   | 16452
    pid        | 3513824
    mode       | RowExclusiveLock
    granted    | t
    fastpath   | f
    wait_start | 2021-01-14 12:05:53.747309+09
    -[ RECORD 2 ]-----------------------------
    locktype   | relation
    relation   | 16452
    pid        | 3514240
    mode       | AccessExclusiveLock
    granted    | f
    fastpath   | f
    wait_start | 2021-01-14 12:05:53.747309+09
    -[ RECORD 3 ]-----------------------------
    locktype   | relation
    relation   | 16452
    pid        | 3513935
    mode       | AccessExclusiveLock
    granted    | f
    fastpath   | f
    wait_start | 2021-01-14 12:05:53.747309+09

i.e. all entries now have "wait_start" set to the start time of the latest session's
lock acquisition attempt.

Looking at the code, this happens as the wait start time is being recorded in
the lock record itself, so always contains the value reported by the latest lock
acquisition attempt.

It looks like the logical place to store the value is in the PROCLOCK
structure; the attached patch reworks your patch to do that, and given the above
scenario produces following output:

    postgres=# SELECT locktype, relation, pid, mode, granted, fastpath, wait_start
                 FROM pg_locks
                WHERE relation = 'foo'::regclass\x\g\x

    -[ RECORD 1 ]-----------------------------
    locktype   | relation
    relation   | 16452
    pid        | 3516391
    mode       | RowExclusiveLock
    granted    | t
    fastpath   | f
    wait_start |
    -[ RECORD 2 ]-----------------------------
    locktype   | relation
    relation   | 16452
    pid        | 3516470
    mode       | AccessExclusiveLock
    granted    | f
    fastpath   | f
    wait_start | 2021-01-14 12:19:16.217163+09
    -[ RECORD 3 ]-----------------------------
    locktype   | relation
    relation   | 16452
    pid        | 3516968
    mode       | AccessExclusiveLock
    granted    | f
    fastpath   | f
    wait_start | 2021-01-14 12:18:08.195429+09

As mentioned, I'm not at all familiar with the locking code so it's quite
possible that it's incomplete,there may be non-obvious side-effects, or it's
the wrong approach altogether etc., so further review is necessary.


Regards

Ian Barwick

--
EnterpriseDB: https://www.enterprisedb.com
Attachment

pgsql-hackers by date:

Previous
From: Fujii Masao
Date:
Subject: Re: A failure of standby to follow timeline switch
Next
From: Michael Paquier
Date:
Subject: Re: Moving other hex functions to /common