Re: Expose lock group leader pid in pg_stat_activity - Mailing list pgsql-hackers

From Michael Paquier
Subject Re: Expose lock group leader pid in pg_stat_activity
Date
Msg-id 20200116072727.GI3117@paquier.xyz
Whole thread Raw
In response to Re: Expose lock group leader pid in pg_stat_activity  (Julien Rouhaud <rjuju123@gmail.com>)
Responses Re: Expose lock group leader pid in pg_stat_activity  (Michael Paquier <michael@paquier.xyz>)
Re: Expose lock group leader pid in pg_stat_activity  (Julien Rouhaud <rjuju123@gmail.com>)
List pgsql-hackers
On Fri, Dec 27, 2019 at 10:15:33AM +0100, Julien Rouhaud wrote:
> I think that not using "parallel" to name this field will help to
> avoid confusion if the lock group infrastructure is eventually used
> for something else, but that's only true if indeed we explain what a
> lock group is.

As you already pointed out, src/backend/storage/lmgr/README includes a
full description of this stuff under the section "Group Locking".  So
I agree that the patch ought to document your new field in a much
better way, without mentioning the term "group locking" that's even
better to not confuse the reader because this term not present in the
docs at all.

> The leader_pid is NULL for processes not involved in parallel query.
> When a process wants to cooperate with parallel workers, it becomes a
> lock group leader, which means that this field will be valued to its
> own pid. When a parallel worker starts up, this field will be valued
> with the leader pid.

The first sentence is good to have.  Now instead of "lock group
leader", I think that we had better use "parallel group leader" as in
other parts of the docs (see wait events for example).  Then we just
need to say that if leader_pid has the same value as
pg_stat_activity.pid, then we have a group leader.  If not, then it is
a parallel worker process initially spawned by the leader whose PID is
leader_pid (when executing Gather, Gather Merge, soon-to-be parallel
vacuum or for a parallel btree build, but this does not need a mention
in the docs).  There could be an argument as well to have leader_pid
set to NULL for a leader, but that would be inconsistent with what
the PGPROC entry reports for the backend.

While looking at the code, I think that we could refactor things a bit
for raw_wait_event, wait_event_type and wait_event which has some
duplicated code for backend and auxiliary processes.  What about
filling in the wait event data after fetching the PGPROC entry, and
also fill in leader_pid for auxiliary processes.  This does not matter
now, perhaps it will never matter (or not), but that would make the
code much more consistent.
--
Michael

Attachment

pgsql-hackers by date:

Previous
From: Konstantin Knizhnik
Date:
Subject: Re: [Proposal] Global temporary tables
Next
From: Michael Paquier
Date:
Subject: Re: Expose lock group leader pid in pg_stat_activity