Re: expose parallel leader in CSV and log_line_prefix - Mailing list pgsql-hackers

From Michael Paquier
Subject Re: expose parallel leader in CSV and log_line_prefix
Date
Msg-id 20200721035145.GB17300@paquier.xyz
In response to Re: expose parallel leader in CSV and log_line_prefix  (Justin Pryzby <pryzby@telsasoft.com>)
List pgsql-hackers
On Mon, Jul 20, 2020 at 06:30:48PM -0500, Justin Pryzby wrote:
> This thread is about a new feature that I proposed which isn't yet committed
> (logging leader_pid).  But it raises a question which is immediately relevant
> to pg_stat_activity.leader_pid, which is committed for v13.  So feel free to
> move to a new thread or to the thread for commit b025f3.

For a change of this size, with everybody involved in the past
discussion already on this thread, and knowing that you already
created an open item pointing to this part of the thread, I am not
sure that I would bother spawning a new thread now :)

> I see a couple options:
>
> - Update the documentation only, saying something like "leader_pid: the lock
>   group leader.  For a process involved in parallel query, this is the parallel
>   leader.  In particular, for the leader process itself, leader_pid = pid, and
>   it is not reset until the leader terminates (it does not change when parallel
>   workers exit)."  This leaves in place the "raw" view of the data structure,
>   which can be desirable, but can be perceived as exposing unfriendly
>   implementation details.
>
> - Functional change to show leader_pid = NULL for the leader itself.  Maybe
>   the columns should only be not-NULL when st_backendType == B_BG_WORKER &&
>   bgw_type='parallel worker'.  Update documentation to say: "leader_pid: for
>   parallel workers, the PID of their leader process".  (not a raw view of the
>   "lock group leader").

Yeah, I don't mind revisiting that per the connection pooler argument.
And I'd rather keep the simple suggestion from upthread: leave the
field as NULL for the parallel group leader based on a PID match, not
a backend type check, so that this could be useful for other types of
processes.  This leads me to the attached, with the docs updated
(tested with read-only pgbench spawning parallel workers while
pg_stat_activity was queried in parallel), to be applied down to 13.
Thoughts are welcome.
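
To make the rule concrete, here is a minimal sketch in Python of the
PID-match logic described above.  It is not the attached patch: the
Backend type, its fields, and reported_leader_pid() are hypothetical
names used only for illustration, and None stands in for SQL NULL.

```python
from typing import NamedTuple, Optional


class Backend(NamedTuple):
    """Hypothetical stand-in for one backend's entry (not a PostgreSQL struct)."""
    pid: int
    lock_group_leader_pid: Optional[int]  # None when not part of a lock group


def reported_leader_pid(backend: Backend) -> Optional[int]:
    """Return the value to expose as leader_pid, or None for SQL NULL."""
    leader = backend.lock_group_leader_pid
    if leader is None:
        # Not a member of any lock group: nothing to report.
        return None
    if leader == backend.pid:
        # PID match: the group leader itself reports NULL.
        # Deliberately no backend type check here.
        return None
    return leader


# Illustrative values only.
leader = Backend(pid=100, lock_group_leader_pid=100)
worker = Backend(pid=101, lock_group_leader_pid=100)
plain = Backend(pid=102, lock_group_leader_pid=None)

for b in (leader, worker, plain):
    print(b.pid, reported_leader_pid(b))
```

Since the check compares PIDs only, any process attached to a lock
group would report its leader under this sketch, not just parallel
workers.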
--
Michael

