Re: expose parallel leader in CSV and log_line_prefix - Mailing list pgsql-hackers

From Justin Pryzby
Subject Re: expose parallel leader in CSV and log_line_prefix
Date
Msg-id 20200717205421.GR23581@telsasoft.com
Whole thread Raw
In response to Re: expose parallel leader in CSV and log_line_prefix  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Responses Re: expose parallel leader in CSV and log_line_prefix  (Alvaro Herrera <alvherre@2ndquadrant.com>)
List pgsql-hackers
On Fri, Jul 17, 2020 at 11:35:40AM -0400, Alvaro Herrera wrote:
> > On Fri, Jul 17, 2020 at 7:01 AM Michael Paquier <michael@paquier.xyz> wrote:
> >
> > > Hmm.  Knowing if a leader is actually running parallel query or not
> > > requires a lookup at lockGroupMembers, that itself requires a LWLock.
> > > I think that it would be better to not require that.  So what if
> > > instead we logged %P only if Myproc has lockGroupLeader set and it
> > > does *not* match MyProcPid?
> 
> That's what I said first, so +1 for that approach.

Ok, but should we then consider changing pg_stat_activity for consistency ?
Probably in v13 to avoid changing it a year later.
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=b025f32e0b5d7668daec9bfa957edf3599f4baa8

I think the story is that we're exposing to the user a "leader pid" what's
internally called (and used as) the "lock group leader", which for the leader
process is set to its own PID.  But I think what we're exposing as leader_pid
will seem like an implementation artifact to users.  It's unnatural to define a
leader PID for the leader itself, and I'm guessing that at least 30% of people
who use pg_stat_activity.leader_pid will be surprised by rows with
| backend_type='client backend' AND leader_pid IS NOT NULL
And maybe additionally confused if PSA doesn't match CSV or other log.

Right now, PSA will include processes "were leader" queries like:
| SELECT pid FROM pg_stat_activity WHERE pid=leader_pid
If we change it, I think you can get the same thing for a *current* leader like:
| SELECT pid FROM pg_stat_activity a WHERE EXISTS (SELECT 1 FROM pg_stat_activity b WHERE b.leader_pid=a.pid);
But once the children die, you can't get that anymore.  Is that a problem ?

I didn't think of it until now, but it would be useful to query logs for
processes which were involved in parallel process.  (It would be more useful if
it indicated the query, and not just the process)

I agree that showing the PID as the leader PID while using a connection pooler
is "noisy".  But I think that's maybe just a consequence of connection pooling.
As an analogy, I would normally use a query like:
| SELECT session_line, message, query FROM postgres_log WHERE session_id='..' ORDER BY 1
But that already doesn't work usefully with connection pooling (and I'm not
sure how to resolve that other than by not using pooling when logs are useful)

I'm not sure what the answer.  Probably we should either make both expose
lockGroupLeader exactly (and not filtered) or make both show lockGroupLeader
only if lockGroupLeader!=getpid().

-- 
Justin



pgsql-hackers by date:

Previous
From: Nikita Glukhov
Date:
Subject: Re: SQL/JSON: functions
Next
From: Alvaro Herrera
Date:
Subject: Re: expose parallel leader in CSV and log_line_prefix