Re: Planning counters in pg_stat_statements (using pgss_store) - Mailing list pgsql-hackers

From Julien Rouhaud
Subject Re: Planning counters in pg_stat_statements (using pgss_store)
Date
Msg-id CAOBaU_b3-YYooA5swR+=t5QzjFRnWhxNuZWhNtYb=dwQWq__FQ@mail.gmail.com
Whole thread Raw
In response to Re: Planning counters in pg_stat_statements (using pgss_store)  (Fujii Masao <masao.fujii@oss.nttdata.com>)
Responses Re: Planning counters in pg_stat_statements (using pgss_store)  (Julien Rouhaud <rjuju123@gmail.com>)
List pgsql-hackers
On Fri, Mar 27, 2020 at 2:01 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
>
> On 2020/03/27 19:00, Julien Rouhaud wrote:
> > On Thu, Mar 26, 2020 at 02:22:47PM +0100, Julien Rouhaud wrote:
> >> On Thu, Mar 26, 2020 at 08:08:35PM +0900, Fujii Masao wrote:
> >>>
> >>> Here are other comments.
> >>>
> >>> -           if (jstate)
> >>> +           if (kind == PGSS_JUMBLE)
> >>>
> >>> Why is PGSS_JUMBLE necessary? ISTM that we can still use jstate here, instead.
> >>>
> >>> If it's ok to remove PGSS_JUMBLE, we can define PGSS_NUM_KIND(=2) instead
> >>> and replace 2 in, e.g., total_time[2] with PGSS_NUM_KIND. Thought?
> >>
> >> Yes, we could be using jstate here.  I originally used that to avoid passing
> >> PGSS_EXEC (or the other one) as a way to say "ignore this information as
> >> there's the jstate which says it's yet another meaning".  If that's not an
> >> issue, I can change that as PGSS_NUM_KIND will clearly improve the explicit "2"
> >> all over the place.
> >
> > Done, passing PGSS_PLAN when jumble is intended, with a comment saying that the
> > pgss_kind is ignored in that case.
> >
> >>> +      <entry><structfield>total_time</structfield></entry>
> >>> +      <entry><type>double precision</type></entry>
> >>> +      <entry></entry>
> >>> +      <entry>
> >>> +        Total time spend planning and executing the statement, in milliseconds
> >>> +      </entry>
> >>> +     </row>
> >>>
> >>> pg_stat_statements view has this column but the function not.
> >>> We should make both have the column or not at all, for consistency?
> >>> I'm not sure if it's good thing to expose the sum of total_plan_time
> >>> and total_exec_time as total_time. If some users want that, they can
> >>> easily calculate it from total_plan_time and total_exec_time by using
> >>> their own logic.
> >>
> >> I think we originally added it as a way to avoid too much compatibility break,
> >> and also because it seems like a field most users will be interested in anyway.
> >> Now that I'm thinking about it again, I indeed think it was a mistake to have
> >> that in view part only.  Not mainly for consistency, but for users who would be
> >> interested in the total_time field while not wanting to pay the overhead of
> >> retrieving the query text if they don't need it.  So I'll change that!
> >
> > Done
> >
>
> Thanks for updating the patch! But I'm still wondering if it's really
> good thing to expose total_time. For example, when the query fails
> with an error many times and "calls" becomes very different from
> "plans", "total_plan_time" + "total_exec_time" is really what the users
> are interested in?

That's also the case when running explain without analyze, or prepared
statements that fallback to generic plans.  As a user, knowing how
long postgres actually spent processing a query is interesting as a
way to find likely low hanging fruits, even if there's no
planning/execution strict correlation.  The planning/execution detail
is also useful but that's probably not what I'd be starting from (at
least in OLTP workload).

The error scenario is unfortunate, but that's yet another topic.

> Some users may be interested in the sum of mean
> times, but it's not exposed...

Yes, we had a discussion about summing the other fields, but it seems
to me that doing a sum of computed fields doesn't really make sense.
Mean without variance is already not that useful.

> So what I'd like to say is that the information that users are interested
> in would vary on each situation and case. At least for me it seems
> enough for pgss to report only the basic information. Then users
> can calculate to get the numbers (like total_time) they're interested in,
> from those basic information.
>
> But of course, I'd like to hear more opinions about this...

+1

Unless someone chime in by tomorrow, I'll just drop the sum as it
seems less controversial and not a blocker in userland if users are
interested.

>
> +               if (api_version >= PGSS_V1_8)
> +                       values[i++] = Int64GetDatumFast(tmp.total_time[0] +
> +                                                                                       tmp.total_time[1]);
>
> BTW, Int64GetDatumFast() should be Float8GetDatumFast()?

Oh indeed, embarrassing copy/pasto.

>
> >>> +           nested_level++;
> >>> +           PG_TRY();
> >>>
> >>> In old thread [1], Tom Lane commented the usage of nested_level
> >>> in the planner hook. There seems no reply to that so far. What's
> >>> your opinion about that comment?
> >>>
> >>> [1] https://www.postgresql.org/message-id/28980.1515803777@sss.pgh.pa.us
> >>
> >> Oh thanks, I didn't noticed this part of the discussion.  I agree with Tom's
> >> concern, and I think that having a specific nesting level variable for the
> >> planner is the best workaround, so I'll implement that.
> >
> > Done.
> >
> > I also exported BufferUsageAccumDiff as mentioned previously, as it seems
> > clearner and will avoid future useless code churn, and run pgindent.
>
> Many thanks!! I'm thinking to commit this part separately.
> So I made that patch based on your patch. Attached.

Thanks! It looks good to me.



pgsql-hackers by date:

Previous
From: Julien Rouhaud
Date:
Subject: Re: Planning counters in pg_stat_statements (using pgss_store)
Next
From: David Steele
Date:
Subject: Re: Asymmetric partition-wise JOIN