Re: Planning counters in pg_stat_statements (using pgss_store) - Mailing list pgsql-hackers

From Julien Rouhaud
Subject Re: Planning counters in pg_stat_statements (using pgss_store)
Date
Msg-id CAOBaU_bzrbAO2amK55AQF3=ivgX6YjOBzZ4bBiHKf_faLK5-Gg@mail.gmail.com
Whole thread Raw
In response to RE: Planning counters in pg_stat_statements (using pgss_store)  ("imai.yoshikazu@fujitsu.com" <imai.yoshikazu@fujitsu.com>)
Responses RE: Planning counters in pg_stat_statements (using pgss_store)  ("imai.yoshikazu@fujitsu.com" <imai.yoshikazu@fujitsu.com>)
List pgsql-hackers
On Tue, Nov 12, 2019 at 5:41 AM imai.yoshikazu@fujitsu.com
<imai.yoshikazu@fujitsu.com> wrote:
>
> On Sat, Nov 9, 2019 at 1:36 PM, Julien Rouhaud wrote:
> >
> > I attach v3 patches implementing those counters.
>
> Thanks for updating the patch! Now I can see min/max/mean/stddev plan time.
>
>
> > Note that to avoid duplicating some code (related to Welford's method),
> > I switched some of the counters to arrays rather than scalar variables.  It unfortunately makes
> > pg_stat_statements_internal() a little bit messy, but I hope that it's still acceptable.
>
> Yeah, I also think it's acceptable, but I think the codes like below one is more
> understandable than using for loop in pg_stat_statements_internal(),
> although some codes will be duplicated.
>
> pg_stat_statements_internal():
>
> if (api_version >= PGSS_V1_8)
> {
>     kind = PGSS_PLAN;
>     values[i++] = Int64GetDatumFast(tmp.calls[kind]);
>     values[i++] = Float8GetDatumFast(tmp.total_time[kind]);
>     values[i++] = Float8GetDatumFast(tmp.min_time[kind]);
>     values[i++] = Float8GetDatumFast(tmp.max_time[kind]);
>     values[i++] = Float8GetDatumFast(tmp.mean_time[kind]);
>     values[i++] = Float8GetDatumFast(stddev(tmp));
> }
>
> kind = PGSS_EXEC;
> values[i++] = Int64GetDatumFast(tmp.calls[kind]);
> values[i++] = Float8GetDatumFast(tmp.total_time[kind]);
> if (api_version >= PGSS_V1_3)
> {
>     values[i++] = Float8GetDatumFast(tmp.min_time[kind]);
>     values[i++] = Float8GetDatumFast(tmp.max_time[kind]);
>     values[i++] = Float8GetDatumFast(tmp.mean_time[kind]);
>     values[i++] = Float8GetDatumFast(stddev(tmp));
> }
>
>
> stddev(Counters counters)
> {
>     /*
>      * Note we are calculating the population variance here, not the
>      * sample variance, as we have data for the whole population, so
>      * Bessel's correction is not used, and we don't divide by
>      * tmp.calls - 1.
>      */
>     if (counters.calls[kind] > 1)
>         return stddev = sqrt(counters.sum_var_time[kind] / counters.calls[kind]);
>
>     return 0.0;
> }

Yes, that's also a possibility (though this should also take the
"kind" as parameter).  I wanted to avoid adding a new function and
save some redundant code, but I can change it in the next version of
the patch if needed.

> > While doing this refactoring
> > I saw that previous patches were failing to accumulate the buffers used during planning, which is now fixed.
>
> Checked.
> Now buffer usage columns include buffer usage during planning and executing,
> but if we turn off track_planning, buffer usage during planning is also not
> tracked which is good for users who don't want to take into account of that.

Indeed.  Note that there's a similar discussion on adding planning
buffer counters to explain in [1].  I'm unsure if merging planning and
execution counters in the same columns is ok or not.

> What I'm concerned about is column names will not be backward-compatible.
> {total, min, max, mean, stddev}_{plan, exec}_time are the best names which
> correctly show the meaning of its value, but we can't use
> {total, min, max, mean, stddev}_time anymore and it might be harmful for
> some users.
> I don't come up with any good idea for that...

Well, perhaps keeping the old {total, min, max, mean, stddev}_time
would be ok, as those historically meant "execution".  I don't have a
strong opinion here.

[1] https://www.postgresql.org/message-id/20191112205506.rvadbx2dnku3paaw@alap3.anarazel.de



pgsql-hackers by date:

Previous
From: Julien Rouhaud
Date:
Subject: Re: BUG #16109: Postgres planning time is high across version - 10.6vs 10.10
Next
From: Pavel Stehule
Date:
Subject: Re: BUG #16109: Postgres planning time is high across version - 10.6vs 10.10