Re: Planning counters in pg_stat_statements (using pgss_store) - Mailing list pgsql-hackers

From Fujii Masao
Subject Re: Planning counters in pg_stat_statements (using pgss_store)
Date
Msg-id c20a2e2f-527d-cffd-a2c7-dde2bee23b3f@oss.nttdata.com
Whole thread Raw
In response to Re: Planning counters in pg_stat_statements (using pgss_store)  (Julien Rouhaud <rjuju123@gmail.com>)
Responses Re: Planning counters in pg_stat_statements (using pgss_store)  (Julien Rouhaud <rjuju123@gmail.com>)
List pgsql-hackers

On 2020/03/25 22:45, Julien Rouhaud wrote:
> On Wed, Mar 25, 2020 at 10:09:37PM +0900, Fujii Masao wrote:
>>
>> On 2020/03/21 4:30, Julien Rouhaud wrote:
>>> On Fri, Mar 20, 2020 at 05:09:05PM +0300, Sergei Kornilov wrote:
>>>> Hello
>>>>
>>>> Yet another is missed in docs: total_time
>>>
>>> Oh good catch!  I rechecked many time the field, and totally missed that the
>>> documentation is referring to the view, which has an additional column, and not
>>> the function.  Attached v9 fixes that.
>>
>> Thanks for the patch! Here are the review comments from me.
>>
>> -    PGSS_V1_3
>> +    PGSS_V1_3,
>> +    PGSS_V1_8
>>
>> WAL usage patch [1] increments this version to 1_4 instead of 1_8.
>> I *guess* that's because previously this version was maintained
>> independently from pg_stat_statements' version. For example,
>> pg_stat_statements 1.4 seems to have used PGSS_V1_3.
> 
> Oh right.  It seems that I changed that many versions ago, I'm not sure why.
> I'm personally fine with any, but I think this was previously raised and
> consensus was to keep distinct counters.  Unless you prefer to keep it this
> way, I'll send an updated version (with other possible modifications depending
> on the rest of the mail) using PGSS_V1_4.
> 
>> +    /*
>> +     * We can't process the query if no query_text is provided, as pgss_store
>> +     * needs it.  We also ignore query without queryid, as it would be treated
>> +     * as a utility statement, which may not be the case.
>> +     */
>>
>> Could you tell me why the planning stats are not tracked when executing
>> utility statements? In some utility statements like REFRESH MATERIALIZED VIEW,
>> the planner would work.
> 
> I explained that in [1].  The problem is that the underlying statement doesn't
> get the proper stmt_location and stmt_len, so you eventually end up with two
> different entries.

It's not problematic to have two different entries in that case. Right?
The actual problem is that the statements reported in those entries are
very similar? For example, when "create table test as select 1;" is executed,
it's strange to get the following two entries, as you explained.

     create table test as select 1;
     create table test as select 1

But it seems valid to get the following two entries in that case?

     select 1
     create table test as select 1

The former is the nested statement and the latter is the top statement.

> I suggested fixing transformTopLevelStmt() to handle the
> various DDL that can contain optimisable statements, but everyone preferred to
> postpone that for a future enhencement.

Understood. Thanks for explanation!

>> +static BufferUsage
>> +compute_buffer_counters(BufferUsage start, BufferUsage stop)
>> +{
>> +    BufferUsage result;
>>
>> BufferUsageAccumDiff() has very similar logic. Isn't it better to expose
>> and use that function rather than creating new similar function?
> 
> Oh, I thought this wouldn't be acceptable.  That's indeed better so I'll do
> that instead.

Thanks! But of course this is trivial thing, so it's ok to do that later.

>>           values[i++] = Int64GetDatumFast(tmp.rows);
>>           values[i++] = Int64GetDatumFast(tmp.shared_blks_hit);
>>           values[i++] = Int64GetDatumFast(tmp.shared_blks_read);
>>
>> Previously (without the patch) pg_stat_statements_1_3() reported
>> the buffer usage counters updated only in execution phase. But,
>> in the patched version, pg_stat_statements_1_3() reports the total
>> of buffer usage counters updated in both planning and execution
>> phases. Is this OK? I'm not sure how seriously we should ensure
>> the backward compatibility for pg_stat_statements....
> 
> That's indeed a behavior change, although the new behavior is probably better
> as user want to know how much resource a query is consuming overall.  We could
> distinguish all buffers with a plan/exec version, but it seems quite overkill.

Ok.

> 
>> +/* contrib/pg_stat_statements/pg_stat_statements--1.7--1.8.sql */
>>
>> ISTM it's good timing to have also pg_stat_statements--1.8.sql since
>> the definition of pg_stat_statements() is changed. Thought?
> 
> I thought that since CreateExtension() was modified to be able to find it's way
> automatically, we shouldn't provide base version anymore, to minimize
> maintenance burden and also avoid possible bug/discrepancy.  The only drawback
> is that it'll do multiple CREATE or DROP/CREATE of the function usually once
> per database, which doesn't seem like a big problem.

Ok.

Here are other comments.

-        if (jstate)
+        if (kind == PGSS_JUMBLE)

Why is PGSS_JUMBLE necessary? ISTM that we can still use jstate here, instead.

If it's ok to remove PGSS_JUMBLE, we can define PGSS_NUM_KIND(=2) instead
and replace 2 in, e.g., total_time[2] with PGSS_NUM_KIND. Thought?

+      <entry><structfield>total_time</structfield></entry>
+      <entry><type>double precision</type></entry>
+      <entry></entry>
+      <entry>
+        Total time spend planning and executing the statement, in milliseconds
+      </entry>
+     </row>

pg_stat_statements view has this column but the function not.
We should make both have the column or not at all, for consistency?
I'm not sure if it's good thing to expose the sum of total_plan_time
and total_exec_time as total_time. If some users want that, they can
easily calculate it from total_plan_time and total_exec_time by using
their own logic.

+        nested_level++;
+        PG_TRY();

In old thread [1], Tom Lane commented the usage of nested_level
in the planner hook. There seems no reply to that so far. What's
your opinion about that comment?

[1] https://www.postgresql.org/message-id/28980.1515803777@sss.pgh.pa.us

Regards,

-- 
Fujii Masao
NTT DATA CORPORATION
Advanced Platform Technology Group
Research and Development Headquarters



pgsql-hackers by date:

Previous
From: Amit Langote
Date:
Subject: Re: plan cache overhead on plpgsql expression
Next
From: Masahiko Sawada
Date:
Subject: Re: error context for vacuum to include block number