Re: Planning counters in pg_stat_statements (using pgss_store) - Mailing list pgsql-hackers

From Julien Rouhaud
Subject Re: Planning counters in pg_stat_statements (using pgss_store)
Date
Msg-id CAOBaU_Yq6motvaqXE_vBWf=SsRANij=+jE3Xa4qCmzGb24LkzA@mail.gmail.com
Whole thread Raw
In response to RE: Planning counters in pg_stat_statements (using pgss_store)  (legrand legrand <legrand_legrand@hotmail.com>)
Responses RE: Planning counters in pg_stat_statements (using pgss_store)
List pgsql-hackers
On Fri, Mar 22, 2019 at 11:46 PM legrand legrand
<legrand_legrand@hotmail.com> wrote:
>
> Here is a rebased and corrected version .

This patch has multiple trailing whitespace, indent and coding style
issues.  You should consider running pg_indent before submitting a
patch.  I attach the diff after running pgindent if you want more
details about the various issues.

- *     Track statement execution times across a whole database cluster.
+ *     Track statement planning and execution times across a whole cluster.

if we're changing this, we should also fix the fact that's it's not
tracking only the time but various resources?

+       /* calc differences of buffer counters. */
+       bufusage.shared_blks_hit =
+           pgBufferUsage.shared_blks_hit - bufusage_start.shared_blks_hit;
[...]

This is an exact duplication of pgss_ProcessUtility(), it's probably
better to create a macro or a function for that instead.

+               pgss_store("",
+                  parse->queryId,          /* signal that it's a
utility stmt */
+                  -1,

the comment makes no sense, and also you can't pass an empty query
string / unknown len.  There's no guarantee that the entry for the
given queryId won't have been evicted, and in this case you'll create
a new and unrelated entry.

@@ -832,13 +931,13 @@ pgss_post_parse_analyze(ParseState *pstate, Query *query)
     * the normalized string would be the same as the query text anyway, so
     * there's no need for an early entry.
     */
-   if (jstate.clocations_count > 0)
        pgss_store(pstate->p_sourcetext,

Why did you remove this?  pgss_store() isn't free, and asking to
generate a normalized query for a query that doesn't have any constant
or storing the entry early won't do anything useful AFAICT.  Though if
that's useful, you definitely can't remove the test without adapting
the comment and the indentation.

@@ -1249,15 +1351,19 @@ pgss_store(const char *query, uint64 queryId,
        if (e->counters.calls == 0)
            e->counters.usage = USAGE_INIT;

-       e->counters.calls += 1;
-       e->counters.total_time += total_time;
-       if (e->counters.calls == 1)
+       if (planning_time == 0)
+       {
+           e->counters.calls += 1;
+           e->counters.total_time += total_time;
+       }
+
+       if (e->counters.calls == 1 && planning_time == 0)
        {
            e->counters.min_time = total_time;
            e->counters.max_time = total_time;
            e->counters.mean_time = total_time;
        }
-       else
+       else if(planning_time == 0)
        {
            /*
             * Welford's method for accurately computing variance. See
@@ -1276,6 +1382,9 @@ pgss_store(const char *query, uint64 queryId,
            if (e->counters.max_time < total_time)
                e->counters.max_time = total_time;
        }
+       if (planning_time > 0)
+           e->counters.plans += 1;
+       e->counters.planning_time += planning_time;

there are 4 tests to check if planning_time is zero or not, it's quite
messy.  Could you refactor the code to avoid so many tests?  It would
probably be useful to add some asserts to check that we don't provide
both planning_time == 0 and execution related values.  The function's
comment would also need to be adapted to mention the new rationale
with planning_time.

     * hash table entry for the PREPARE (with hash calculated from the query
     * string), and then a different one with the same query string (but hash
     * calculated from the query tree) would be used to accumulate costs of
-    * ensuing EXECUTEs.  This would be confusing, and inconsistent with other
-    * cases where planning time is not included at all.
+    * ensuing EXECUTEs.

the comment about confusing behavior is still valid.

>
> Columns naming has not been modified, I would propose to change it to:
>  - plans: ok
>  - planning_time --> plan_time
>  - calls: ok
>  - total_time --> exec_time
>  - {min,max,mean,stddev}_time: ok
>  - new total_time (being the sum of plan_time and exec_time)

plan_time and exec_time are accumulated counters, so we need to keep
the total_ prefix in any case.  I think it's ok to break the function
output names if we keep some kind of compatibility at the view level
(which can keep total_time as the sum of total_plan_time and
total_exec_time), so current queries against the view wouldn't break,
and get what they probably wanted.

Attachment

pgsql-hackers by date:

Previous
From: Fabrízio de Royes Mello
Date:
Subject: Re: Introduce MIN/MAX aggregate functions to pg_lsn
Next
From: Fabrízio de Royes Mello
Date:
Subject: Lack of new line between IF statements