RE: Planning counters in pg_stat_statements (using pgss_store) - Mailing list pgsql-hackers

From legrand legrand
Subject RE: Planning counters in pg_stat_statements (using pgss_store)
Date
Msg-id DB6PR0301MB21351B7CDA68EFE4838562EC905C0@DB6PR0301MB2135.eurprd03.prod.outlook.com
Whole thread Raw
In response to Re: Planning counters in pg_stat_statements (using pgss_store)  (Julien Rouhaud <rjuju123@gmail.com>)
Responses Re: Planning counters in pg_stat_statements (using pgss_store)
List pgsql-hackers
> This patch has multiple trailing whitespace, indent and coding style
> issues.  You should consider running pg_indent before submitting a
> patch.  I attach the diff after running pgindent if you want more
> details about the various issues.

fixed


> - *     Track statement execution times across a whole database cluster.
> + *     Track statement planning and execution times across a whole cluster.

> if we're changing this, we should also fix the fact that's it's not
> tracking only the time but various resources?

fixed


> +       /* calc differences of buffer counters. */
> +       bufusage.shared_blks_hit =
> +           pgBufferUsage.shared_blks_hit - bufusage_start.shared_blks_hit;> >
> [...]

> This is an exact duplication of pgss_ProcessUtility(), it's probably
> better to create a macro or a function for that instead.

yes, maybe later (I don't know macros)


> +               pgss_store("",
> +                  parse->queryId,          /* signal that it's a
> utility stmt */
> +                  -1,

> the comment makes no sense, and also you can't pass an empty query
> string / unknown len.  There's no guarantee that the entry for the
> given queryId won't have been evicted, and in this case you'll create
> a new and unrelated entry.

Fixed, comment was wrong
Query text is not available in pgss_planner_hook
that's why pgss_store execution is forced in pgss_post_parse_analyze
(to initialize pgss entry with its query text).

There is a very small risk that query has been evicted between
pgss_post_parse_analyze and pgss_planner_hook.



> @@ -832,13 +931,13 @@ pgss_post_parse_analyze(ParseState *pstate, Query *query)
>      * the normalized string would be the same as the query text anyway, so
>      * there's no need for an early entry.
>      */
> -   if (jstate.clocations_count > 0)
>         pgss_store(pstate->p_sourcetext,

> Why did you remove this?  pgss_store() isn't free, and asking to
> generate a normalized query for a query that doesn't have any constant
> or storing the entry early won't do anything useful AFAICT.  Though if
> that's useful, you definitely can't remove the test without adapting
> the comment and the indentation.

See explanation in previous answer (comments have been updated accordingly)


> there are 4 tests to check if planning_time is zero or not, it's quite
> messy.  Could you refactor the code to avoid so many tests?  It would
> probably be useful to add some asserts to check that we don't provide
> both planning_time == 0 and execution related values.  The function's
> comment would also need to be adapted to mention the new rationale
> with planning_time.

Fixed


>     * hash table entry for the PREPARE (with hash calculated from the query
>     * string), and then a different one with the same query string (but hash
>     * calculated from the query tree) would be used to accumulate costs of
> -    * ensuing EXECUTEs.  This would be confusing, and inconsistent with other
> -    * cases where planning time is not included at all.
> +    * ensuing EXECUTEs.

> the comment about confusing behavior is still valid.

Fixed


>> Columns naming has not been modified, I would propose to change it to:
>>  - plans: ok
>>  - planning_time --> plan_time
>>  - calls: ok
>>  - total_time --> exec_time
>>  - {min,max,mean,stddev}_time: ok
>>  - new total_time (being the sum of plan_time and exec_time)

> plan_time and exec_time are accumulated counters, so we need to keep
> the total_ prefix in any case.  I think it's ok to break the function
> output names if we keep some kind of compatibility at the view level
> (which can keep total_time as the sum of total_plan_time and
> total_exec_time), so current queries against the view wouldn't break,
> and get what they probably wanted.

before to change this at all (view, function, code, doc) levels,
I would like to be sure that column names will be:
  - plans
  - total_plan_time
  - calls
  - total_exec_time
  - min_time (without exec in name)
  - max_time (without exec in name)
  - mean_time (without exec in name)
  - stddev_time (without exec in name)
  - total_time (being the sum of total_plan_time and total_exec_time)

could other users confirm ?


Attachment

pgsql-hackers by date:

Previous
From: Chapman Flack
Date:
Subject: Re: Fix XML handling with DOCTYPE
Next
From: Chapman Flack
Date:
Subject: Re: The two "XML Fixes" patches still in need of review