Re: [HACKERS] Planning counters in pg_stat_statements - Mailing list pgsql-hackers

From Thomas Munro
Subject Re: [HACKERS] Planning counters in pg_stat_statements
Date
Msg-id CAEepm=29UCJCwLQnNJfjFD_rLPFEmqA-dOFqGW5UdPG4gOuYoQ@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] Planning counters in pg_stat_statements  (Dmitry Ivanov <d.ivanov@postgrespro.ru>)
Responses Re: [HACKERS] Planning counters in pg_stat_statements  (Dmitry Ivanov <d.ivanov@postgrespro.ru>)
List pgsql-hackers
Hi Dmitry,

On Tue, May 15, 2018 at 11:10 PM, Dmitry Ivanov <d.ivanov@postgrespro.ru> wrote:
> Is anybody still working on this? Are there any plans to add this to
> commitfest?

I am not actively working on this now, but I'll come back to it for
PG12 if you or Lukas don't beat me to it, and I'll help/test/review if
I you do.  It seems there is plenty of demand for the feature and I'll
be very happy to see it.

> I'd like to add planning time to auto_explain, and it turns out that this
> patch is somewhat relevant to that feature.

Sounds very useful.

> The current approach here is to set planning_time in PlannedStmt via
> planner_hook, which (in my opinion) has several flaws:
>
> 1. Depending on the order of extensions in shared_preload_libraries, it
> might not measure time spent on preceding planner hooks.
>
> 2. Provided that there are multiple users of this metric, it might become a
> little too costy to register several hooks with identical purpose.
>
> 3. [Bikeshedding] Although planning time is stored in PlannedStmt, it's
> definitely not an inherent property of a plan. You could have two machines
> with identical settings but quite different planning times due to various
> circumstances (raw CPU power, I/O etc).

Yeah.  Putting that inside the PlannedStmt wasn't very nice.  Another
thing I tried was to have some opaque extension workspace, but that
runs into a number of technical problem.  That idea is definitely
dead.

> I'd argue that it might be better to add a new argument to pg_plan_query()
> and pg_plan_queries() and a new field to QueryDesc, i.e.:
>
> PlannedStmt *
> pg_plan_query(Query *querytree,
>                           int cursorOptions,
>                           ParamListInfo boundParams,
>                           double *planningTime)
>
> List *
> pg_plan_queries(List *querytrees,
>                                 int cursorOptions,
>                                 ParamListInfo boundParams,
>                                 double *planningTime) /* total time as in
> BuildCachedPlan() */
>
> The measured time can later be passed to QueryDesc via PortalDefineQuery().
> Of course, this requires more changes, but the result might be worth it.
>
> What do you think?

So who does the actual timing work in this model?  Does the core code
do the timing, but it'd be disabled by default because NULL is usually
passed in, and you need to register an extension that provides a place
to stick the result in order to turn on the time-measuring code?  If
you mean that it's still the extension that does the timing, it seems
strange to have struct members and parameter arguments for something
specific that the core code doesn't understand.

As a more general version of that, I wondered about having some kind
of associative data structure (hash table, assoc list, etc) that would
somehow travel everywhere with the PlannedStmt, but not inside it,
just for the use of extensions.  Then planning time could be stored in
there by a planner hook, and the fished out by any other hook that
knows the same key (not sure how you manage that key space but I'm
sure you could come up with something).  That could have uses for
other extensions too, and could also be used for the "query ID", which
is, after all, the model that the abandoned planning_time member was
following.  Just a thought.

-- 
Thomas Munro
http://www.enterprisedb.com


pgsql-hackers by date:

Previous
From: Bruce Momjian
Date:
Subject: Re: Postgres 11 release notes
Next
From: Kyotaro HORIGUCHI
Date:
Subject: Re: parallel foreign scan