Re: Proposal: Progressive explain - Mailing list pgsql-hackers

From Rafael Thofehrn Castro
Subject Re: Proposal: Progressive explain
Date
Msg-id CAG0ozMpwRQFNraWLGVa0zAFBYuY1HasZ6NX4BAM4H3OOrQHtDg@mail.gmail.com
Whole thread Raw
In response to Re: Proposal: Progressive explain  ("Euler Taveira" <euler@eulerto.com>)
Responses Re: Proposal: Progressive explain
List pgsql-hackers
Thanks for the valuable inputs Euler. Adjusted most of your recommendations.

> I found a crash. It is simple to reproduce.

Indeed, I failed to test plain EXPLAIN after the addition of the new GUC
progressive_explain_min_duration. This is fixed.

> You call this feature "progressive explain". My first impression is that it
> will only provide the execution plans for EXPLAIN commands. Instead of
> "progressive explain", I would suggest "query progress" that is a general
> database terminology. It seems natural to use "progressive explain" since you
> are using the explain infrastructure (including the same options -- format,
> settings, wal, ...) -- to print the execution plan.

Makes sense. Kept progressive explain for now but this is still open for
discussion.

> There is no use for the function argument. If you decide to keep this function,
remove it.

Done.

> Why don't you use the pgstat_progress_XXX() API? Since you are using a
> pg_stat_progress_XXX view name I would expect using the command progress
> reporting infrastructure (see backend_progress.c).

I haven't changed that part as of now. My implementation and underlying data
structure may not work well with that API, but I am investigating.

> Maybe you could include datid and datname as the other progress reporting
> views. It would avoid a join to figure out what the database is.

Done.

> Isn't it the same definition as in auto_explain.c? Use only one definition for
> auto_explain and this feature. You can put this struct into explain.c, use an
> extern definition for guc_tables.c and put a extern PGDLLIMPORT defintion into
> guc.h. See wal_level_options, for an example.

Done.

> The "analyze" is a separate option in auto_explain. Should we have 2 options?
> One that enable/disable this feature and another one that enable/disable
> analyze option.

Tomas Vondra proposed the current logic and I think it makes sense. Having a
single GUC to control the execution behavior keeps the feature simpler IMO.

> Don't the other EXPLAIN options make sense here? Like serialize and summary.

I added a missing GUC for option COSTS (progressive_explain_costs). Adding 
the other ones doesn't make much sense IMO. SUMMARY, SERIALIZE and MEMORY 
are information added at the end of the query execution (or plan creation for plain
EXPLAIN) in the summary section but at that point the progressive explain will be
already finished, with no more information in pg_stat_progress_explain.


> TupleDesc   tupDesc;        /* descriptor for result tuples */
> EState     *estate;         /* executor's query-wide state */
> PlanState  *planstate;      /* tree of per-plan-node state */
> +   struct ExplainState *pe_es; /* progressive explain state if enabled */

> Should you use the same name pattern here? pestate, for example.

Done.

> PG_LWLOCK(52, SerialControl)
> +PG_LWLOCK(53, ExplainHash)

> Could you use a specific name? Even if you keep the proposed name, you should
> use ProgressiveExplainHash, ProgressiveExplain or QueryProgress.

Changed to ProgressiveExplainHash.

> You don't need progressiveExplainHashKey. Use pid as key directly.

Done.

> I don't think last_print is accurate because it is not the time the execution plan
> is printed but the time it was updated. I suggest last_updated_time.

Changed from last_print to last_update. This is still open for discussion.

> I'm wondering why you use "array" in the name. ProgressiveExplainHash is a
> better name.

Used to be compatible with the ProcArray (that is also a hash). But what you
proposed is better indeed. Changed.

> I think you need a better name for PROGRESSIVE_EXPLAIN_ALLOC_SIZE because it
> doesn't reflect what it is. PROGRESSIVE_EXPLAIN_FREE_SIZE or
> PROGRESSIVE_EXPLAIN_AVAIL_SIZE?

Changed to PROGRESSIVE_EXPLAIN_FREE_SIZE.

Fixed the wrong function names in the comments and changed the format of those
comments in function headers to be comptible with other functions in explain.c.

> +   /* state related to progressive explains */
> +   struct PlanState *pe_curr_node;
> +   struct Instrumentation *pe_local_instr;
> +   dsa_area   *pe_a;

> Could you add some comments saying what each of these variables are for?

Done.

> I didn't experiment but I was wondering if there is a way to avoid the
> duplicates that you added to avoid the overhead.

You mean the local instrumentation object reused for each node?

Rafael.
Attachment

pgsql-hackers by date:

Previous
From: "David G. Johnston"
Date:
Subject: Re: Adding a '--clean-publisher-objects' option to 'pg_createsubscriber' utility.
Next
From: Christoph Berg
Date:
Subject: Re: Available disk space per tablespace