> This proposal introduces a feature to print execution plans of active
> queries in an in-memory shared hash object so that other sessions can
> visualize them with a new view: pg_stat_progress_explain.
Thanks for this thread and for sharing the presentation
material. +1 for the idea of adding instrumentation that
will help users understand the bottlenecks in execution
plans. I want to share my perspective on this topic.
A DBA will want to know:
1/ Where is the bottleneck for a long running query currently
in flight?
2/ For a OLTP workload with many quick plans that
could be further optimized; what plan and what
part of the plan is contributing to the database load?
Having a view like pg_stat_progress_explain ( maybe a more
appropriate name is pg_stat_progress_plan ) will be
extremely useful to allow a user to build monitoring
dashboards to be able to answer such questions.
I do not think however this instrumentation should only be
made available if a user runs EXPLAIN ANALYZE.
In my opinion, this will severely limit the usefulness of this
instrumentation in production. Of course, one can use auto_explain,
but users will be hesitant to enable auto_explain with analyze in
production for all their workloads. Also, there should not be an
auto_explain dependency for this feature.
One approach will be for the view to expose the
explain plan and the current node being executed. I think the
plan_node_id can be exposed for this purpose but have not looked
into this in much detail yet. The plan_node_id can then be used
to locate the part of the plan that is a potential bottleneck ( if that
plan node is the one constantly being called ).
This may also be work that is better suited for an extension, but
core will need to add a hook in ExecProcNode so an extension can
have access to PlanState.
Regards,
Sami Imseih
Amazon Web Services (AWS)