Re: On disable_cost - Mailing list pgsql-hackers

From David G. Johnston
Subject Re: On disable_cost
Date
Msg-id CAKFQuwY9Emt4P2jU6KWiK=xaMp_JQvyS4PJG6BTUHWhwkD3+KQ@mail.gmail.com
Whole thread Raw
In response to Re: On disable_cost  (David Rowley <dgrowleyml@gmail.com>)
List pgsql-hackers
On Fri, Oct 4, 2024 at 10:37 PM David Rowley <dgrowleyml@gmail.com> wrote:
I'd encourage anyone else on the sidelines who has an opinion on how
to display the disabled-ness of a plan node in EXPLAIN to speak up
now, even if it's just a +1 to something someone has already written.
It would be nice to see what more people think.


 As a DBA when I set one or more of the enable_* settings to false, and explain a query, I need to know:

1, that the plan shown to me is constrained,
2, which constraints are in place, and
3, which constraints were violated.

The Settings option to Explain fulfills my second need.  It is not a precise solution nor is it automatic.  Because of these two things it really doesn't qualify as fulfilling the first need.

To fulfill the first need I would want to see a data block containing the following information:
How many (>= 1) enable_* settings are set to false.  This is the bare requirement, but we can also include a count of how many violations exist, thus aggregating the count of the third need. This information is not specific to any node and thus should be displayed outside of the execution tree, the specific choice consistent with the output format under consideration.

The detail for the third need, violations, is tied to specific executor nodes.  The information provided here should inform me as to which specific setting was violated as well as, if possible, why. This is effectively three pieces of information:  "Disabled: * (footnote)"  The word disabled is the indicator that this node type was requested to not be included in the query plan.  The * tells me exactly which of the disabled settings is at play here, reducing the cognitive burden of memorizing node types to settings.  The footnote would be a reference into the documentation under the enable_* setting that explains why this node is appearing in the query plan even though I explicitly asked for it to be excluded.  In a verbose output (add a new violations option for this) it would even be possible to save the trip to the documentation by adding the footnote text to the explain output.

Now, existing proposals include another piece of data - for every node calculate how many violations occur in its tree (inclusive).  I'm not understanding the motivation for this data.  Knowing which nodes are violations seems like it is enough.  I could always count, and processing tools could add this aggregate to their displays, but the count itself only seems useful at the scope of the entire query plan.  And possibly sub-plans.

So, by way of example:

set enable_sort=0;
explain (costs off, settings, violations) select * from lp order by a;

 Append
   ->  Index Only Scan using lp1_a_idx on lp1 lp_1
   ->  Sort
         Disabled: Sort (A)
         Sort Key: lp_2.a
         ->  Seq Scan on lp2 lp_2
         
Disabled Planner Settings: 1
Disabled Node Violations: 1
Settings:
     ...
    enable_sort = false
....
Violation Reasons:
Sort (A): The query contains an order by clause over data coming from a table being sequentially scanned.  This scan's output must be sorted to fulfill the order by requirement.

I was considering doing a layout like:

Sort (disabled_sort.A) (cost...) (actual...)

but having its own line on those nodes having the violation seems reasonable.  It should be noticeable when the violations occur and this does stand out.  The less pronounced format would be more appropriate for the "Disabled: #" display that would appear on every single node; which I believe is counter-productive.  Only marking the violation provides the same amount of detail and allows for the computation of those counts should the need arise.  As a DBA, though, I do not know how to use that count in a meaningful way.

In text format we place additional information at the bottom of the query result.  It is worth considering whether to place information before the planner tree.  If that is acceptable the two "Disabled Planner %:" counts should be moved to before the node tree.  This draws immediate attention to the explain output consumer that this plan is constrained and that other options, like settings and violations, should be added to the explain command to show additional details.  But the two counts and the node detail "Disabled: * (footnote)" will always be visible.

The footnote definitely is its own feature added to increase usability.  I'm expecting it to not be accepted given the current design of explain, and also it seems quite difficult to get good data out of the planner to make the display accurate.  But if we tell someone that a setting they disable is violated they are going to wonder why.

David J.


pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: On disable_cost
Next
From: Peter Geoghegan
Date:
Subject: Re: POC, WIP: OR-clause support for indexes