Re: On disable_cost - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: On disable_cost
Date
Msg-id 591b3596-2ea0-4b8e-99c6-fad0ef2801f5@iki.fi
Whole thread Raw
In response to Re: On disable_cost  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: On disable_cost
List pgsql-hackers
On 28/06/2024 18:46, Robert Haas wrote:
> On Wed, Jun 12, 2024 at 11:35 AM Robert Haas <robertmhaas@gmail.com> wrote:
>> Well, that didn't generate much discussion, but here I am trying
>> again. Here I've got patches 0001 and 0002 from my previous posting;
>> I've dropped 0003 and 0004 from the previous set for now so as not to
>> distract from the main event, but they may still be a good idea.
>> Instead I've got an 0003 and an 0004 that implement the "count of
>> disabled nodes" approach that we have discussed previously. This seems
>> to work fine, unlike the approaches I tried earlier. I think this is
>> the right direction to go, but I'd like to know what concerns people
>> might have.
> 
> Here is a rebased patch set, where I also fixed pgindent damage and a
> couple of small oversights in 0004.
> 
> I am hoping to get these committed some time in July. So if somebody
> thinks that's too soon or thinks it shouldn't happen at all, please
> don't wait too long to let me know about that.

v3-0001-Remove-grotty-use-of-disable_cost-for-TID-scan-pl.patch:

+1, this seems ready for commit

v3-0002-Rationalize-behavior-of-enable_indexscan-and-enab.patch:

I fear this will break people's applications, if they are currently 
forcing a sequential scan with "set enable_indexscan=off". Now they will 
need to do "set enable_indexscan=off; set enable_indexonlyscan=off" for 
the same effect. Maybe it's acceptable, disabling sequential scans to 
force an index scan is much more common than the other way round.

v3-0003-Treat-number-of-disabled-nodes-in-a-path-as-a-sep.patch:

> @@ -1318,6 +1342,12 @@ cost_tidscan(Path *path, PlannerInfo *root,
>      startup_cost += path->pathtarget->cost.startup;
>      run_cost += path->pathtarget->cost.per_tuple * path->rows;
>  
> +    /*
> +     * There are assertions above verifying that we only reach this function
> +     * either when enable_tidscan=true or when the TID scan is the only legal
> +     * path, so it's safe to set disabled_nodes to zero here.
> +     */
> +    path->disabled_nodes = 0;
>      path->startup_cost = startup_cost;
>      path->total_cost = startup_cost + run_cost;
>  }

So if you have enable_tidscan=off, and have a query with "WHERE CURRENT 
OF foo" that is planned with a TID scan, we set disable_nodes = 0? That 
sounds wrong, shouldn't disable_nodes be 1 in that case? It probably 
cannot affect the rest of the plan, given that "WHERE CURRENT OF" is 
only valid in an UPDATE or DELETE, but still. At least it deserves a 
better explanation in the comment.

> diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
> index 6b64c4a362..20236e8c4d 100644
> --- a/src/backend/optimizer/plan/createplan.c
> +++ b/src/backend/optimizer/plan/createplan.c
> @@ -25,6 +25,7 @@
>  #include "nodes/extensible.h"
>  #include "nodes/makefuncs.h"
>  #include "nodes/nodeFuncs.h"
> +#include "nodes/print.h"
>  #include "optimizer/clauses.h"
>  #include "optimizer/cost.h"
>  #include "optimizer/optimizer.h"

left over from debugging?

> @@ -68,6 +68,15 @@ static bool pathlist_is_reparameterizable_by_child(List *pathlist,
>  int
>  compare_path_costs(Path *path1, Path *path2, CostSelector criterion)
>  {
> +    /* Number of disabled nodes, if different, trumps all else. */
> +    if (unlikely(path1->disabled_nodes != path2->disabled_nodes))
> +    {
> +        if (path1->disabled_nodes < path2->disabled_nodes)
> +            return -1;
> +        else
> +            return +1;
> +    }
> +
>      if (criterion == STARTUP_COST)
>      {
>          if (path1->startup_cost < path2->startup_cost)

Is "unlikely()" really appropriate here (and elsewhere in the patch)? If 
you run with enable_seqscan=off but have no indexes, you could take that 
path pretty often.

If this function needs optimizing, I'd suggest splitting it into two 
functions, one for comparing the startup cost and another for the total 
cost. Almost all callers pass a constant for that argument, so they 
might as well call the correct function directly and avoid the branch 
for that.

> @@ -658,6 +704,20 @@ add_path_precheck(RelOptInfo *parent_rel,
>          Path       *old_path = (Path *) lfirst(p1);
>          PathKeysComparison keyscmp;
>  
> +        /*
> +         * Since the pathlist is sorted by disabled_nodes and then by
> +         * total_cost, we can stop looking once we reach a path with more
> +         * disabled nodes, or the same number of disabled nodes plus a
> +         * total_cost larger than the new path's.
> +         */
> +        if (unlikely(old_path->disabled_nodes != disabled_nodes))
> +        {
> +            if (disabled_nodes < old_path->disabled_nodes)
> +                break;
> +        }
> +        else if (total_cost <= old_path->total_cost * STD_FUZZ_FACTOR)
> +            break;
> +
>          /*
>           * We are looking for an old_path with the same parameterization (and
>           * by assumption the same rowcount) that dominates the new path on
> @@ -666,39 +726,27 @@ add_path_precheck(RelOptInfo *parent_rel,
>           *
>           * Cost comparisons here should match compare_path_costs_fuzzily.
>           */
> -        if (total_cost > old_path->total_cost * STD_FUZZ_FACTOR)
> +        /* new path can win on startup cost only if consider_startup */
> +        if (startup_cost > old_path->startup_cost * STD_FUZZ_FACTOR ||
> +            !consider_startup)
>          {

The "Cost comparisons here should match compare_path_costs_fuzzily" 
comment also applies to the check on total_cost that you moved up. Maybe 
move up the comment to the beginning of the loop.

v3-0004-Show-number-of-disabled-nodes-in-EXPLAIN-ANALYZE-.patch:

It's surprising that the "Disable Nodes" is printed even with the COSTS 
OFF option. It's handy for our regression tests, it's good to print them 
there, but it feels wrong.

Could we cram it into the "cost=... rows=..." part? And perhaps a marker 
that a node was disabled would be more user friendly than showing the 
cumulative count? Something like:

postgres=# set enable_material=off;
SET
postgres=# set enable_seqscan=off;
SET
postgres=# set enable_bitmapscan=off;
SET
postgres=# explain select * from foo, bar;
                                      QUERY PLAN 

------------------------------------------------------------------------------------
  Nested Loop  (cost=0.15..155632.40 rows=6502500 width=8)
    ->  Index Only Scan using foo_i_idx on foo  (cost=0.15..82.41 
rows=2550 width=4)
    ->  Seq Scan on bar  (cost=0.00..35.50 (disabled) rows=2550 width=4)
(5 rows)

-- 
Heikki Linnakangas
Neon (https://neon.tech)




pgsql-hackers by date:

Previous
From: Alvaro Herrera
Date:
Subject: Re: LogwrtResult contended spinlock
Next
From: Andrew Dunstan
Date:
Subject: Re: Cleaning up perl code