Re: On disable_cost - Mailing list pgsql-hackers

From David Rowley
Subject Re: On disable_cost
Date
Msg-id CAApHDvoND8j8Pg6OuhqhDkaOmw5cEfQUUN0xf5ErxX=br+hz_A@mail.gmail.com
Whole thread Raw
In response to Re: On disable_cost  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: On disable_cost
Re: On disable_cost
List pgsql-hackers
On Fri, 4 Oct 2024 at 02:15, Robert Haas <robertmhaas@gmail.com> wrote:
> robert.haas=# explain with recursive foo as (select count(*) from
> pgbench_accounts union all select aid from pgbench_accounts a, foo
> where aid > foo.count) select * from pgbench_accounts, foo where aid =
> foo.count;

> You might expect that the number of disabled nodes for the hash join
> would include the number for the CTE attached to it, but it doesn't. I
> suspect similar things will happen when a node has an InitPlan or
> SubPlan node attached to it. (I have not tested whether your patch
> gets these cases right.)

It looks fine with the patch. The crux of the new logic is just
summing up the disabled_nodes from the child nodes and checking if the
disabled_nodes of the current node is higher than that sum. That's not
exactly hard logic. The biggest risk seems to be not correctly
visiting all the child nodes. I got that wrong with SubqueryScan in my
first PoC.

> For sure. To be clear, it's not that I love the current output. It's
> that I'm worried that it will be hard to get the thing that you and
> Laurenz want to be fully reliable, and I think there's a chance that
> not only might it contain bugs now, but it might turn out that people
> changing logic in this area in the future introduce more bugs.
> plan_is_disabled() has to get exactly the correct answer for the child
> nodes every time, or the answer is wrong, and I'm not as confident as
> you are that your logic is fully correct (which doesn't mean that I
> can prove to you that it is incorrect; I don't even know that it is).

One thing the patch did cause me to find is the missing propagation of
disabled_nodes in make_sort(). It was very obviously wrong with the
patched EXPLAIN output and wasn't so obvious with the current output,
so perhaps you could look at this patch as a better way of ensuring
the disable_node propagation is correct. That's much harder logic to
get right than what I've added to explain.c as it's spread out in many
places.

Take this case, for example:

create table lp (a int) partition by list(a);
create table lp1 partition of lp for values in(1);
create table lp2 partition of lp for values in(2);
create index on lp1(a);
insert into lp select 1 from generate_Series(1,1000000);
analyze lp;
set enable_sort=0;
explain (costs off) select * from lp order by a;

master gives:

 Append
   Disabled Nodes: 1
   ->  Index Only Scan using lp1_a_idx on lp1 lp_1
   ->  Sort
         Sort Key: lp_2.a
         ->  Seq Scan on lp2 lp_2

which isn't correct. Append appears disabled, but it's not. Sort is.
Before I fixed that in the patch, I was incorrectly getting the
"Disabled: true" under the Append. I feel we're more likely to get bug
reports alerting us to incorrect logic when the disabled property only
appears on disabled nodes as there are far fewer of them to look at
and therefore it's more obvious when they're misplaced.

The patched version correctly gives us:

 Append
   ->  Index Only Scan using lp1_a_idx on lp1 lp_1
   ->  Sort
         Disabled: true
         Sort Key: lp_2.a
         ->  Seq Scan on lp2 lp_2

David



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Should rolpassword be toastable?
Next
From: Nathan Bossart
Date:
Subject: Re: Should rolpassword be toastable?