Re: Re: fix cost subqueryscan wrong parallel cost - Mailing list pgsql-hackers

From David G. Johnston
Subject Re: Re: fix cost subqueryscan wrong parallel cost
Date
Msg-id CAKFQuwaoO7susk1HeLUaKgxmWwRm=oYJUaWaGmOrBMXMn65gAQ@mail.gmail.com
Whole thread Raw
In response to Re: Re: fix cost subqueryscan wrong parallel cost  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Thu, Apr 28, 2022 at 9:53 AM Robert Haas <robertmhaas@gmail.com> wrote:
On Fri, Apr 22, 2022 at 11:55 AM David G. Johnston
<david.g.johnston@gmail.com> wrote:
> On Wed, Apr 20, 2022 at 11:38 PM bucoo@sohu.com <bucoo@sohu.com> wrote:
>>
>> > > for now fuction cost_subqueryscan always using *total* rows even parallel
>> > > path. like this:
>> > >
>> > > Gather (rows=30000)
>> > >   Workers Planned: 2
>> > >   ->  Subquery Scan  (rows=30000) -- *total* rows, should be equal subpath
>> > >         ->  Parallel Seq Scan  (rows=10000)
>> >
>> > OK, that's bad.
>

Gather doesn't require a parallel aware subpath, just a parallel-safe
subpath. In a case like this, the parallel seq scan will divide the
rows from the underlying relation across the three processes executing
it. Each process will pass the rows it receives through its own copy
of the subquery scan. Then, the Gather node will collect all the rows
from all the workers to produce the final result.


Thank you.  I think I got off on a tangent there and do understand the general design here better now.

I feel like the fact that the 2.4 divisor (for 2 planned workers) isn't shown in the explain plan anywhere is an oversight.

To move the original complaint forward a bit I am posting the three plan changes that using path->subpath->rows provokes in the regression tests.

======================================================================
 --
 -- Check for incorrect optimization when IN subquery contains a SRF
 --
 select * from int4_tbl o where (f1, f1) in
   (select f1, generate_series(1,50) / 10 g from int4_tbl i group by f1);

The material difference between the existing plan and this one is the estimation of 250 rows
here compared to 1 row.
So (rel.rows != path->subpath->rows) at the top of cost_subqueryscan
+               ->  Subquery Scan on "ANY_subquery"  (cost=1.06..9.28 rows=250 width=8)
+                     Output: "ANY_subquery".f1, "ANY_subquery".g
+                     Filter: ("ANY_subquery".f1 = "ANY_subquery".g)
+                     ->  Result  (cost=1.06..6.15 rows=250 width=8)
======================================================================
The second plan change is basically this same thing, going from rows=4 to rows=1
causes the plan to include a materialize node.  The shape for purposes of the security barrier
remains correct.
======================================================================
select * from t union select * from t order by 1,3;
Gather here costs 2,600 vs the Append being 2,950 in the existing plan shape.
+               ->  Gather  (cost=0.00..2600.00 rows=120000 width=12)
+                     Workers Planned: 2
+                     ->  Parallel Append  (cost=0.00..2600.00 rows=50000 width=12)
+                           ->  Parallel Seq Scan on t  (cost=0.00..575.00 rows=25000 width=12)
+                           ->  Parallel Seq Scan on t t_1  (cost=0.00..575.00 rows=25000 width=12)
=======================================================================

I've attached the two raw regression output diffs.

Using path->subpath->rows ignores the impact of the node's own filters, but the base pre-filter number is/should be the correct one; though it is difficult to say that with certainty when most of these nodes are discarded and one cannot debug in the middle but only observe the end results.  Disabling that optimization is presently beyond my skill though I may take it up anyway as its likely still orders easier to do, and then hope some of these plans produce using data to check with, than actually diving into a C debugger for the first time.

Reverse engineering the 350 difference may be another approach - namely is it strictly due to the different plan shape or is it due to the number of rows.  The fact that the row difference is 35,000 and the cost is 1% (cpu_tuple_cost = 0.01) of that number seems like a red herring after thinking it through...to many scans plus the differing shapes.

David J.

Attachment

pgsql-hackers by date:

Previous
From: Pavel Stehule
Date:
Subject: Re: Missing can't-assign-to-constant checks in plpgsql
Next
From: "houzj.fnst@fujitsu.com"
Date:
Subject: RE: Perform streaming logical transactions by background workers and parallel apply