Re: fix cost subqueryscan wrong parallel cost - Mailing list pgsql-hackers

From David G. Johnston
Subject Re: fix cost subqueryscan wrong parallel cost
Date
Msg-id CAKFQuwZtpka_aVE8d4-DT=twRSg1cGtdun5wddZ4Cpv9O+zxXA@mail.gmail.com
Whole thread Raw
In response to Re: fix cost subqueryscan wrong parallel cost  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Fri, Apr 29, 2022 at 7:02 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Richard Guo <guofenglinux@gmail.com> writes:
> Currently subquery scan is using rel->rows (if no parameterization),
> which I believe is not correct. That's not the size the subquery scan
> node in each worker needs to handle, as the rows have been divided
> across workers by the parallel-aware node.

Really?  Maybe I misunderstand the case under consideration, but
what I think will be happening is that each worker will re-execute
the pushed-down subquery in full.  Otherwise it can't compute the
correct answer.  What gets divided across the set of workers is
the total *number of executions* of the subquery, which should be
independent of the number of workers, so that the cost is (more
or less) the same as the non-parallel case.


I've been operating under the belief that a subquery node (or, rather, all nodes) in a path get their row count estimates from self-selectivity * rows provided by the next node down in the path.  In a partial path, eventually the parallel aware node at the bottom of the path divides its row count estimate by the parallel divisor (2.4 for two workers).  That value then returns back up through the path until it hits a gather.  Every node in between, which are almost exclusively parallel-safe (not parallel aware), can just use the row count being percolated up the path (i.e., they get the divided row count in their partial pathing and full row counts in the normal pathing).  The gather node, realizing that the row count it is dealing with has been divided by the parallel divisor, undoes the division in order to provide the correct row count to the non-parallel executing nodes above it.

So a subquery is only ever executed once in a path - but the number of returned rows depends on the number of planned workers done under the assumption that a query executed among 2 workers costs 1/2.4 the amount of the same query done with just the leader.  It is an independent factor compared to everything else going on and so the multiplication can simply be done first to the original row count.

<starts looking for confirmation in the code>

allpaths.c@2974-2975 (generate_gather_paths)
(L: 2993 as well)
(L: 3167 - generate_useful_gather_paths)

rows =
cheapest_partial_path->rows * cheapest_partial_path->parallel_workers;

Shouldn't that be:

rows =
cheapest_partial_path->rows * (get_parallel_divisor(cheapest_partial_path))

?

David J.

pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: fix cost subqueryscan wrong parallel cost
Next
From: Tom Lane
Date:
Subject: Re: fix cost subqueryscan wrong parallel cost