Re: Re: fix cost subqueryscan wrong parallel cost - Mailing list pgsql-hackers

From Richard Guo
Subject Re: Re: fix cost subqueryscan wrong parallel cost
Date
Msg-id CAMbWs4_QVQXaTZsUYUdqm8dumCsrDdiSF5Oatg_m7wdrZ8tWZQ@mail.gmail.com
In response to Re: Re: fix cost subqueryscan wrong parallel cost  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: fix cost subqueryscan wrong parallel cost  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers

On Fri, Apr 29, 2022 at 12:53 AM Robert Haas <robertmhaas@gmail.com> wrote:
> Gather doesn't require a parallel aware subpath, just a parallel-safe
> subpath. In a case like this, the parallel seq scan will divide the
> rows from the underlying relation across the three processes executing
> it. Each process will pass the rows it receives through its own copy
> of the subquery scan. Then, the Gather node will collect all the rows
> from all the workers to produce the final result.
>
> It's an extremely important feature of parallel query that the
> parallel-aware node doesn't have to be immediately beneath the Gather.
> You need to have a parallel-aware node in there someplace, but it
> could be separated from the gather by any number of levels e.g.
>
> Gather
> -> Nested Loop
>   -> Nested Loop
>     -> Nested Loop
>       -> Parallel Seq Scan
>       -> Index Scan
>     -> Index Scan
>   -> Index Scan

Thanks for the explanation. That's really helpful for understanding the
parallel query mechanism.

So for the nodes between the Gather and the parallel-aware node, how
should we calculate their estimated rows?

Currently, the subquery scan uses rel->rows (when there is no
parameterization), which I believe is not correct. That is not the
number of rows the subquery scan node in each worker needs to handle,
since the rows have already been divided across the workers by the
parallel-aware node.

Using subpath->rows is not correct either, as the subquery scan node
may have quals of its own that filter out some of those rows.

It seems to me the right way is to divide rel->rows among all the
workers.
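
As a rough illustration of that idea (not the actual patch), the per-worker
estimate could be sketched in standalone C as below. The function names are
hypothetical. The divisor heuristic, where the leader contributes
1.0 - 0.3 * workers when that is positive, is modeled on what I understand
get_parallel_divisor() in costsize.c to do, and the clamp is a simplified
stand-in for clamp_row_est().

```c
#include <stdbool.h>

/*
 * Hypothetical sketch: how many processes' worth of work share the rows.
 * Assumption: the leader helps less as the number of workers grows, and
 * stops contributing once 0.3 * workers reaches 1.0.
 */
static double
parallel_divisor_sketch(int parallel_workers, bool leader_participates)
{
    double divisor = (double) parallel_workers;

    if (leader_participates)
    {
        double leader_contribution = 1.0 - (0.3 * parallel_workers);

        if (leader_contribution > 0)
            divisor += leader_contribution;
    }
    return divisor;
}

/*
 * Hypothetical sketch: per-worker row estimate for a node that sits between
 * the Gather and the parallel-aware node. rel_rows is the total estimate
 * for the relation (quals already applied), which is divided among the
 * participating processes.
 */
static double
per_worker_rows_sketch(double rel_rows, int parallel_workers,
                       bool leader_participates)
{
    double divisor = parallel_divisor_sketch(parallel_workers,
                                             leader_participates);
    double rows;

    /* No parallelism at all: fall back to the total estimate. */
    if (divisor <= 0)
        divisor = 1.0;

    rows = rel_rows / divisor;

    /* Simplified clamp: at least one row, rounded to the nearest integer. */
    if (rows <= 1.0)
        return 1.0;
    return (double) (long long) (rows + 0.5);
}
```

With rel_rows = 1200 and two workers plus a participating leader, the
divisor comes out at about 2.4, so each process would be estimated to
handle about 500 rows rather than all 1200.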

Thanks
Richard
