Re: Problem about postponing gathering partial paths for topmost scan/join rel - Mailing list pgsql-hackers

From Richard Guo
Subject Re: Problem about postponing gathering partial paths for topmost scan/join rel
Date
Msg-id CAMbWs49pyveUf81b4prmobCrjL5yoCHJTYf1ey88PkruzRzL7Q@mail.gmail.com
In response to Re: Problem about postponing gathering partial paths for topmost scan/join rel  (Antonin Houska <ah@cybertec.at>)
Responses Re: Problem about postponing gathering partial paths for topmost scan/join rel
List pgsql-hackers

On Thu, Jul 14, 2022 at 10:02 PM Antonin Houska <ah@cybertec.at> wrote:
> I'd prefer a test that demonstrates that the Gather node at the top of the
> "subproblem plan" is useful purely from the *cost* perspective, rather than
> due to executor limitation.

This patch provides an additional path (a Gather on top of the subproblem)
that was not available before. But your concern makes sense: we need to
show that this new path is valuable because it can compete on cost with
other paths.

How about we make the topmost join a Nested Loop? Something like:

set join_collapse_limit to 2;

# explain (costs off) select * from foo a join foo b on a.i = b.i join foo c on b.i > c.i;
                     QUERY PLAN
----------------------------------------------------
 Nested Loop
   Join Filter: (b.i > c.i)
   ->  Gather
         Workers Planned: 4
         ->  Parallel Hash Join
               Hash Cond: (a.i = b.i)
               ->  Parallel Seq Scan on foo a
               ->  Parallel Hash
                     ->  Parallel Seq Scan on foo b
   ->  Materialize
         ->  Gather
               Workers Planned: 4
               ->  Parallel Seq Scan on foo c
(13 rows)

Without the patch, the path with a Gather on top of the subproblem is not
available, and we would get:

# explain (costs off) select * from foo a join foo b on a.i = b.i join foo c on b.i > c.i;
                     QUERY PLAN
----------------------------------------------------
 Nested Loop
   Join Filter: (b.i > c.i)
   ->  Hash Join
         Hash Cond: (a.i = b.i)
         ->  Gather
               Workers Planned: 4
               ->  Parallel Seq Scan on foo a
         ->  Hash
               ->  Gather
                     Workers Planned: 4
                     ->  Parallel Seq Scan on foo b
   ->  Materialize
         ->  Gather
               Workers Planned: 4
               ->  Parallel Seq Scan on foo c
(15 rows)
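
For anyone who wants to try the comparison, here is a minimal sketch of a
setup that might produce plans of this shape. The definition of foo, the row
count, and the parallel cost settings are assumptions for illustration; they
are not stated in this message, and the plan actually chosen may differ
depending on version and configuration.

```sql
-- Hypothetical table; the columns and row count are assumptions.
create table foo (i int, j int);
insert into foo select g, g from generate_series(1, 10000) g;
analyze foo;

-- Assumed settings that make parallel plans cheap enough to be considered.
set max_parallel_workers_per_gather to 4;
set parallel_setup_cost to 0;
set parallel_tuple_cost to 0;
set min_parallel_table_scan_size to 0;

-- From the example above: keeps (a join b) as a separate subproblem.
set join_collapse_limit to 2;

explain (costs off)
select * from foo a join foo b on a.i = b.i join foo c on b.i > c.i;
```

Running the same EXPLAIN on builds with and without the patch should show
whether the Gather is placed above the a/b join or below it.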

Thanks
Richard 
