Thread: Should consider materializing the cheapest inner path in consider_parallel_nestloop()

Hi all,

   I recently ran a benchmark[1] against master and found a performance problem, shown below:

explain analyze select
  subq_0.c0 as c0,
  subq_0.c1 as c1,
  subq_0.c2 as c2
from
  (select
        ref_0.l_shipmode as c0,
        sample_0.l_orderkey as c1,
        sample_0.l_quantity as c2,
        ref_0.l_orderkey as c3,
        sample_0.l_shipmode as c5,
        ref_0.l_shipinstruct as c6
      from
        public.lineitem as ref_0
          left join public.lineitem as sample_0
          on ((select p_partkey from public.part order by p_partkey limit 1)
                 is not NULL)
      where sample_0.l_orderkey is NULL) as subq_0
where subq_0.c5 is NULL
limit 1;
                                                               QUERY PLAN                                                                
-----------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=78.00..45267050.75 rows=1 width=27) (actual time=299695.097..299695.099 rows=0 loops=1)
   InitPlan 1 (returns $0)
     ->  Limit  (cost=78.00..78.00 rows=1 width=8) (actual time=0.651..0.652 rows=1 loops=1)
           ->  Sort  (cost=78.00..83.00 rows=2000 width=8) (actual time=0.650..0.651 rows=1 loops=1)
                 Sort Key: part.p_partkey
                 Sort Method: top-N heapsort  Memory: 25kB
                 ->  Seq Scan on part  (cost=0.00..68.00 rows=2000 width=8) (actual time=0.013..0.428 rows=2000 loops=1)
   ->  Nested Loop Left Join  (cost=0.00..45266972.75 rows=1 width=27) (actual time=299695.096..299695.096 rows=0 loops=1)
         Join Filter: ($0 IS NOT NULL)
         Filter: ((sample_0.l_orderkey IS NULL) AND (sample_0.l_shipmode IS NULL))
         Rows Removed by Filter: 3621030625
         ->  Seq Scan on lineitem ref_0  (cost=0.00..1969.75 rows=60175 width=11) (actual time=0.026..6.225 rows=60175 loops=1)
         ->  Materialize  (cost=0.00..2270.62 rows=60175 width=27) (actual time=0.000..2.554 rows=60175 loops=60175)
               ->  Seq Scan on lineitem sample_0  (cost=0.00..1969.75 rows=60175 width=27) (actual time=0.004..8.169 rows=60175 loops=1)
 Planning Time: 0.172 ms
 Execution Time: 299695.501 ms
(16 rows)

After I set enable_material to off, the same query runs faster, as shown below:
set enable_material = off;
explain analyze  select
  subq_0.c0 as c0,
  subq_0.c1 as c1,
  subq_0.c2 as c2
from
  (select
        ref_0.l_shipmode as c0,
        sample_0.l_orderkey as c1,
        sample_0.l_quantity as c2,
        ref_0.l_orderkey as c3,
        sample_0.l_shipmode as c5,
        ref_0.l_shipinstruct as c6
      from
        public.lineitem as ref_0
          left join public.lineitem as sample_0
          on ((select p_partkey from public.part order by p_partkey limit 1)
                 is not NULL)
      where sample_0.l_orderkey is NULL) as subq_0
where subq_0.c5 is NULL
limit 1;
                                                                  QUERY PLAN                                                                  
-----------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=1078.00..91026185.57 rows=1 width=27) (actual time=192669.605..192670.425 rows=0 loops=1)
   InitPlan 1 (returns $0)
     ->  Limit  (cost=78.00..78.00 rows=1 width=8) (actual time=0.662..0.663 rows=1 loops=1)
           ->  Sort  (cost=78.00..83.00 rows=2000 width=8) (actual time=0.661..0.662 rows=1 loops=1)
                 Sort Key: part.p_partkey
                 Sort Method: top-N heapsort  Memory: 25kB
                 ->  Seq Scan on part  (cost=0.00..68.00 rows=2000 width=8) (actual time=0.017..0.430 rows=2000 loops=1)
   ->  Gather  (cost=1000.00..91026107.57 rows=1 width=27) (actual time=192669.604..192670.422 rows=0 loops=1)
         Workers Planned: 1
         Params Evaluated: $0
         Workers Launched: 1
         ->  Nested Loop Left Join  (cost=0.00..91025107.47 rows=1 width=27) (actual time=192588.143..192588.144 rows=0 loops=2)
               Join Filter: ($0 IS NOT NULL)
               Filter: ((sample_0.l_orderkey IS NULL) AND (sample_0.l_shipmode IS NULL))
               Rows Removed by Filter: 1810515312
               ->  Parallel Seq Scan on lineitem ref_0  (cost=0.00..1721.97 rows=35397 width=11) (actual time=0.007..3.797 rows=30088 loops=2)
               ->  Seq Scan on lineitem sample_0  (cost=0.00..1969.75 rows=60175 width=27) (actual time=0.000..2.637 rows=60175 loops=60175)
 Planning Time: 0.174 ms
 Execution Time: 192670.458 ms
(19 rows)

I debugged the code and found that consider_parallel_nestloop() doesn't consider the materialized form of the cheapest inner path.
When enable_material = true, we can see that the Material path won in the first plan, but no Parallel Seq Scan node was added as the outer path. That is because
in try_partial_nestloop_path() the cost of the nestloop was computed using the seq scan path, not the material path.

[1] includes the test table schema and data, so you can reproduce the problem described above.

I try to fix this problem in the attached patch; I found that PostgreSQL 12.12 also has this issue. Please review my patch, thanks!
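To make the idea concrete, here is a minimal sketch of the shape of the change in consider_parallel_nestloop() (my reconstruction for illustration, not the attached patch itself; names follow src/backend/optimizer/path/joinpath.c):

/*
 * Sketch only: build a Material path over the cheapest total inner path
 * and try it for each partial outer path, alongside the existing
 * candidates.
 */
ListCell   *lc1;
Path       *matpath = NULL;

if (inner_cheapest_total != NULL &&
    inner_cheapest_total->parallel_safe)
    matpath = (Path *) create_material_path(innerrel, inner_cheapest_total);

foreach(lc1, outerrel->partial_pathlist)
{
    Path       *outerpath = (Path *) lfirst(lc1);
    List       *pathkeys;

    /* compute the join pathkeys, as the existing loop already does */
    pathkeys = build_join_pathkeys(root, joinrel, jointype,
                                   outerpath->pathkeys);

    /* ... existing try_partial_nestloop_path() calls ... */

    /* also try the materialized form of the cheapest inner path */
    if (matpath != NULL)
        try_partial_nestloop_path(root, joinrel, outerpath, matpath,
                                  pathkeys, jointype, extra);
}

With this, the Material path's cheaper rescan cost participates in the cost comparison inside try_partial_nestloop_path(), instead of the plain seq scan path always being used.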

Attachment
With the patch applied, the result is as below:
                                                                   QUERY PLAN                                                                  
------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=1078.00..26630101.20 rows=1 width=27) (actual time=160571.005..160571.105 rows=0 loops=1)
   InitPlan 1 (returns $0)
     ->  Limit  (cost=78.00..78.00 rows=1 width=8) (actual time=1.065..1.066 rows=1 loops=1)
           ->  Sort  (cost=78.00..83.00 rows=2000 width=8) (actual time=1.064..1.065 rows=1 loops=1)
                 Sort Key: part.p_partkey
                 Sort Method: top-N heapsort  Memory: 25kB
                 ->  Seq Scan on part  (cost=0.00..68.00 rows=2000 width=8) (actual time=0.046..0.830 rows=2000 loops=1)
   ->  Gather  (cost=1000.00..26630023.20 rows=1 width=27) (actual time=160571.003..160571.102 rows=0 loops=1)
         Workers Planned: 1
         Params Evaluated: $0
         Workers Launched: 1
         ->  Nested Loop Left Join  (cost=0.00..26629023.10 rows=1 width=27) (actual time=160549.257..160549.258 rows=0 loops=2)
               Join Filter: ($0 IS NOT NULL)
               Filter: ((sample_0.l_orderkey IS NULL) AND (sample_0.l_shipmode IS NULL))
               Rows Removed by Filter: 1810515312
               ->  Parallel Seq Scan on lineitem ref_0  (cost=0.00..1721.97 rows=35397 width=11) (actual time=0.010..3.393 rows=30088 loops=2)
               ->  Materialize  (cost=0.00..2270.62 rows=60175 width=27) (actual time=0.000..2.839 rows=60175 loops=60175)
                     ->  Seq Scan on lineitem sample_0  (cost=0.00..1969.75 rows=60175 width=27) (actual time=0.008..11.381 rows=60175 loops=2)
 Planning Time: 0.174 ms
 Execution Time: 160571.476 ms
(20 rows)

On Tue, Sep 5, 2023 at 4:52 PM tender wang <tndrwang@gmail.com> wrote:
>    I recently ran a benchmark[1] against master and found a performance problem, shown below:
> ...
>
> I debugged the code and found that consider_parallel_nestloop() doesn't consider the materialized form of the cheapest inner path.

Yeah, this seems an omission in commit 45be99f8.  I reviewed the patch
and here are some comments.

* I think we should not consider materializing the cheapest inner path
  if we're doing JOIN_UNIQUE_INNER, because in this case we have to
  unique-ify the inner path.

* I think we can check if it'd be parallel safe before creating the
  material path, thus avoiding the creation in unsafe cases.

* I don't think the test case you added exercises the code changes.
  Maybe a plan like the one below is better:

explain (costs off)
select * from tenk1, tenk2 where tenk1.two = tenk2.two;
                  QUERY PLAN
----------------------------------------------
 Gather
   Workers Planned: 4
   ->  Nested Loop
         Join Filter: (tenk1.two = tenk2.two)
         ->  Parallel Seq Scan on tenk1
         ->  Materialize
               ->  Seq Scan on tenk2
(7 rows)

Thanks
Richard


Richard Guo <guofenglinux@gmail.com> 于2023年9月5日周二 18:51写道:

> * I think we should not consider materializing the cheapest inner path
>   if we're doing JOIN_UNIQUE_INNER, because in this case we have to
>   unique-ify the inner path.

That's right. The V2 patch has been fixed.

> * I think we can check if it'd be parallel safe before creating the
>   material path, thus avoiding the creation in unsafe cases.

Agreed.

> * I don't think the test case you added exercises the code changes.
>   Maybe a plan like the one below is better: [...]

Agreed.
Attachment
On Tue, Sep 5, 2023 at 8:07 AM Richard Guo <guofenglinux@gmail.com> wrote:
> Yeah, this seems an omission in commit 45be99f8.

It's been a while, but I think I omitted this deliberately because I
didn't really understand the value of it and wanted to keep the
planning cost down.

The example query provided here seems rather artificial. Surely few
people write a join clause that references neither of the tables being
joined. Is there a more realistic case where this makes a big
difference?

--
Robert Haas
EDB: http://www.enterprisedb.com




On Fri, Sep 8, 2023 at 3:15 AM Robert Haas <robertmhaas@gmail.com> wrote:
> The example query provided here seems rather artificial. Surely few
> people write a join clause that references neither of the tables being
> joined. Is there a more realistic case where this makes a big
> difference?

Yes, the given example query is not that convincing.  I tried a query
with the plans below (after some GUC settings) which might be more
realistic in the real world.

unpatched:

explain select * from partsupp join lineitem on l_partkey > ps_partkey;
                                      QUERY PLAN
--------------------------------------------------------------------------------------
 Gather  (cost=0.00..5522666.44 rows=160466667 width=301)
   Workers Planned: 4
   ->  Nested Loop  (cost=0.00..5522666.44 rows=40116667 width=301)
         Join Filter: (lineitem.l_partkey > partsupp.ps_partkey)
         ->  Parallel Seq Scan on lineitem  (cost=0.00..1518.44 rows=15044 width=144)
         ->  Seq Scan on partsupp  (cost=0.00..267.00 rows=8000 width=157)
(6 rows)

patched:

explain select * from partsupp join lineitem on l_partkey > ps_partkey;
                                      QUERY PLAN
--------------------------------------------------------------------------------------
 Gather  (cost=0.00..1807085.44 rows=160466667 width=301)
   Workers Planned: 4
   ->  Nested Loop  (cost=0.00..1807085.44 rows=40116667 width=301)
         Join Filter: (lineitem.l_partkey > partsupp.ps_partkey)
         ->  Parallel Seq Scan on lineitem  (cost=0.00..1518.44 rows=15044 width=144)
         ->  Materialize  (cost=0.00..307.00 rows=8000 width=157)
               ->  Seq Scan on partsupp  (cost=0.00..267.00 rows=8000 width=157)
(7 rows)

The execution times (ms) are (avg of 3 runs):

unpatched:  71769.21
patched:    65510.04

So we can see some (~9%) performance gains in this case.

Thanks
Richard
Hi Tom,
   Do you have any comments or suggestions on this issue? Thanks.

On Fri, 8 Sept 2023 at 09:41, Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Tue, Sep 5, 2023 at 8:07 AM Richard Guo <guofenglinux@gmail.com> wrote:
> > Yeah, this seems an omission in commit 45be99f8.
>
> It's been a while, but I think I omitted this deliberately because I
> didn't really understand the value of it and wanted to keep the
> planning cost down.

I think the value is potentially not having to repeatedly execute some
expensive rescan on the inner side of the nested loop join, once for each
outer-side tuple.

The planning cost is something to consider for sure, but it seems
strange that we deemed it worthwhile to consider material paths for the
non-parallel input paths but drew the line at the parallel/partial
ones. It seems to me that the additional costs and the possible
benefits are the same for both.

David



On Fri, 8 Sept 2023 at 19:14, Richard Guo <guofenglinux@gmail.com> wrote:
> explain select * from partsupp join lineitem on l_partkey > ps_partkey;
>                                       QUERY PLAN
> --------------------------------------------------------------------------------------
>  Gather  (cost=0.00..1807085.44 rows=160466667 width=301)
>    Workers Planned: 4
>    ->  Nested Loop  (cost=0.00..1807085.44 rows=40116667 width=301)
>          Join Filter: (lineitem.l_partkey > partsupp.ps_partkey)
>          ->  Parallel Seq Scan on lineitem  (cost=0.00..1518.44 rows=15044 width=144)
>          ->  Materialize  (cost=0.00..307.00 rows=8000 width=157)
>                ->  Seq Scan on partsupp  (cost=0.00..267.00 rows=8000 width=157)
> (7 rows)
>
> The execution time (ms) are (avg of 3 runs):
>
> unpatched:  71769.21
> patched:    65510.04

This gap would be wider if the partsupp Seq Scan were filtering off
some rows and wider still if you added more rows to lineitem.
However, a clauseless seqscan is not the most compelling use case
below a material node. The inner side of the nested loop could be some
subquery that takes 6 days to complete. Running the 6 day query ~15044
times seems like something that would be good to avoid.
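To illustrate with a hypothetical variation of Richard's query (ps_availqty is a standard TPC-H partsupp column; the constant is made up), a selective filter on the inner side makes the point sharper, since the Materialize node caches only the surviving rows and each of the ~15044 rescans then skips both the heap scan and the qual evaluation:

explain
select * from lineitem join partsupp on l_partkey > ps_partkey
where ps_availqty < 100;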

It seems worth considering Material paths to me.  I think that the
above example could be tuned any way you like to make it look better
or worse.

David



Hi!

Thank you for your work on the subject.


I reviewed your patch and found that your commit message does not fully explain your code; in addition, I found several spelling mistakes.

I think it's better to change it to: "With a parallel seqscan, we should consider materializing the cheapest inner path in
the case of a nested loop, if it doesn't contain a unique node or if it is unsafe to use it in a subquery."


Besides, I couldn't understand why we check again that the material path is parallel safe:

if (matpath != NULL && matpath->parallel_safe)
    try_partial_nestloop_path(root, joinrel, outerpath, matpath,
                              pathkeys, jointype, extra);
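For context, create_material_path() in src/backend/optimizer/util/pathnode.c derives the new path's parallel safety from its input, along these lines (quoted from memory, so worth verifying against current sources):

pathnode->path.parallel_safe = rel->consider_parallel &&
    subpath->parallel_safe;

So if matpath is only built on top of a parallel-safe inner path of a rel with consider_parallel set, the matpath->parallel_safe re-check should always pass; it reads like a belt-and-braces guard (or an Assert candidate) rather than a necessary condition.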

-- 
Regards,
Alena Rybakina

> On 27 Sep 2023, at 16:06, tender wang <tndrwang@gmail.com> wrote:
>
>    Do you have any comments or suggestions on this issue? Thanks.
Hi Tender,

there are some review comments in the thread that you might be interested in.
I'll mark this [0] entry "Waiting on Author" and move it to the next CF.

Thanks!


Best regards, Andrey Borodin.

[0] https://commitfest.postgresql.org/47/4549/




Andrey M. Borodin <x4mmm@yandex-team.ru> 于2024年4月8日周一 17:40写道:
> there are some review comments in the thread that you might be interested in.
> I'll mark this [0] entry "Waiting on Author" and move it to the next CF.

Thank you for the reminder.  I will update the patch later.
I also deeply hope to get more advice about this patch
(even advice that it is not worth continuing to work on it).

Thanks.

--
Tender Wang
OpenPie:  https://en.openpie.com/


I have rebased onto master and fixed a plan diff case.
--
Tender Wang
OpenPie:  https://en.openpie.com/
Attachment
On Tue, 2024-04-23 at 16:59 +0800, Tender Wang wrote:
>
[ cut ]
>
> I have rebased onto master and fixed a plan diff case.

We (me, Paul Jungwirth, and Yuki Fujii) reviewed this patch
at the PgConf.dev Patch Review Workshop.
Here are our findings.

The patch tries to allow using materialization together
with parallel subqueries.
It applies cleanly on 8fea1bd5411b793697a4c9087c403887e050c4ac
(current HEAD).
Tests pass locally on macOS and on Linux in a VM under Windows.
Tests are also green in cfbot (for the last 2 weeks; they were
red previously, probably because a rebase was needed).

Please add more tests.  Especially please add some negative tests;
the current patch checks that it is safe to apply materialization.  It would
be helpful to add tests checking that materialization is not applied
in both checked cases:
1. when the inner join path is not parallel safe
2. when matpath is not parallel safe
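For the first case, a negative test could be shaped like the one below (a sketch only; the OFFSET clause makes the subquery parallel-restricted, so no parallel nestloop should be generated over it; a test of this form appears later in the thread):

explain (costs off)
select * from tenk1 t1, (select * from tenk2 t2 offset 0) t2
where t1.two > t2.two;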

This patch tries to apply materialization only when the join type
is not JOIN_UNIQUE_INNER.  The comment mentions this but does not
explain why, so please either add a comment describing the reason
or try enabling materialization in such a case.

Best regards.

--
Tomasz Rybak, Debian Developer <serpent@debian.org>
GPG: A565 CE64 F866 A258 4DDC F9C7 ECB7 3E37 E887 AA8C





Tomasz Rybak <tomasz.rybak@post.pl> 于2024年5月31日周五 04:21写道:
> We (me, Paul Jungwirth, and Yuki Fujii) reviewed this patch
> at the PgConf.dev Patch Review Workshop.

Thanks for reviewing this patch.

> Please add more tests.  Especially please add some negative tests; [...]
> 1. when the inner join path is not parallel safe
> 2. when matpath is not parallel safe

I added a test case where the inner rel is not parallel safe.  Actually, matpath will not be
created if the inner rel is not parallel safe, so I didn't add a test case for the second scenario.

> This patch tries to apply materialization only when the join type
> is not JOIN_UNIQUE_INNER.  The comment mentions this but does not
> explain why.

Yeah, Richard commented on the v1 patch about JOIN_UNIQUE_INNER in [1]:

> * I think we should not consider materializing the cheapest inner path
>   if we're doing JOIN_UNIQUE_INNER, because in this case we have to
>   unique-ify the inner path.

We don't consider a materialized inner path if the jointype is JOIN_UNIQUE_INNER in
match_unsorted_outer(), so the logic here is the same as in match_unsorted_outer().
I added comments to explain why.

[1] https://www.postgresql.org/message-id/CAMbWs49LbQF_Z0iKPRPnTHfsRECT7M-4rF6ft5vpW1ARSpBkPA%40mail.gmail.com

--
Tender Wang
OpenPie:  https://en.openpie.com/
Attachment

From: "Fujii.Yuki@df.MitsubishiElectric.co.jp"

Hi, Tender.

Thank you for the modification.

> From: Tender Wang <tndrwang@gmail.com>
> Sent: Tuesday, June 4, 2024 7:51 PM
> I added a test case where the inner rel is not parallel safe. Actually,
> matpath will not be created if the inner rel is not parallel safe, so I
> didn't add a test case for the second scenario.

Is there a case in which matpath is not parallel safe but the inner rel is parallel safe?
If so, I think it would be suitable to add a negative test for such a case.

Sincerely yours,
Yuuki Fujii

--
Yuuki Fujii
Information Technology R&D Center Mitsubishi Electric Corporation





> Is there a case in which matpath is not parallel safe but the inner rel is parallel safe?
> If so, I think it would be suitable to add a negative test for such a case.

I looked through the create_xxx_path() functions, and I found that path->parallel_safe is
mostly assigned from RelOptInfo->consider_parallel.  Some paths also take
subpath->parallel_safe into account (e.g. the Material path).  In most cases, Material is
parallel safe if its rel is parallel safe.  So far I haven't come up with a query plan in
which the Material path is not parallel-safe but its rel is parallel-safe.

--
Tender Wang
OpenPie:  https://en.openpie.com/

From: "Fujii.Yuki@df.MitsubishiElectric.co.jp"

Hi, Tender.

> I looked through the create_xxx_path() functions, and I found that
> path->parallel_safe is mostly assigned from RelOptInfo->consider_parallel. [...]

Thank you for looking into the source code.  I understand the situation now.

Sincerely yours,
Yuki Fujii

--
Yuki Fujii
Information Technology R&D Center Mitsubishi Electric Corporation

Hi Robert,

    Since this patch was reviewed at the PgConf.dev Patch Review Workshop, I have updated
it according to the review advice.  Since then no one else has commented on it,
and its status on the commitfest has stayed "Needs review" for a long time.
I want to know if it is ready to move to the next status, "Ready for Committer".

Thanks.

--
Tender Wang
On Tue, Jun 4, 2024 at 6:51 PM Tender Wang <tndrwang@gmail.com> wrote:
> Yeah, Richard commented on the v1 patch about JOIN_UNIQUE_INNER in [1]:
>
> * I think we should not consider materializing the cheapest inner path
> if we're doing JOIN_UNIQUE_INNER, because in this case we have to
> unique-ify the inner path.
>
> We don't consider a materialized inner path if the jointype is JOIN_UNIQUE_INNER in match_unsorted_outer().
> So the logic here is the same as in match_unsorted_outer().  I added comments to explain why.

I looked through the v4 patch and found an issue.  For the plan diff:

+         ->  Nested Loop
+               ->  Parallel Seq Scan on prt1_p1 t1_1
+               ->  Materialize
+                     ->  Sample Scan on prt1_p1 t2_1
+                           Sampling: system (t1_1.a) REPEATABLE (t1_1.b)
+                           Filter: (t1_1.a = a)

This does not seem correct to me.  The inner path is parameterized by
the outer rel, in which case it does not make sense to add a Materialize
node on top of it.

I updated the patch to include a check in consider_parallel_nestloop
ensuring that inner_cheapest_total is not parameterized by outerrel
before materializing it.  I also tweaked the comments, test cases and
commit message.
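Presumably the added guard is shaped like this (a sketch using the PATH_PARAM_BY_REL macro from joinpath.c, not the exact patch text):

/* Don't materialize an inner path that is parameterized by the outer rel. */
if (inner_cheapest_total->parallel_safe &&
    !PATH_PARAM_BY_REL(inner_cheapest_total, outerrel))
    matpath = (Path *) create_material_path(innerrel, inner_cheapest_total);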

Thanks
Richard

Attachment


Richard Guo <guofenglinux@gmail.com> 于2024年6月18日周二 17:24写道:
> I looked through the v4 patch and found an issue.  For the plan diff:
> [...]
> This does not seem correct to me.  The inner path is parameterized by
> the outer rel, in which case it does not make sense to add a Materialize
> node on top of it.

Yeah, you're right.

> I updated the patch to include a check in consider_parallel_nestloop
> ensuring that inner_cheapest_total is not parameterized by outerrel
> before materializing it.  I also tweaked the comments, test cases and
> commit message.

Thanks for the work.  Now it looks better.
I have changed the status from "Needs review" to "Ready for Committer" on the commitfest.

--
Tender Wang
On Wed, Jun 19, 2024 at 10:55 AM Tender Wang <tndrwang@gmail.com> wrote:
> Richard Guo <guofenglinux@gmail.com> 于2024年6月18日周二 17:24写道:
>> I updated the patch to include a check in consider_parallel_nestloop
>> ensuring that inner_cheapest_total is not parameterized by outerrel
>> before materializing it.  I also tweaked the comments, test cases and
>> commit message.
>
> Thanks for the work. Now it looks better.
> I have changed the status from "Needs review" to "Ready for Committer" on the commitfest.

Here is a new rebase.

I'm planning to push it next week, barring any objections.

Thanks
Richard

Attachment
On Sat, Jul 6, 2024 at 5:32 PM Richard Guo <guofenglinux@gmail.com> wrote:
> Here is a new rebase.
>
> I'm planning to push it next week, barring any objections.

Pushed.

Thanks
Richard





Richard Guo <guofenglinux@gmail.com> 于2024年7月12日周五 10:30写道:
> Pushed.

Thanks for pushing.

--
Tender Wang
Hello Richard,

12.07.2024 05:29, Richard Guo wrote:
> On Sat, Jul 6, 2024 at 5:32 PM Richard Guo <guofenglinux@gmail.com> wrote:
>> Here is a new rebase.
>>
>> I'm planning to push it next week, barring any objections.
> Pushed.

Please look at a recent buildfarm failure [1], which shows some
instability of that test addition:
  -- the joinrel is not parallel-safe due to the OFFSET clause in the subquery
  explain (costs off)
      select * from tenk1 t1, (select * from tenk2 t2 offset 0) t2 where t1.two > t2.two;
-                QUERY PLAN
--------------------------------------------
+                   QUERY PLAN
+-------------------------------------------------
   Nested Loop
     Join Filter: (t1.two > t2.two)
-   ->  Gather
-         Workers Planned: 4
-         ->  Parallel Seq Scan on tenk1 t1
+   ->  Seq Scan on tenk2 t2
     ->  Materialize
-         ->  Seq Scan on tenk2 t2
+         ->  Gather
+               Workers Planned: 4
+               ->  Parallel Seq Scan on tenk1 t1
  (7 rows)

I've managed to reproduce this plan change when running
multiple 027_stream_regress.pl instances simultaneously, with
parallel_schedule reduced to:
test: test_setup
test: create_misc
test: create_index
test: sanity_check
test: select_parallel

I've added the following to the test and got two verbose plans for
comparison (see the attachment).
  -- the joinrel is not parallel-safe due to the OFFSET clause in the subquery
  explain (costs off)
      select * from tenk1 t1, (select * from tenk2 t2 offset 0) t2 where t1.two > t2.two;
+\o plan.txt
+explain (verbose)
+    select * from tenk1 t1, (select * from tenk2 t2 offset 0) t2 where t1.two > t2.two;
+\o
  alter table tenk2 reset (parallel_workers);

[1] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=tamandua&dt=2024-07-17%2017%3A12%3A53

Best regards,
Alexander
Attachment
On Thu, Jul 18, 2024 at 4:00 PM Alexander Lakhin <exclusion@gmail.com> wrote:
> Please look at a recent buildfarm failure [1], which shows some
> instability of that test addition: [...]

Thank you for the report and investigation.  Will have a look.

Thanks
Richard



On Thu, Jul 18, 2024 at 4:11 PM Richard Guo <guofenglinux@gmail.com> wrote:
> On Thu, Jul 18, 2024 at 4:00 PM Alexander Lakhin <exclusion@gmail.com> wrote:
> > Please look at a recent buildfarm failure [1], which shows some
> > instability of that test addition: [...]
>
> Thank you for the report and investigation.  Will have a look.

The problematic plan is a non-parallel nestloop join.  It's just chance
which join order the planner will pick, and slight variations in
underlying statistics could result in a different displayed plan.
From the two verbose plans, we can see slight variations in the
statistics for the parallel seqscan of tenk1.

->  Parallel Seq Scan on public.tenk1 t1  (cost=0.00..370.00 rows=2500 width=244)

VS.

->  Parallel Seq Scan on public.tenk1 t1  (cost=0.00..369.99 rows=2499 width=244)

I have no idea why the underlying statistics changed, but it seems
that this slight change is sufficient to result in a different plan.

According to the discussion in [1], I think what we wanted to test
with this query is that a parallel nestloop join is not generated if the
inner path is not parallel-safe.  Therefore, I modified this test case
to use a lateral join, rendering the inner path not parallel-safe
while also enforcing the join order.  Please see attached.
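A lateral rewrite could be shaped like this (a sketch, not necessarily the committed test): the lateral reference to t1 parameterizes the subquery, which pins it to the inner side and fixes the join order, while the OFFSET clause keeps it parallel-restricted.

explain (costs off)
select * from tenk1 t1
  left join lateral
    (select t1.unique1 as x, * from tenk2 t2 offset 0) ss on true;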

[1] https://postgr.es/m/5641923793cef7706395a34e62538b75d05e498b.camel@post.pl

Thanks
Richard

Attachment
Hello Richard,

18.07.2024 17:30, Richard Guo wrote:
> The problematic plan is a non-parallel nestloop join.  It's just chance
> which join order the planner will pick, and slight variations in
> underlying statistics could result in a different displayed plan.
> [...]
> I have no idea why the underlying statistics changed, but it seems
> that this slight change is sufficient to result in a different plan.

I think it could be caused by the same reason as [1], and I can
easily (without multiple instances/loops, just with `make check`) reproduce
the failure with cranky-ConditionalLockBufferForCleanup.patch (but
targeted at "VACUUM ANALYZE tenk1;").

> According to the discussion in [1], I think what we wanted to test
> with this query is that a parallel nestloop join is not generated if the
> inner path is not parallel-safe.  Therefore, I modified this test case
> to use a lateral join [...]

The modified test survives my testing procedure. Thank you for the patch!

[1] https://www.postgresql.org/message-id/flat/66eb9a6e-fc67-a230-c5b1-2a741e8b88c6%40gmail.com

Best regards,
Alexander



On Fri, Jul 19, 2024 at 12:00 PM Alexander Lakhin <exclusion@gmail.com> wrote:
> 18.07.2024 17:30, Richard Guo wrote:
> > I have no idea why the underlying statistics changed, but it seems
> > that this slight change is sufficent to result in a different plan.
>
> I think it could be caused by the same reason as [1] and I really can
> easily (without multiple instances/loops. just with `make check`) reproduce
> the failure with cranky-ConditionalLockBufferForCleanup.patch (but
> targeted for "VACUUM ANALYZE tenk1;").

Yeah.  Anyway I think we need to make the test more tolerant of slight
variations in the statistics.

> > According to the discussion in [1], I think what we wanted to test
> > with this query is that parallel nestloop join is not generated if the
> > inner path is not parallel-safe.  Therefore, I modified this test case
> > to use a lateral join, rendering the inner path not parallel-safe
> > while also enforcing the join order.  Please see attached.
>
> The modified test survives my testing procedure. Thank you for the patch!

Thanks for testing this patch.  I've pushed it.

Thanks
Richard