Re: Parallel plans and "union all" subquery - Mailing list pgsql-hackers

From Luc Vlaming
Subject Re: Parallel plans and "union all" subquery
Msg-id 3d8fe7f0-692a-89c5-2ec2-64afd6d91c3e@swarm64.com
In response to Re: Parallel plans and "union all" subquery  (Greg Nancarrow <gregn4422@gmail.com>)
Responses Re: Parallel plans and "union all" subquery  (Greg Nancarrow <gregn4422@gmail.com>)
List pgsql-hackers
On 24-11-2020 01:44, Greg Nancarrow wrote:
> On Tue, Nov 24, 2020 at 2:34 AM Luc Vlaming <luc@swarm64.com> wrote:
>>
>> Hi,
>>
>> For this problem there is a patch I created, which is registered under
>> https://commitfest.postgresql.org/30/2787/ that should fix this without
>> any workarounds. Maybe someone can take a look at it?
>>
> 
> I tried your patch with the latest PG source code (24/11), but
> unfortunately a non-parallel plan was still produced in this case.
> 
> test=# explain
> select count(*)
> from (select
> n1
> from drop_me
> union all
> values(1)) ua;
>                                     QUERY PLAN
> --------------------------------------------------------------------------------
>   Aggregate  (cost=1889383.54..1889383.55 rows=1 width=8)
>     ->  Append  (cost=0.00..1362834.03 rows=42123961 width=32)
>           ->  Seq Scan on drop_me  (cost=0.00..730974.60 rows=42123960 width=32)
>           ->  Subquery Scan on "*SELECT* 2"  (cost=0.00..0.02 rows=1 width=32)
>                 ->  Result  (cost=0.00..0.01 rows=1 width=4)
> (5 rows)
> 
> 
> That's not to say your patch doesn't have merit - but maybe just not a
> fix for this particular case.
> 
> As before, if the SQL is tweaked to align the types for the UNION, you
> get a parallel plan:
> 
> test=# explain
> select count(*)
> from (select
> n1
> from drop_me
> union all
> values(1::numeric)) ua;
>                                               QUERY PLAN
> ----------------------------------------------------------------------------------------------------
>   Finalize Aggregate  (cost=821152.71..821152.72 rows=1 width=8)
>     ->  Gather  (cost=821152.50..821152.71 rows=2 width=8)
>           Workers Planned: 2
>           ->  Partial Aggregate  (cost=820152.50..820152.51 rows=1 width=8)
>                 ->  Parallel Append  (cost=0.00..747235.71 rows=29166714 width=0)
>                       ->  Result  (cost=0.00..0.01 rows=1 width=0)
>                       ->  Parallel Seq Scan on drop_me  (cost=0.00..601402.13 rows=29166713 width=0)
> (7 rows)
> 
> 
> Regards,
> Greg Nancarrow
> Fujitsu Australia
> 

Hi,

You're completely right, sorry for my error. I was too quick to assume
my patch would work for this specific case too; I should have tested
that before replying. It looked very similar, but it turns out not to
work because the upper rel is not considered parallel.
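
For anyone who wants to reproduce the two plans quoted above, here is a
minimal sketch. The thread does not show the definition of drop_me, so
the table below is an assumption: n1 is taken to be a numeric column
(which is consistent with the width=32 estimate in the plans), and the
row count is arbitrary. pg_typeof() is only there to show that the bare
literal is an integer, while the cast literal matches the column type.
Whether a parallel plan is actually chosen also depends on the table
size and the parallel cost settings.

```sql
-- Assumed setup: the real drop_me definition is not shown in the thread.
CREATE TABLE drop_me AS
SELECT g::numeric AS n1
FROM generate_series(1, 1000000) AS g;
ANALYZE drop_me;

-- The bare literal is an integer; the cast literal is numeric like n1.
SELECT pg_typeof(1) AS bare_literal, pg_typeof(1::numeric) AS cast_literal;

-- Mismatched branch types: the case reported as producing a serial plan.
EXPLAIN
SELECT count(*)
FROM (SELECT n1 FROM drop_me
      UNION ALL
      VALUES (1)) ua;

-- Aligned branch types: the case reported as producing a parallel plan.
EXPLAIN
SELECT count(*)
FROM (SELECT n1 FROM drop_me
      UNION ALL
      VALUES (1::numeric)) ua;
```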

I would like to extend my patch to support this, or create a second
patch. This would, however, be significantly more involved, because it
would require that we (always?) consider two paths whenever we process a
subquery: the best parallel plan and the best serial plan. Before I
embark on such a journey, I would like some input on whether this would
be a very bad idea. Thoughts?

Regards,
Luc
Swarm64


