RE: Parallel plans and "union all" subquery - Mailing list pgsql-hackers

From Phil Florent
Subject RE: Parallel plans and "union all" subquery
Date
Msg-id PA4P189MB12802CAB32A5B56161B649A0BAFC0@PA4P189MB1280.EURP189.PROD.OUTLOOK.COM
Whole thread Raw
In response to Re: Parallel plans and "union all" subquery  (Greg Nancarrow <gregn4422@gmail.com>)
Responses Re: Parallel plans and "union all" subquery  (Luc Vlaming <luc@swarm64.com>)
List pgsql-hackers
Hi Greg,

The implicit conversion was the cause of the non parallel plan, thanks for the explanation and the workarounds. It can cause a huge difference in terms of performance, I will give the information to our developers.

Regards,

Phil




De : Greg Nancarrow <gregn4422@gmail.com>
Envoyé : lundi 23 novembre 2020 06:04
À : Phil Florent <philflorent@hotmail.com>
Cc : pgsql-hackers@lists.postgresql.org <pgsql-hackers@lists.postgresql.org>
Objet : Re: Parallel plans and "union all" subquery
 
On Sun, Nov 22, 2020 at 11:51 PM Phil Florent <philflorent@hotmail.com> wrote:
>
>
> Hi,
>
>
> I have a question about parallel plans. I also posted it on the general list but perhaps it's a question for hackers. Here is my test case :
>
>
> explain
> select count(*)
> from (select
> n1
> from drop_me
> union all
> values(1)) ua;
>
>
> QUERY PLAN
> --------------------------------------------------------------------------------
> Aggregate (cost=2934739.24..2934739.25 rows=1 width=8)
> -> Append (cost=0.00..2059737.83 rows=70000113 width=32)
> -> Seq Scan on drop_me (cost=0.00..1009736.12 rows=70000112 width=6)
> -> Subquery Scan on "*SELECT* 2" (cost=0.00..0.02 rows=1 width=32)
> -> Result (cost=0.00..0.01 rows=1 width=4)
> JIT:
> Functions: 4
> Options: Inlining true, Optimization true, Expressions true, Deforming true
>
>
> No parallel plan, 2s6
>
>
> I read the documentation but I don't get the reason of the "noparallel" seq scan of drop_me in the last case ?
>

Without debugging this, it looks to me that the UNION type resolution
isn't working as well as it possibly could in this case, for the
generation of a parallel plan. I found that with a minor tweak to your
SQL, either for the table creation or query, it will produce a
parallel plan.

Noting that currently you're creating the drop_me table with a
"numeric" column, you can either:

(1) Change the table creation

FROM:
    create unlogged table drop_me as select generate_series(1,7e7) n1;
TO:
    create unlogged table drop_me as select generate_series(1,7e7)::int n1;


OR


(2) Change the query

FROM:
    explain
    select count(*)
    from (select
    n1
    from drop_me
    union all
    values(1)) ua;

TO:

    explain
    select count(*)
    from (select
    n1
    from drop_me
    union all
    values(1::numeric)) ua;


                                            QUERY PLAN
----------------------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=821152.71..821152.72 rows=1 width=8)
   ->  Gather  (cost=821152.50..821152.71 rows=2 width=8)
         Workers Planned: 2
         ->  Partial Aggregate  (cost=820152.50..820152.51 rows=1 width=8)
               ->  Parallel Append  (cost=0.00..747235.71 rows=29166714 width=0)
                     ->  Result  (cost=0.00..0.01 rows=1 width=0)
                     ->  Parallel Seq Scan on drop_me
(cost=0.00..601402.13 rows=29166713 width=0)
(7 rows)


Regards,
Greg Nancarrow
Fujitsu Australia

pgsql-hackers by date:

Previous
From: Dilip Kumar
Date:
Subject: Re: [HACKERS] Custom compression methods
Next
From: Anastasia Lubennikova
Date:
Subject: Re: Online verification of checksums