Re: parallel.sgml for Gather with InitPlans - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: parallel.sgml for Gather with InitPlans
Date
Msg-id CAA4eK1JFN-F8PjYCmqkXj3j6=BqSRLfSA94Nfy3kRr4hA86-oQ@mail.gmail.com
In response to parallel.sgml for Gather with InitPlans  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: parallel.sgml for Gather with InitPlans  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Mon, May 7, 2018 at 11:07 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> In the wake of commit e89a71fb449af2ef74f47be1175f99956cf21524,
> parallel.sgml is no longer correct about the effect of InitPlans:
>
>   <para>
>     The following operations are always parallel restricted.
>   </para>
>
> ...
>
>       <para>
>         Access to an <literal>InitPlan</literal> or correlated
> <literal>SubPlan</literal>.
>       </para>
>
> I thought about this a bit and came up with the attached patch.
>

-        Access to an <literal>InitPlan</literal> or correlated <literal>SubPlan</literal>.
+        Plan nodes to which an <literal>InitPlan</literal> is attached.
+      </para>
+    </listitem>

Is this correct?  See the example below:

Serial-Plan
-----------------
postgres=# explain select * from t1 where t1.k=(select max(k) from t3);
                             QUERY PLAN
--------------------------------------------------------------------
 Seq Scan on t1  (cost=35.51..71.01 rows=10 width=12)
   Filter: (k = $0)
   InitPlan 1 (returns $0)
     ->  Aggregate  (cost=35.50..35.51 rows=1 width=4)
           ->  Seq Scan on t3  (cost=0.00..30.40 rows=2040 width=4)
(5 rows)

Parallel-Plan
--------------------
postgres=# explain select * from t1 where t1.k=(select max(k) from t3);
                                      QUERY PLAN
---------------------------------------------------------------------------------------
 Gather  (cost=9.71..19.38 rows=2 width=12)
   Workers Planned: 2
   Params Evaluated: $1
   InitPlan 1 (returns $1)
     ->  Finalize Aggregate  (cost=9.70..9.71 rows=1 width=4)
           ->  Gather  (cost=9.69..9.70 rows=2 width=4)
                 Workers Planned: 2
                 ->  Partial Aggregate  (cost=9.69..9.70 rows=1 width=4)
                       ->  Parallel Seq Scan on t3  (cost=0.00..8.75 rows=375 width=4)
   ->  Parallel Seq Scan on t1  (cost=0.00..9.67 rows=1 width=12)
         Filter: (k = $1)
(11 rows)

In the above example, the InitPlan is attached to a plan node (the Seq
Scan on t1) that is not parallel restricted.
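
For anyone who wants to try plans of this shape, a setup along the following lines should be a reasonable starting point.  This is only a sketch: the table definitions, row counts, and settings are assumptions and are not the ones used for the plans above, so the costs and the plan chosen may differ.

```sql
-- Hypothetical schema: the column layout is assumed, not taken from this message.
CREATE TABLE t1 (i int, j int, k int);
CREATE TABLE t3 (i int, j int, k int);

-- Assumed data volume; no index on k, so max(k) has to scan t3.
INSERT INTO t1 SELECT g, g, g FROM generate_series(1, 1000) g;
INSERT INTO t3 SELECT g, g, g FROM generate_series(1, 1000) g;
ANALYZE t1;
ANALYZE t3;

-- Make parallel plans cheap enough for the planner to consider on small tables.
SET parallel_setup_cost = 0;
SET parallel_tuple_cost = 0;
SET min_parallel_table_scan_size = 0;
SET max_parallel_workers_per_gather = 2;

EXPLAIN SELECT * FROM t1 WHERE t1.k = (SELECT max(k) FROM t3);
```

Setting max_parallel_workers_per_gather to 0 in the same session should give the serial plan for comparison.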

>  Other ideas?
>

How about changing the statement as:
-        Access to an <literal>InitPlan</literal> or correlated <literal>SubPlan</literal>.
+        Access to a correlated <literal>SubPlan</literal>.


I think we can cover InitPlans and SubPlans that can be parallelized in
a separate section, "Parallel Subplans" or some other heading.  As of
now we have enabled parallel subplans and initplans in limited but
useful cases (as per the TPC-H benchmark), and it might be good to cover
them in a separate section.  I can come up with an initial patch (or I
can review it if you write the patch) if you and/or others think that
makes sense.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

