Thread: parallel.sgml for Gather with InitPlans

parallel.sgml for Gather with InitPlans

From: Robert Haas

In the wake of commit e89a71fb449af2ef74f47be1175f99956cf21524,
parallel.sgml is no longer correct about the effect of InitPlans:

  <para>
    The following operations are always parallel restricted.
  </para>

...

      <para>
        Access to an <literal>InitPlan</literal> or correlated
        <literal>SubPlan</literal>.
      </para>

I thought about this a bit and came up with the attached patch.  Other ideas?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: parallel.sgml for Gather with InitPlans

From: Amit Kapila

On Mon, May 7, 2018 at 11:07 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> In the wake of commit e89a71fb449af2ef74f47be1175f99956cf21524,
> parallel.sgml is no longer correct about the effect of InitPlans:
>
>   <para>
>     The following operations are always parallel restricted.
>   </para>
>
> ...
>
>       <para>
>         Access to an <literal>InitPlan</literal> or correlated
>         <literal>SubPlan</literal>.
>       </para>
>
> I thought about this a bit and came up with the attached patch.
>

-        Access to an <literal>InitPlan</literal> or correlated
-        <literal>SubPlan</literal>.
+        Plan nodes to which an <literal>InitPlan</literal> is attached.
+      </para>
+    </listitem>

Is this correct?  See below example:

Serial-Plan
-----------------
postgres=# explain select * from t1 where t1.k=(select max(k) from t3);
                             QUERY PLAN
--------------------------------------------------------------------
 Seq Scan on t1  (cost=35.51..71.01 rows=10 width=12)
   Filter: (k = $0)
   InitPlan 1 (returns $0)
     ->  Aggregate  (cost=35.50..35.51 rows=1 width=4)
           ->  Seq Scan on t3  (cost=0.00..30.40 rows=2040 width=4)
(5 rows)

Parallel-Plan
--------------------
postgres=# explain select * from t1 where t1.k=(select max(k) from t3);
                                      QUERY PLAN
---------------------------------------------------------------------------------------
 Gather  (cost=9.71..19.38 rows=2 width=12)
   Workers Planned: 2
   Params Evaluated: $1
   InitPlan 1 (returns $1)
     ->  Finalize Aggregate  (cost=9.70..9.71 rows=1 width=4)
           ->  Gather  (cost=9.69..9.70 rows=2 width=4)
                 Workers Planned: 2
                 ->  Partial Aggregate  (cost=9.69..9.70 rows=1 width=4)
                       ->  Parallel Seq Scan on t3  (cost=0.00..8.75 rows=375 width=4)
   ->  Parallel Seq Scan on t1  (cost=0.00..9.67 rows=1 width=12)
         Filter: (k = $1)
(11 rows)

In the above example, the InitPlan is attached to a plan node (Seq Scan
on t1) which is not parallel restricted.
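
One way to check which node an InitPlan hangs off is to read the
indentation of the EXPLAIN text: the "InitPlan N" label is indented under
the node that owns it.  The following is only a rough sketch, not anything
from the PostgreSQL tree.  The function name initplan_owners is made up,
and it assumes text-format EXPLAIN output with costs shown, so that every
node line contains "(cost=".

```python
import re

# A node line is optional indentation, an optional "->  " arrow, the node
# name, and then "(cost=...".  Assumption: costs are not switched off.
NODE_RE = re.compile(r"^(\s*)(->\s+)?(.*?)\s+\(cost=")

SERIAL_PLAN = """
 Seq Scan on t1  (cost=35.51..71.01 rows=10 width=12)
   Filter: (k = $0)
   InitPlan 1 (returns $0)
     ->  Aggregate  (cost=35.50..35.51 rows=1 width=4)
           ->  Seq Scan on t3  (cost=0.00..30.40 rows=2040 width=4)
"""

PARALLEL_PLAN = """
 Gather  (cost=9.71..19.38 rows=2 width=12)
   Workers Planned: 2
   Params Evaluated: $1
   InitPlan 1 (returns $1)
     ->  Finalize Aggregate  (cost=9.70..9.71 rows=1 width=4)
           ->  Gather  (cost=9.69..9.70 rows=2 width=4)
                 Workers Planned: 2
                 ->  Partial Aggregate  (cost=9.69..9.70 rows=1 width=4)
                       ->  Parallel Seq Scan on t3  (cost=0.00..8.75 rows=375 width=4)
   ->  Parallel Seq Scan on t1  (cost=0.00..9.67 rows=1 width=12)
         Filter: (k = $1)
"""


def initplan_owners(explain_text):
    """Map each 'InitPlan N' label to the name of the node it is listed under."""
    nodes = []   # (column where the node name starts, node name), in order seen
    owners = {}
    for line in explain_text.splitlines():
        stripped = line.lstrip()
        indent = len(line) - len(stripped)
        match = NODE_RE.match(line)
        if match:
            name_col = len(match.group(1)) + len(match.group(2) or "")
            nodes.append((name_col, match.group(3)))
        elif stripped.startswith("InitPlan"):
            label = stripped.split(" (")[0]
            # The owner is the nearest earlier node whose name starts to the
            # left of this label; deeper nodes in between start further right.
            owners[label] = next(
                (name for col, name in reversed(nodes) if col < indent), None
            )
    return owners
```

Run on the two plans above, initplan_owners(SERIAL_PLAN) reports
"Seq Scan on t1" for InitPlan 1, while initplan_owners(PARALLEL_PLAN)
reports "Gather".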

>  Other ideas?
>

How about changing the statement as:
-        Access to an <literal>InitPlan</literal> or correlated
-        <literal>SubPlan</literal>.
+        Access to a correlated <literal>SubPlan</literal>.


I think we can cover InitPlans and SubPlans that can be parallelized in
a separate section, "Parallel Subplans" or some other heading.  I think
as of now we have enabled parallel subplans and initplans in
limited but useful cases (as per the TPC-H benchmark), and it might be
good to cover them in a separate section.  I can come up with an
initial patch (or I can review it if you write the patch) if you
and/or others think that makes sense.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: parallel.sgml for Gather with InitPlans

From: Robert Haas

On Mon, May 7, 2018 at 11:34 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> Is this correct?  See below example:

That's not a counterexample to what I wrote.  When parallelism is
used, the InitPlan has to be attached to a parallel-restricted node,
and it is: Gather.  It's true that in the serial plan it's attached to
the Seq Scan, but that doesn't prove anything.  Saying that something
is parallel-restricted is a statement about where parallelism can be
used; it says nothing about what happens without parallelism.
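
To make that reading concrete, here is a toy model of the rule as stated:
a node with an InitPlan attached is parallel restricted, so it may not
appear below a Gather, although it may be the Gather itself.  This is a
sketch only and is not how the planner is implemented.  PlanNode and
misplaced_initplan_nodes are invented names, and the second plan is a
hypothetical shape used purely for contrast.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PlanNode:
    """Toy plan node; has_initplan marks a node with an InitPlan attached."""
    name: str
    has_initplan: bool = False
    children: List["PlanNode"] = field(default_factory=list)


def misplaced_initplan_nodes(node: PlanNode, below_gather: bool = False) -> List[str]:
    """Return names of nodes that carry an InitPlan but sit below a Gather."""
    found = [node.name] if (below_gather and node.has_initplan) else []
    # Everything underneath a Gather (or Gather Merge) may run in a worker;
    # the Gather node itself runs in the leader.
    child_context = below_gather or node.name.startswith("Gather")
    for child in node.children:
        found.extend(misplaced_initplan_nodes(child, child_context))
    return found


# Shape of the parallel plan quoted in this thread: the InitPlan is on Gather.
plan_from_thread = PlanNode(
    "Gather",
    has_initplan=True,
    children=[PlanNode("Parallel Seq Scan on t1")],
)

# Hypothetical shape the rule excludes: the InitPlan is on a node below Gather.
forbidden_shape = PlanNode(
    "Gather",
    children=[PlanNode("Parallel Seq Scan on t1", has_initplan=True)],
)
```

The first plan yields no misplaced nodes; the second flags the scan below
the Gather.  A serial plan contains no Gather at all, so the check never
fires there, which is why the serial plan says nothing about the rule.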

> I think we can cover InitPlans and SubPlans that can be parallelized in
> a separate section, "Parallel Subplans" or some other heading.  I think
> as of now we have enabled parallel subplans and initplans in
> limited but useful cases (as per the TPC-H benchmark), and it might be
> good to cover them in a separate section.  I can come up with an
> initial patch (or I can review it if you write the patch) if you
> and/or others think that makes sense.

We could go that way, but what I wrote is short and -- I think -- accurate.

-- 
Robert Haas


Re: parallel.sgml for Gather with InitPlans

From: Amit Kapila

On Tue, May 8, 2018 at 5:27 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Mon, May 7, 2018 at 11:34 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> I think we can cover InitPlans and SubPlans that can be parallelized in
>> a separate section, "Parallel Subplans" or some other heading.  I think
>> as of now we have enabled parallel subplans and initplans in
>> limited but useful cases (as per the TPC-H benchmark), and it might be
>> good to cover them in a separate section.  I can come up with an
>> initial patch (or I can review it if you write the patch) if you
>> and/or others think that makes sense.
>
> We could go that way, but what I wrote is short and -- I think -- accurate.
>

Okay, thinking about it again after your explanation, it appears
correct to me, but it was not apparent on first read.  I think
other alternatives could be (a) "Evaluation of an InitPlan" or (b)
"Execution of an InitPlan".  I think it makes sense to add what you have
written or one of the alternatives I suggested, whichever you deem most
appropriate.  A detailed explanation can always be written as a
separate patch.

-- 
With Regards,
Amit Kapila.


Re: parallel.sgml for Gather with InitPlans

From: Robert Haas

On Tue, May 8, 2018 at 11:21 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Tue, May 8, 2018 at 5:27 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Mon, May 7, 2018 at 11:34 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>>> I think we can cover InitPlans and SubPlans that can be parallelized in
>>> a separate section, "Parallel Subplans" or some other heading.  I think
>>> as of now we have enabled parallel subplans and initplans in
>>> limited but useful cases (as per the TPC-H benchmark), and it might be
>>> good to cover them in a separate section.  I can come up with an
>>> initial patch (or I can review it if you write the patch) if you
>>> and/or others think that makes sense.
>>
>> We could go that way, but what I wrote is short and -- I think -- accurate.
>>
>
> Okay, thinking about it again after your explanation, it appears
> correct to me, but it was not apparent on first read.  I think
> other alternatives could be (a) "Evaluation of an InitPlan" or (b)
> "Execution of an InitPlan".  I think it makes sense to add what you have
> written or one of the alternatives I suggested, whichever you deem most
> appropriate.  A detailed explanation can always be written as a
> separate patch.

I've committed what I suggested before.  If you want to propose
another patch, feel free, but I'm not sure how much energy this is
worth.  The more detailed we make the documentation the more we have
to update the next time something changes.

-- 
Robert Haas