Re: Parallel Inserts in CREATE TABLE AS - Mailing list pgsql-hackers

From Bharath Rupireddy
Subject Re: Parallel Inserts in CREATE TABLE AS
Date
Msg-id CALj2ACVijBoAed8LBGqBN-Aiox8BtebPx5CfkRS-YaUTHrvbtg@mail.gmail.com
In response to RE: Parallel Inserts in CREATE TABLE AS  ("Hou, Zhijie" <houzj.fnst@cn.fujitsu.com>)
Responses RE: Parallel Inserts in CREATE TABLE AS  ("Hou, Zhijie" <houzj.fnst@cn.fujitsu.com>)
List pgsql-hackers
On Thu, Nov 26, 2020 at 12:15 PM Hou, Zhijie <houzj.fnst@cn.fujitsu.com> wrote:
>
> I took a deep look at the projection logic.
> In most cases, you are right that Gather node does not need projection.
>
> In some rare cases, such as with a SubPlan (or an InitPlan, I guess),
> the projection will happen in the Gather node.
>
> The example:
>
> Create table test(i int);
> Create table test2(a int, b int);
> insert into test values(generate_series(1,10000000,1));
> insert into test2 values(generate_series(1,1000,1), generate_series(1,1000,1));
>
> postgres=# explain(verbose, costs off) select test.i,(select i from (select * from test2) as tt limit 1) from test where test.i < 2000;
>
>                QUERY PLAN
> ----------------------------------------
>  Gather
>    Output: test.i, (SubPlan 1)
>    Workers Planned: 2
>    ->  Parallel Seq Scan on public.test
>          Output: test.i
>          Filter: (test.i < 2000)
>    SubPlan 1
>      ->  Limit
>            Output: (test.i)
>            ->  Seq Scan on public.test2
>                  Output: test.i
>
> In this case, projection is necessary,
> because the subplan will be executed in projection.
>
> If skipped, the table created will lose some data.
>

Thanks a lot for the use case. Yes, with the current patch the table
will lose the data related to the subplan. On analyzing further, I think
we cannot allow parallel inserts in cases where the Gather node has
projections to do, because the workers cannot perform that projection.
So, a non-NULL ps_ProjInfo in the Gather node is an indication for us to
disable parallel inserts, so that only the leader does the insertions
after the Gather node performs the required projections.
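
As a rough illustration of that check, here is a minimal sketch. It
assumes simplified stand-in structs that model only the ps_ProjInfo
field, not the real executor definitions, and the helper name
gather_allows_parallel_insert is hypothetical and not taken from the
patch:

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Simplified stand-ins for the executor structs (assumption: only the
 * field discussed above is modeled; the real structs carry much more).
 */
typedef struct ProjectionInfo
{
	int			placeholder;	/* unused; keeps the struct non-empty */
} ProjectionInfo;

typedef struct PlanState
{
	ProjectionInfo *ps_ProjInfo;	/* NULL when the node does no projection */
} PlanState;

typedef struct GatherState
{
	PlanState	ps;
} GatherState;

/*
 * Hypothetical helper: parallel inserts are allowed only when the Gather
 * node has no projection of its own. If ps_ProjInfo is set, the tuples
 * coming from the workers are not final yet, so the leader must project
 * and insert them itself.
 */
static bool
gather_allows_parallel_insert(const GatherState *gstate)
{
	return gstate->ps.ps_ProjInfo == NULL;
}
```

With the query above, the Gather node computes (SubPlan 1) in its own
target list, so such a check would return false and the inserts would
stay with the leader.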

Thoughts?

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com


