Re: Parallel Inserts in CREATE TABLE AS - Mailing list pgsql-hackers

From vignesh C
Subject Re: Parallel Inserts in CREATE TABLE AS
Date
Msg-id CALDaNm0v0Z6sNL2t=EMwyZt=UutVVGNZPNEx82cMMw4-Steyqg@mail.gmail.com
In response to Re: Parallel Inserts in CREATE TABLE AS  (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>)
Responses Re: Parallel Inserts in CREATE TABLE AS
List pgsql-hackers
On Wed, Dec 30, 2020 at 9:25 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
>
> On Wed, Dec 30, 2020 at 5:22 AM Zhihong Yu <zyu@yugabyte.com> wrote:
> > w.r.t. v17-0004-Enable-CTAS-Parallel-Inserts-For-Append.patch
> >
> > + * Push the dest receiver to Gather node when it is either at the top of the
> > + * plan or under top Append node unless it does not have any projections to do.
> >
> > I think the 'unless' should be 'if'. As can be seen from the body of the method:
> >
> > +       if (!ps->ps_ProjInfo)
> > +       {
> > +           GatherState *gstate = (GatherState *) ps;
> > +
> > +           parallel = true;
>
> Thanks. Modified it in the 0004 patch. Attaching v18 patch set. Note
> that no change in 0001 to 0003 patches from v17.
>
> Please consider v18 patch set for further review.
>

A few comments:
-       /*
-        * To allow parallel inserts, we need to ensure that they are safe to be
-        * performed in workers. We have the infrastructure to allow parallel
-        * inserts in general except for the cases where inserts generate a new
-        * CommandId (eg. inserts into a table having a foreign key column).
-        */
-       if (IsParallelWorker())
-               ereport(ERROR,
-                               (errcode(ERRCODE_INVALID_TRANSACTION_STATE),
-                                errmsg("cannot insert tuples in a parallel worker")));

Rather than removing this check altogether, is it possible to keep it and
skip the error only when it is a CTAS insert? We do not support inserts in
parallel workers from any other caller as of now.
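
To illustrate the kind of check I mean, here is a minimal standalone sketch.
It is not PostgreSQL code: InsertKind, INSERT_KIND_CTAS and
parallel_insert_allowed() are hypothetical names used only for illustration.
In the actual patch the condition would presumably sit next to the existing
IsParallelWorker() test and still report the error through ereport().

```c
#include <stdbool.h>

/* Hypothetical: identifies which kind of statement is doing the insert. */
typedef enum InsertKind
{
    INSERT_KIND_PLAIN,          /* any insert other than CTAS */
    INSERT_KIND_CTAS            /* insert done on behalf of CREATE TABLE AS */
} InsertKind;

/*
 * Hypothetical helper: returns true if the insert may proceed.
 *
 * Outside a parallel worker every insert is allowed. Inside a parallel
 * worker only CTAS inserts are allowed; for everything else the caller
 * would keep raising "cannot insert tuples in a parallel worker".
 */
static bool
parallel_insert_allowed(bool in_parallel_worker, InsertKind kind)
{
    if (!in_parallel_worker)
        return true;

    return kind == INSERT_KIND_CTAS;
}
```

With something along these lines, the existing error stays in place for all
non-CTAS callers and only the CTAS path is let through in workers.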

+       Oid                     objectid;               /* workers to open relation/table.  */
+       /* Number of tuples inserted by all the workers. */
+       pg_atomic_uint64        processed;

In the comment, we can just say "relation" instead of "relation/table".

+select explain_pictas(
+'create table parallel_write as select length(stringu1) from tenk1;');
+                      explain_pictas
+----------------------------------------------------------
+ Gather (actual rows=N loops=N)
+   Workers Planned: 4
+   Workers Launched: N
+ ->  Create parallel_write
+   ->  Parallel Seq Scan on tenk1 (actual rows=N loops=N)
+(5 rows)
+
+select count(*) from parallel_write;

Can we include a selection of cmin and xmin in one of the tests, to verify
that the parallel workers use the same transaction id? Something like:
select distinct(cmin,xmin) from parallel_write;

Regards,
Vignesh
EnterpriseDB: http://www.enterprisedb.com