Re: Parallel Inserts in CREATE TABLE AS - Mailing list pgsql-hackers
From | Bharath Rupireddy |
---|---|
Subject | Re: Parallel Inserts in CREATE TABLE AS |
Date | |
Msg-id | CALj2ACWsRC9O+bpyEEgAg6NGRU7e7-c2jPE8vgZ5iW9TKfEVDw@mail.gmail.com Whole thread Raw |
In response to | Re: Parallel Inserts in CREATE TABLE AS (vignesh C <vignesh21@gmail.com>) |
Responses |
Re: Parallel Inserts in CREATE TABLE AS
(vignesh C <vignesh21@gmail.com>)
|
List | pgsql-hackers |
On Wed, Dec 30, 2020 at 5:28 PM vignesh C <vignesh21@gmail.com> wrote: > Few comments: > - /* > - * To allow parallel inserts, we need to ensure that they are safe to be > - * performed in workers. We have the infrastructure to allow parallel > - * inserts in general except for the cases where inserts generate a new > - * CommandId (eg. inserts into a table having a foreign key column). > - */ > - if (IsParallelWorker()) > - ereport(ERROR, > - (errcode(ERRCODE_INVALID_TRANSACTION_STATE), > - errmsg("cannot insert tuples in a > parallel worker"))); > > Is it possible to add a check if it is a CTAS insert here as we do not > support insert in parallel workers from others as of now. Currently, there's no global variable in which we can selectively skip this in case of parallel insertion in CTAS. How about having a variable in any of the worker global contexts, set that when parallel insertion is chosen for CTAS and use that in heap_prepare_insert() to skip the above error? Eventually, we can remove this restriction entirely in case we fully allow parallelism for INSERT INTO SELECT, CTAS, and COPY. Thoughts? > + Oid objectid; /* workers to > open relation/table. */ > + /* Number of tuples inserted by all the workers. */ > + pg_atomic_uint64 processed; > > We can just mention relation instead of relation/table. I will modify it in the next patch set. > +select explain_pictas( > +'create table parallel_write as select length(stringu1) from tenk1;'); > + explain_pictas > +---------------------------------------------------------- > + Gather (actual rows=N loops=N) > + Workers Planned: 4 > + Workers Launched: N > + -> Create parallel_write > + -> Parallel Seq Scan on tenk1 (actual rows=N loops=N) > +(5 rows) > + > +select count(*) from parallel_write; > > Can we include selection of cmin, xmin for one of the test to verify > that it uses the same transaction id in the parallel workers > something like: > select distinct(cmin,xmin) from parallel_write; This is not possible since cmin and xmin are dynamic, we can not use them in test cases. I think it's not necessary to check whether the leader and workers are in the same txn or not, since we are not creating a new txn. All the txn state from the leader is serialized in SerializeTransactionState and restored in StartParallelWorkerTransaction. With Regards, Bharath Rupireddy. EnterpriseDB: http://www.enterprisedb.com
pgsql-hackers by date: