Re: Parallel Inserts in CREATE TABLE AS - Mailing list pgsql-hackers

From vignesh C
Subject Re: Parallel Inserts in CREATE TABLE AS
Date
Msg-id CALDaNm1XpwEoHS9U_zZ2GPSZ_qKZAc=VSa4VTO66J6k5Gzr=8Q@mail.gmail.com
In response to Re: Parallel Inserts in CREATE TABLE AS  (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>)
Responses Re: Parallel Inserts in CREATE TABLE AS
Re: Parallel Inserts in CREATE TABLE AS
List pgsql-hackers
On Tue, Dec 22, 2020 at 2:16 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
>
> On Tue, Dec 22, 2020 at 12:32 PM Bharath Rupireddy
> Attaching v14 patch set that has above changes. Please consider this
> for further review.
>

A few comments:
In the case below, should the Create node be above Gather?
postgres=# explain  create table t7 as select * from t6;
                            QUERY PLAN
-------------------------------------------------------------------
 Gather  (cost=0.00..9.17 rows=0 width=4)
   Workers Planned: 2
 ->  Create t7
   ->  Parallel Seq Scan on t6  (cost=0.00..9.17 rows=417 width=4)
(4 rows)

Can we change it to something like:
-------------------------------------------------------------------
Create t7
 -> Gather  (cost=0.00..9.17 rows=0 width=4)
  Workers Planned: 2
  ->  Parallel Seq Scan on t6  (cost=0.00..9.17 rows=417 width=4)
(4 rows)

You could set intoclause_len = strlen(intoclausestr) + 1 instead of
strlen(intoclausestr), and then use intoclause_len in the remaining
places. That avoids repeating the +1 in the other places.
+       /* Estimate space for into clause for CTAS. */
+       if (IS_CTAS(intoclause) && OidIsValid(objectid))
+       {
+               intoclausestr = nodeToString(intoclause);
+               intoclause_len = strlen(intoclausestr);
+               shm_toc_estimate_chunk(&pcxt->estimator, intoclause_len + 1);
+               shm_toc_estimate_keys(&pcxt->estimator, 1);
+       }
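
To illustrate the suggestion outside the patch, here is a minimal
standalone sketch. serialized_len() and copy_serialized() are
hypothetical names, and malloc/memcpy only stand in for the shm_toc
estimate/allocate/copy steps; the point is just that the length is
computed once with the terminator included and then reused.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not part of the patch: number of bytes needed to
 * store str, including the terminating NUL.
 */
static size_t
serialized_len(const char *str)
{
	assert(str != NULL);
	return strlen(str) + 1;
}

/*
 * Hypothetical stand-in for the estimate/allocate/copy sequence. The same
 * length value sizes the buffer and drives the copy, so no "+ 1" is
 * repeated at the call sites.
 */
static char *
copy_serialized(const char *str, size_t *len_out)
{
	size_t		len = serialized_len(str);
	char	   *buf = malloc(len);

	if (buf == NULL)
		return NULL;
	memcpy(buf, str, len);		/* copies the NUL terminator as well */
	*len_out = len;
	return buf;
}
```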

Can we check node->nworkers_launched == 0 directly in place of
node->need_to_scan_locally? That way the setting and resetting of
node->need_to_scan_locally can be removed, unless need_to_scan_locally
is needed by any of the functions that get called.
+       /* Enable leader to insert in case no parallel workers were launched. */
+       if (node->nworkers_launched == 0)
+               node->need_to_scan_locally = true;
+
+       /*
+        * By now, for parallel workers (if launched any), would have started their
+        * work i.e. insertion to target table. In case the leader is chosen to
+        * participate for parallel inserts in CTAS, then finish its share before
+        * going to wait for the parallel workers to finish.
+        */
+       if (node->need_to_scan_locally)
+       {

Regards,
Vignesh
EnterpriseDB: http://www.enterprisedb.com


