RE: Parallel Inserts in CREATE TABLE AS - Mailing list pgsql-hackers

From Hou, Zhijie
Subject RE: Parallel Inserts in CREATE TABLE AS
Date
Msg-id cba3db30f802466ba257a26515cd75ed@G08CNEXMBPEKD05.g08.fujitsu.local
Whole thread Raw
In response to Re: Parallel Inserts in CREATE TABLE AS  (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>)
Responses Re: Parallel Inserts in CREATE TABLE AS  (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>)
List pgsql-hackers
Hi,

I'm very interested in this feature,
and I'm looking at the patch, here are some comments.

1.
+            if (!TupIsNull(outerTupleSlot))
+            {
+                (void) node->ps.dest->receiveSlot(outerTupleSlot, node->ps.dest);
+                node->ps.state->es_processed++;
+            }
+
+            if(TupIsNull(outerTupleSlot))
+                break;
+        }

How about the following style:

        if(TupIsNull(outerTupleSlot))
            Break;
        
        (void) node->ps.dest->receiveSlot(outerTupleSlot, node->ps.dest);
        node->ps.state->es_processed++;

Which looks cleaner.


2.
+
+    if (into != NULL &&
+        IsA(into, IntoClause))
+    {

The check can be replaced by ISCTAS(into).


3.
+    /*
+     * For parallelizing inserts in CTAS i.e. making each
+     * parallel worker inerst it's tuples, we must send
+     * information such as intoclause(for each worker

'inerst' looks like a typo (insert).


4.
+    /* Estimate space for into clause for CTAS. */
+    if (ISCTAS(planstate->intoclause))
+    {
+        intoclausestr = nodeToString(planstate->intoclause);
+        shm_toc_estimate_chunk(&pcxt->estimator, strlen(intoclausestr) + 1);
+        shm_toc_estimate_keys(&pcxt->estimator, 1);
+    }
...
+    if (intoclausestr != NULL)
+    {
+        char *shmptr = (char *)shm_toc_allocate(pcxt->toc,
+                                                strlen(intoclausestr) + 1);
+        strcpy(shmptr, intoclausestr);
+        shm_toc_insert(pcxt->toc, PARALLEL_KEY_INTO_CLAUSE, shmptr);
+    }

The code here call strlen(intoclausestr) for two times,
After checking the existing code in ExecInitParallelPlan,
It used to store the strlen in a variable.

So how about the following style:

    intoclause_len = strlen(intoclausestr);
    ...
    /* Store serialized intoclause. */
    intoclause_space = shm_toc_allocate(pcxt->toc, intoclause_len + 1);
    memcpy(shmptr, intoclausestr, intoclause_len + 1);
    shm_toc_insert(pcxt->toc, PARALLEL_KEY_INTO_CLAUSE, intoclause_space);

the code in ExecInitParallelPlan 


5.
+    if (intoclausestr != NULL)
+    {
+        char *shmptr = (char *)shm_toc_allocate(pcxt->toc,
+                                                strlen(intoclausestr) + 1);
+        strcpy(shmptr, intoclausestr);
+        shm_toc_insert(pcxt->toc, PARALLEL_KEY_INTO_CLAUSE, shmptr);
+    }
+
     /* Set up the tuple queues that the workers will write into. */
-    pei->tqueue = ExecParallelSetupTupleQueues(pcxt, false);
+    if (intoclausestr == NULL)
+        pei->tqueue = ExecParallelSetupTupleQueues(pcxt, false);

The two check about intoclausestr seems can be combined like:

if (intoclausestr != NULL)
{
...
}
else
{
...
}

Best regards,
houzj





pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: scram-sha-256 broken with FIPS and OpenSSL 1.0.2
Next
From: Li Japin
Date:
Subject: Remove cache_plan argument comments to ri_PlanCheck