Re: [HACKERS] Parallel Append implementation - Mailing list pgsql-hackers

From Robert Haas
Subject Re: [HACKERS] Parallel Append implementation
Date
Msg-id CA+TgmoaJ3+KYFusQj6cqcsi_jSTt6V=3PKCit=c=t4nsZty0gw@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] Parallel Append implementation  (Amit Khandekar <amitdkhan.pg@gmail.com>)
Responses Re: [HACKERS] Parallel Append implementation  (Ashutosh Bapat <ashutosh.bapat@enterprisedb.com>)
Re: [HACKERS] Parallel Append implementation  (Amit Khandekar <amitdkhan.pg@gmail.com>)
List pgsql-hackers
On Wed, Mar 8, 2017 at 2:00 AM, Amit Khandekar <amitdkhan.pg@gmail.com> wrote:
> Yeah, that seems to be right in most of the cases. The only cases
> where your formula seems to give too few workers is for something like
> : (2, 8, 8). For such subplans, we should at least allocate 8 workers.
> It turns out that in most of the cases in my formula, the Append
> workers allocated is just 1 worker more than the max per-subplan
> worker count. So in (2, 1, 1, 8), it will be a fraction more than 8.
> So in the patch, in addition to the log2() formula you proposed, I
> have made sure that it allocates at least equal to max(per-subplan
> parallel_workers values).

Yeah, I agree with that.

Some review:

+typedef struct ParallelAppendDescData
+{
+    slock_t        pa_mutex;        /* mutual exclusion to choose
next subplan */
+    ParallelAppendInfo pa_info[FLEXIBLE_ARRAY_MEMBER];
+} ParallelAppendDescData;

Instead of having ParallelAppendInfo, how about just int
pa_workers[FLEXIBLE_ARRAY_MEMBER]?  The second structure seems like
overkill, at least for now.

+static inline void
+exec_append_scan_first(AppendState *appendstate)
+{
+    appendstate->as_whichplan = 0;
+}

I don't think this is buying you anything, and suggest backing it out.

+        /* Backward scan is not supported by parallel-aware plans */
+        Assert(!ScanDirectionIsBackward(appendstate->ps.state->es_direction));

I think you could assert ScanDirectionIsForward, couldn't you?
NoMovement, I assume, is right out.

+            elog(DEBUG2, "ParallelAppend : pid %d : all plans already
finished",
+                         MyProcPid);

Please remove (and all similar cases also).

+                 sizeof(*node->as_padesc->pa_info) * node->as_nplans);

I'd use the type name instead.

+    for (i = 0; i < node->as_nplans; i++)
+    {
+        /*
+         * Just setting all the number of workers to 0 is enough. The logic
+         * of choosing the next plan in workers will take care of everything
+         * else.
+         */
+        padesc->pa_info[i].pa_num_workers = 0;
+    }

Here I'd use memset.

+    return (min_whichplan == PA_INVALID_PLAN ? false : true);

Maybe just return (min_whichplan != PA_INVALID_PLAN);

-                                              childrel->cheapest_total_path);
+
childrel->cheapest_total_path);

Unnecessary.

+        {            partial_subpaths = accumulate_append_subpath(partial_subpaths,
  linitial(childrel->partial_pathlist));
 
+        }

Don't need to add braces.

+            /*
+             * Extract the first unparameterized, parallel-safe one among the
+             * child paths.
+             */

Can we use get_cheapest_parallel_safe_total_inner for this, from
a71f10189dc10a2fe422158a2c9409e0f77c6b9e?

+        if (rel->partial_pathlist != NIL &&
+            (Path *) linitial(rel->partial_pathlist) == subpath)
+            partial_subplans_set = bms_add_member(partial_subplans_set, i);

This seems like a scary way to figure this out.  What if we wanted to
build a parallel append subpath with some path other than the
cheapest, for some reason?  I think you ought to record the decision
that set_append_rel_pathlist makes about whether to use a partial path
or a parallel-safe path, and then just copy it over here.

-                create_append_path(grouped_rel,
-                                   paths,
-                                   NULL,
-                                   0);
+                create_append_path(grouped_rel, paths, NULL, 0);

Unnecessary.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



pgsql-hackers by date:

Previous
From: Peter Geoghegan
Date:
Subject: Re: [HACKERS] Cost model for parallel CREATE INDEX
Next
From: Robert Haas
Date:
Subject: Re: [HACKERS] Cost model for parallel CREATE INDEX