Re: [HACKERS] Parallel Append implementation - Mailing list pgsql-hackers

From Ashutosh Bapat
Subject Re: [HACKERS] Parallel Append implementation
Date
Msg-id CAFjFpRe1AvJS+xJUs6iXetw0LvVg86g7SW4ocHw9S7oWmCTViQ@mail.gmail.com
In response to Re: [HACKERS] Parallel Append implementation  (Ashutosh Bapat <ashutosh.bapat@enterprisedb.com>)
Responses Re: [HACKERS] Parallel Append implementation  (Amit Khandekar <amitdkhan.pg@gmail.com>)
List pgsql-hackers
On Fri, Mar 10, 2017 at 11:33 AM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
>>
>> But as far as the code is concerned, I think the two-list approach
>> will turn out to be less simple if we derive two corresponding
>> arrays in the AppendState node. Handling two different arrays during
>> execution does not look clean, whereas the bitmapset that I have used
>> in Append has turned out to be very simple. I just had to do the
>> check below (and that is the only location) to see whether a subplan
>> is partial or non-partial. There is no special handling for
>> non-partial subpaths anywhere else.
>>
>> /*
>>  * Increment worker count for the chosen node, if at all we found one.
>>  * For non-partial plans, set it to -1 instead, so that no other workers
>>  * run it.
>>  */
>> if (min_whichplan != PA_INVALID_PLAN)
>> {
>>     if (bms_is_member(min_whichplan,
>>                       ((Append *) state->ps.plan)->partial_subplans_set))
>>         padesc->pa_info[min_whichplan].pa_num_workers++;
>>     else
>>         padesc->pa_info[min_whichplan].pa_num_workers = -1;
>> }
>>
>> Now, since the Bitmapset field is used during execution with such
>> simplicity, why not have the same data structure in AppendPath and
>> re-use the bitmapset field in the Append plan node without making a
>> copy of it? Otherwise, if we have two lists in AppendPath and a
>> bitmapset in Append, there will again be code needed to convert
>> between the two data structures.
>>
>
> I think there is some merit in separating out the non-parallel and
> parallel plans, whether within the same array or outside it. The
> current logic to assign a plan to a worker looks at all the plans,
> unnecessarily hopping over the non-parallel ones after they have been
> given to a worker. If we separate the two, we can first keep assigning
> new workers to the non-parallel plans, and then iterate over the
> parallel ones when a worker needs a plan to execute. We might thus
> eliminate the need for the special worker-count value -1. You may
> separate the two kinds into two different arrays, or keep them within
> the same array and remember the smallest index of a parallel plan.
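
Roughly, the kind of layout I have in mind is sketched below. All the
structure and field names here are made up just for illustration; this
is only a sketch of the idea, not actual patch code.

/*
 * Sketch: shared descriptor with subplans laid out so that all
 * non-partial (single-worker) plans come first, followed by the
 * partial ones.  first_partial_plan marks the boundary, so the
 * special worker count -1 is no longer needed.
 */
typedef struct ParallelAppendDescData
{
    slock_t     mutex;              /* spinlock; needs storage/spin.h */
    int         as_nplans;          /* total number of subplans */
    int         first_partial_plan; /* index of the first partial subplan */
    int         next_nonpartial;    /* next non-partial plan to hand out */
    int         next_plan;          /* next partial plan, used circularly */
} ParallelAppendDescData;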

Further to that, with this scheme and the scheme to distribute workers
equally irrespective of the maximum workers per plan, you don't need
to "scan" the subplans to find the one with the minimum number of
workers. If you treat the array of parallel plans as a circular queue,
the plan to be assigned to the next worker is always the one after the
plan that was assigned most recently. Once you have assigned workers
to the non-parallel plans, initialize a shared variable next_plan to
point to the first parallel plan. When a worker comes asking for a
plan, assign the plan pointed to by next_plan and advance it to the
next plan in the circular queue.
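
To make that concrete, here is a rough sketch of the worker-side logic
(again, all names are made up; it assumes the layout sketched above,
with the shared fields protected by the spinlock):

/*
 * Sketch: pick the next subplan for a worker.  Plans in
 * [0, first_partial_plan) are non-partial and are handed out to
 * exactly one worker each; plans in [first_partial_plan, as_nplans)
 * are partial and are handed out round-robin, treating that part of
 * the array as a circular queue.  Returns -1 if no plan is available.
 */
static int
choose_next_subplan(ParallelAppendDescData *padesc)
{
    int         chosen = -1;

    SpinLockAcquire(&padesc->mutex);

    if (padesc->next_nonpartial < padesc->first_partial_plan)
    {
        /* Each non-parallel plan goes to exactly one worker. */
        chosen = padesc->next_nonpartial++;
    }
    else if (padesc->first_partial_plan < padesc->as_nplans)
    {
        /* Round-robin over the partial plans. */
        chosen = padesc->next_plan;
        if (++padesc->next_plan >= padesc->as_nplans)
            padesc->next_plan = padesc->first_partial_plan;
    }

    SpinLockRelease(&padesc->mutex);
    return chosen;
}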

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


