Re: [HACKERS] Parallel Append implementation - Mailing list pgsql-hackers

From Amit Khandekar
Subject Re: [HACKERS] Parallel Append implementation
Date
Msg-id CAJ3gD9f5kTa_pQUCisunMbQi8MZbn_oPv+G-_te57+e3o+8UDw@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] Parallel Append implementation  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: [HACKERS] Parallel Append implementation  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
After giving more thought to our discussions, I have have used the
Bitmapset structure in AppendPath as against having two lists one for
partial and other for non-partial paths. Attached is the patch v6 that
has the required changes. So accumulate_append_subpath() now also
prepares the bitmapset containing the information about which paths
are partial paths. This is what I had done in the first version.

At this point of time, I have not given sufficient time to think about
Ashutosh's proposal of just keeping track of the next_subplan which he
mentioned. There, we just keep assigning workers to a circle of
subplans in round-robin style. But I think as of now the approach of
choosing the minimum worker subplan is pretty simple looking. So the
patch v6 is in a working condition using minimum-worker approach.

On 9 March 2017 at 07:22, Robert Haas <robertmhaas@gmail.com> wrote:

> Some review:
>
> +typedef struct ParallelAppendDescData
> +{
> +    slock_t        pa_mutex;        /* mutual exclusion to choose
> next subplan */
> +    ParallelAppendInfo pa_info[FLEXIBLE_ARRAY_MEMBER];
> +} ParallelAppendDescData;
>
> Instead of having ParallelAppendInfo, how about just int
> pa_workers[FLEXIBLE_ARRAY_MEMBER]?  The second structure seems like
> overkill, at least for now.

I have , for now, kept the structure there, just in case after further
discussion we may add something.

>
> +static inline void
> +exec_append_scan_first(AppendState *appendstate)
> +{
> +    appendstate->as_whichplan = 0;
> +}
>
> I don't think this is buying you anything, and suggest backing it out.

This is required for sequential Append, so that we can start executing
from the first subplan.

>
> +        /* Backward scan is not supported by parallel-aware plans */
> +        Assert(!ScanDirectionIsBackward(appendstate->ps.state->es_direction));
>
> I think you could assert ScanDirectionIsForward, couldn't you?
> NoMovement, I assume, is right out.

Right. Changed.

>
> +            elog(DEBUG2, "ParallelAppend : pid %d : all plans already
> finished",
> +                         MyProcPid);
>
> Please remove (and all similar cases also).

Removed at multiple places.

>
> +                 sizeof(*node->as_padesc->pa_info) * node->as_nplans);
>
> I'd use the type name instead.

Done.

>
> +    for (i = 0; i < node->as_nplans; i++)
> +    {
> +        /*
> +         * Just setting all the number of workers to 0 is enough. The logic
> +         * of choosing the next plan in workers will take care of everything
> +         * else.
> +         */
> +        padesc->pa_info[i].pa_num_workers = 0;
> +    }
>
> Here I'd use memset.

Done.

>
> +    return (min_whichplan == PA_INVALID_PLAN ? false : true);
>
> Maybe just return (min_whichplan != PA_INVALID_PLAN);

Done.

>
> -                                              childrel->cheapest_total_path);
> +
> childrel->cheapest_total_path);
>
> Unnecessary.

This call is now having more param, so kept the change.
>
> +        {
>              partial_subpaths = accumulate_append_subpath(partial_subpaths,
>                                         linitial(childrel->partial_pathlist));
> +        }
>
> Don't need to add braces.

Removed them.

>
> +            /*
> +             * Extract the first unparameterized, parallel-safe one among the
> +             * child paths.
> +             */
>
> Can we use get_cheapest_parallel_safe_total_inner for this, from
> a71f10189dc10a2fe422158a2c9409e0f77c6b9e?

Yes, Fixed.

>
> +        if (rel->partial_pathlist != NIL &&
> +            (Path *) linitial(rel->partial_pathlist) == subpath)
> +            partial_subplans_set = bms_add_member(partial_subplans_set, i);
>
> This seems like a scary way to figure this out.  What if we wanted to
> build a parallel append subpath with some path other than the
> cheapest, for some reason?  I think you ought to record the decision
> that set_append_rel_pathlist makes about whether to use a partial path
> or a parallel-safe path, and then just copy it over here.

As mentioned above, used Bitmapset in AppendPath.

>
> -                create_append_path(grouped_rel,
> -                                   paths,
> -                                   NULL,
> -                                   0);
> +                create_append_path(grouped_rel, paths, NULL, 0);
>
> Unnecessary.

Now since there was anyway a change in the number of params, I kept
the single line call.

Please refer to attached patch version v6 for all of the above changes.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Attachment

pgsql-hackers by date:

Previous
From: Alexander Korotkov
Date:
Subject: Re: [HACKERS] Should we cacheline align PGXACT?
Next
From: Stephen Frost
Date:
Subject: Re: [HACKERS] PATCH: Configurable file mode mask