Re: [HACKERS] Parallel Append implementation - Mailing list pgsql-hackers
From | Amit Kapila |
---|---|
Subject | Re: [HACKERS] Parallel Append implementation |
Date | |
Msg-id | CAA4eK1++-H=04TjeFKAkbDZ4TL0W092ryAqf8mrjU+pTQUS8+A@mail.gmail.com Whole thread Raw |
In response to | Re: [HACKERS] Parallel Append implementation (Amit Khandekar <amitdkhan.pg@gmail.com>) |
Responses |
Re: [HACKERS] Parallel Append implementation
Re: [HACKERS] Parallel Append implementation |
List | pgsql-hackers |
On Mon, Sep 11, 2017 at 4:49 PM, Amit Khandekar <amitdkhan.pg@gmail.com> wrote: > On 8 September 2017 at 19:17, Amit Kapila <amit.kapila16@gmail.com> wrote: >> >> In that case, why can't we keep the workers also process in same >> order, what is the harm in that? > > Because of the way the logic of queuing works, the workers finish > earlier if they start with expensive plans first. For e.g. : with 3 > plans with costs 8, 4, 4 and with 2 workers w1 and w2, they will > finish in 8 time units (w1 will finish plan 1 in 8, while in parallel > w2 will finish the remaining 2 plans in 8 units. Whereas if the plans > are ordered like : 4, 4, 8, then the workers will finish in 12 time > units (w1 and w2 will finish each of the 1st two plans in 4 units, and > then w1 or w2 will take up plan 3 and finish in 8 units, while the > other worker remains idle). > I think the patch stores only non-partial paths in decreasing order, what if partial paths having more costs follows those paths? > >> >> Few more comments: >> >> 1. >> + else if (IsA(subpath, MergeAppendPath)) >> + { >> + MergeAppendPath *mpath = (MergeAppendPath *) subpath; >> + >> + /* >> + * If at all MergeAppend is partial, all its child plans have to be >> + * partial : we don't currently support a mix of partial and >> + * non-partial MergeAppend subpaths. >> + */ >> + if (is_partial) >> + return list_concat(partial_subpaths, list_copy(mpath->subpaths)); >> >> In which situation partial MergeAppendPath is generated? Can you >> provide one example of such path? > > Actually currently we don't support partial paths for MergeAppendPath. > That code just has that if condition (is_partial) but currently that > condition won't be true for MergeAppendPath. > I think then it is better to have Assert for MergeAppend. It is generally not a good idea to add code which we can never hit. >> >> 2. >> add_paths_to_append_rel() .. >> >> I think it might be better to add a sentence why we choose a different >> way to decide a number of workers in the second case >> (non-parallel-aware append). > > Yes, I agree. Will do that after we conclude with your next point below ... > >> Do you think non-parallel-aware Append >> will be better in any case when there is a parallel-aware append? I >> mean to say let's try to create non-parallel-aware append only when >> parallel-aware append is not possible. > > By non-parallel-aware append, I am assuming you meant partial > non-parallel-aware Append. Yes, if the parallel-aware Append path has > *all* partial subpaths chosen, then we do omit a partial non-parallel > Append path, as seen in this code in the patch : > > /* > * Consider non-parallel partial append path. But if the parallel append > * path is made out of all partial subpaths, don't create another partial > * path; we will keep only the parallel append path in that case. > */ > if (partial_subpaths_valid && !pa_all_partial_subpaths) > { > ...... > } > > But if the parallel-Append path has a mix of partial and non-partial > subpaths, then we can't really tell which of the two could be cheapest > until we calculate the cost. It can be that the non-parallel-aware > partial Append can be cheaper as well. > How? See, if you have four partial subpaths and two non-partial subpaths, then for parallel-aware append it considers all six paths in parallel path whereas for non-parallel-aware append it will consider just four paths and that too with sub-optimal strategy. Can you please try to give me some example so that it will be clear. >> >> 4. >> - select count(*) from a_star; >> -select count(*) from a_star; >> + select round(avg(aa)), sum(aa) from a_star; >> +select round(avg(aa)), sum(aa) from a_star; >> >> Why you have changed the existing test. It seems count(*) will also >> give what you are expecting. > > Needed to do cover some data testing with Parallel Append execution. > Okay. One more thing, I think you might want to update comment atop add_paths_to_append_rel to reflect the new type of parallel-aware append path being generated. In particular, I am referring to below part of the comment: * Similarly it collects partial paths from* non-dummy children to create partial append paths.*/ static void add_paths_to_append_rel() -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
pgsql-hackers by date: