Re: [HACKERS] Proposal : Parallel Merge Join - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: [HACKERS] Proposal : Parallel Merge Join
Date
Msg-id CAA4eK1K=_qkc67T_M82OxdT7Au9h7WSJUgmLk5Oe88tie0Jx3w@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] Proposal : Parallel Merge Join  (Dilip Kumar <dilipbalaut@gmail.com>)
Responses Re: [HACKERS] Proposal : Parallel Merge Join  (Dilip Kumar <dilipbalaut@gmail.com>)
Re: [HACKERS] Proposal : Parallel Merge Join  (Dilip Kumar <dilipbalaut@gmail.com>)
List pgsql-hackers
On Tue, Feb 14, 2017 at 5:22 PM, Dilip Kumar <dilipbalaut@gmail.com> wrote:
> On Tue, Feb 14, 2017 at 12:25 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> Apart from this, there was one small problem in an earlier version
> which I have corrected in this.
>
> + /* Consider only parallel safe inner path */
> + if (innerpath != NULL &&
> + innerpath->parallel_safe &&
> + (cheapest_total_inner == NULL ||
> + cheapest_total_inner->parallel_safe == false ||
> + compare_path_costs(innerpath, cheapest_total_inner,
> + TOTAL_COST) < 0))
>
> In this comparison, we were only checking if innerpath is cheaper than
> the cheapest_total_inner then generate path with this new inner path
> as well. Now, I have added one more check if cheapest_total_inner was
> not parallel safe then also consider a path with this new inner
> (provided this inner is parallel safe).
>


What advantage do you see for considering such a path when the cost of
innerpath is more than cheapest_total_inner?  Remember the more paths
we try to consider, the more time we spend in the planner.  By any
chance are you able to generate any query where it will give benefit
by considering costlier innerpath?

2.
+static void generate_parallel_mergejoin_paths(PlannerInfo *root,
+  RelOptInfo *joinrel,
+  RelOptInfo *innerrel,
+  Path *outerpath,
+  JoinType jointype,
+  JoinPathExtraData *extra,
+  Path *inner_cheapest_total,
+  List *merge_pathkeys);

It is better to name this function as
generate_partial_mergejoin_paths() as we are generating only partial
paths in this function and accordingly change the comment on top of
the function.  I see that you might be naming it based on
consider_parallel_*, however, I think it is better to use partial in
the name as that is what we are doing inside that function.  Also, I
think this function has removed/changed some handling related to
unique outer and full joins, so it is better to mention that in the
function comments, something like "unlike above function, this
function doesn't expect to handle join types JOIN_UNIQUE_OUTER or
JOIN_FULL" and add Assert for both of those types.

3.
A test case is still missing.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: "Okano, Naoki"
Date:
Subject: Re: [HACKERS] Keep ECPG comment for log_min_duration_statement
Next
From: Robert Haas
Date:
Subject: Re: [HACKERS] Patch: Write Amplification Reduction Method (WARM)