Re: Properly pathify the union planner - Mailing list pgsql-hackers

From Andy Fan
Subject Re: Properly pathify the union planner
Date
Msg-id 87o7cti48f.fsf@163.com
Whole thread Raw
In response to Re: Properly pathify the union planner  (Richard Guo <guofenglinux@gmail.com>)
Responses Re: Properly pathify the union planner
List pgsql-hackers
Hi,

> * I think we should update truncate_useless_pathkeys() to account for
> the ordering requested by the query's set operation;

Nice catch.

> I'm thinking that maybe it'd be better to move the work of sorting the
> subquery's paths to the outer query level, specifically within the
> build_setop_child_paths() function, just before we stick SubqueryScanPath
> on top of the subquery's paths.  I think this is better because:
>
> 1. This minimizes the impact on subquery planning and reduces the
> footprint within the grouping_planner() function as much as possible.
>
> 2. This can help avoid the aforementioned add_path() issue because the
> two involved paths will be structured as:
>
>     cheapest_path -> subqueryscan
> and
>     cheapest_path -> sort -> subqueryscan
>
> If the two paths cost fuzzily the same and add_path() decides to keep
> the second one due to it having better pathkeys and pfree the first one,
> it would not be a problem.

This is a smart idea, it works because you create a two different
subqueryscan for the cheapest_input_path.

FWIW, I found we didn't create_sort_path during building a merge join
path, instead it just cost the sort and add it to the cost of mergejoin
path only and note this path needs a presorted data. At last during the
create_mergejoin_*plan*, it create the sort_plan really. As for the
mergeappend case, could we use the similar strategy? with this way, we
might simpliy the code to use MergeAppend node since the caller just
need to say I want to try MergeAppend with the given pathkeys without
really creating the sort by themselves. 

(Have a quick glance of initial_cost_mergejoin and
create_mergejoin_plan, looks incremental sort doesn't work with mergejoin?)

>
> To assist the discussion I've attached a diff file that includes all the
> changes above.

+ */
+static int
+pathkeys_useful_for_setop(PlannerInfo *root, List *pathkeys)
+{
+    int            n_common_pathkeys;
+
+    if (root->setop_pathkeys == NIL)
+        return 0;                /* no special setop ordering requested */
+
+    if (pathkeys == NIL)
+        return 0;                /* unordered path */
+
+    (void) pathkeys_count_contained_in(root->setop_pathkeys, pathkeys,
+                                       &n_common_pathkeys);
+
+    return n_common_pathkeys;
+}

The two if-clauses looks unnecessary, it should be handled by
pathkeys_count_contained_in already. The same issue exists in
pathkeys_useful_for_ordering as well. Attached patch fix it in master.

-- 
Best Regards
Andy Fan


Attachment

pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: confusing / inefficient "need_transcoding" handling in copy
Next
From: Andres Freund
Date:
Subject: Re: Make COPY format extendable: Extract COPY TO format implementations