Using indices for UNION. - Mailing list pgsql-hackers

From Kyotaro HORIGUCHI
Subject Using indices for UNION.
Date
Msg-id 20131031.194251.12577697.horiguchi.kyotaro@lab.ntt.co.jp
Whole thread Raw
Responses Re: Using indices for UNION.  (Peter Eisentraut <peter_e@gmx.net>)
Re: Using indices for UNION.  (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>)
List pgsql-hackers
Hello, This patch might be too complicated (and seems somewhat ad
hoc) for the gain, but more index usage for this kind of
operation should be worth doing.

Currently, PostgreSQL ignores from the very first the availablity
of indexes for UNION. Sorting and SeqScan is choosed as follows,

* EX.1
> uniontest=# create table c11 (a int, b int, c int, d text);
> uniontest=# create unique index cu11_pkey_idx on c11 (a, b)
> uniontest=# create table c12 (like cu11 including all);
> uniontest=# insert into cu11 (select a / 5, 4 - (a % 5), a, 'cu11'
>                               from generate_series(000000, 099999) a);
> uniontest=# insert into cu12 (select a / 5, 4 - (a % 5), a, 'cu12'
>                               from generate_series(100000, 199999) a);
> uniontest=# explain (analyze on, costs off)
>   select a, b from cu11 union select a, b from cu12 order by a, b limit 100;
>                            QUERY PLAN
> ---------------------------------------------------------------------
> Limit (actual time=287.611..287.673 rows=100 loops=1)
>  ->  Unique (actual time=287.610..287.652 rows=100 loops=1)
>   ->  Sort (actual time=287.609..287.621 rows=100 loops=1)
>    Sort Key: cu11.a, cu11.b
>    Sort Method: external sort  Disk: 3520kB
>    ->  Append (actual time=0.020..73.859 rows=200000 loops=1)
>     ->  Seq Scan on cu11 (actual time=0.019..28.780 rows=100000 loops=1)
>     ->  Seq Scan on cu12 (actual time=0.006..24.550 rows=100000 loops=1)
> Total runtime: 288.596 ms

Actually, index paths are sometimes worth considering on UNION
planning especially with LIMIT. For example,

* EX.2
> uniontest=# explain (analyze on, costs off)
>             select a, b from cu11 union select a, b from cu12
>             order by a desc, b desc limit 100;
>                               QUERY PLAN
> ------------------------------------------------------------------
> Limit (actual time=0.041..0.390 rows=100 loops=1)
>  ->  Unique (actual time=0.040..0.355 rows=100 loops=1)
>   ->  Merge Append (actual time=0.040..0.290 rows=100 loops=1)
>       Sort Key: cu11.a, cu11.b
>    ->  Limit (actual time=0.025..0.025 rows=1 loops=1)
>     ->  Unique (actual time=0.025..0.025 rows=1 loops=1)
>      ->  Index Only Scan Backward on cu11
>                              (actual time=0.022..0.022 rows=1 loops=1)
>            Heap Fetches: 1
>    ->  Limit (actual time=0.012..0.209 rows=100 loops=1)
>     ->  Unique (actual time=0.011..0.188 rows=100 loops=1)
>       ->  Index Only Scan Backward .. on cu12
>                              (actual time=0.010..0.115 rows=100 loops=1)
>                                Heap Fetches: 100
> Total runtime: 1.191 ms

This is archieved by this patch.

Rough explanaiton of what this patch does is following,
- Consider whether ORDER BY or grouping of setops can be pushed  down onto leaf subqueries. (recurse_set_operations)
- Merging two children of a setop counting the sort orders of  them. Use MergeAppend if they are in the same ordering.
(generate_union_plan,recurse_union_children)
 
- Grouping is determined consulting sorting order. This will  allow removing redundant sorting.
(generate_setop_grouplist)

The effects of these items are shown in the Ex.2.

What do you think about this?


=======

Nevertheless, only this patch does not effective so far on the
query like following,

>uniontest=# explain (analyze on, costs off)
>            select * from cu11 union select * from cu12
>            order by a, b limit 100;
>                              QUERY PLAN
> -----------------------------------------------------------------------
> Limit (actual time=256.263..256.346 rows=100 loops=1)
>  ->  Unique (actual time=256.262..256.332 rows=100 loops=1)
>   ->  Sort (actual time=256.261..256.283 rows=100 loops=1)
>       Sort Key: cu11.a, cu11.b, cu11.c, cu11.d
>       Sort Method: external sort  Disk: 5280kB
>    ->  Append (actual time=0.028..61.161 rows=200000 loops=1)
>     ->  Seq Scan on cu11 (actual time=0.027..21.067 rows=100000 loops=1)
>     ->  Seq Scan on cu12 (actual time=0.012..18.428 rows=100000 loops=1)
> Total runtime: 257.778 ms

The indexes made here is UNIQUE one so they should be used for
ordering in this query. This will be overcome by another patch
(will separately proposed).

> uniontest=# explain (analyze on, costs off)
>             select * from cu11 union select * from cu12
>             order by a, b limit 100;
>                              QUERY PLAN
> ---------------------------------------------------------------------------
> Limit (actual time=0.102..0.351 rows=100 loops=1)
>  ->  Unique (actual time=0.100..0.323 rows=100 loops=1)
>   ->  Merge Append (actual time=0.097..0.222 rows=100 loops=1)
>       Sort Key: cu11.a, cu11.b, cu11.c, cu11.d
>    ->  Limit (actual time=0.051..0.135 rows=100 loops=1)
>     ->  Index Scan .. on cu11 (actual time=0.051..0.109 rows=100 loops=1)
>    ->  Limit (actual time=0.044..0.044 rows=1 loops=1)
>     ->  Index Scan .. on cu12 (actual time=0.044..0.044 rows=1 loops=1)
> Total runtime: 1.244 ms

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index d8aa35d..86abdf6 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -1063,15 +1063,6 @@ grouping_planner(PlannerInfo *root, double tuple_fraction)        List       *set_sortclauses;
    /*
 
-         * If there's a top-level ORDER BY, assume we have to fetch all the
-         * tuples.    This might be too simplistic given all the hackery below
-         * to possibly avoid the sort; but the odds of accurate estimates here
-         * are pretty low anyway.
-         */
-        if (parse->sortClause)
-            tuple_fraction = 0.0;
-
-        /*         * Construct the plan for set operations.  The result will not need         * any work except
perhapsa top-level sort and/or LIMIT.  Note that         * any special work for recursive unions is the responsibility
of
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index b78d727..efb50dc 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -778,9 +778,26 @@ set_plan_refs(PlannerInfo *root, Plan *plan, int rtoffset)                Assert(splan->plan.qual
==NIL);                foreach(l, splan->appendplans)                {
 
-                    lfirst(l) = set_plan_refs(root,
-                                              (Plan *) lfirst(l),
-                                              rtoffset);
+                    Append *newp =
+                        set_plan_refs(root,
+                                      (Plan *) lfirst(l),
+                                      rtoffset);
+                    /*
+                     * UNION on inherited tables may create directly nested
+                     * Appends in plan tree. This structure can be flatten by
+                     * taking grandchildren into parent.
+                     */
+                    if (IsA(newp, Append) && 
+                        list_length(newp->appendplans) > 0)
+                    {
+                        ListCell *plc = list_head(newp->appendplans);
+                        lfirst(l) = lfirst(plc);
+                        for_each_cell(plc, lnext(plc))
+                            l = lappend_cell(splan->appendplans,
+                                             l, lfirst(plc));
+                    }
+                    else
+                        lfirst(l) = newp;                }            }            break;
diff --git a/src/backend/optimizer/prep/prepunion.c b/src/backend/optimizer/prep/prepunion.c
index e249628..b814a72 100644
--- a/src/backend/optimizer/prep/prepunion.c
+++ b/src/backend/optimizer/prep/prepunion.c
@@ -29,6 +29,7 @@#include "postgres.h"#include <limits.h>
+#include <math.h>#include "access/heapam.h"#include "access/htup_details.h"
@@ -60,6 +61,7 @@ typedef structstatic Plan *recurse_set_operations(Node *setOp, PlannerInfo *root,
 double tuple_fraction,                       List *colTypes, List *colCollations,
 
+                       List *groupClauses,                       bool junkOK,                       int flag, List
*refnames_tlist,                      List **sortClauses, double *pNumGroups);
 
@@ -78,7 +80,8 @@ static Plan *generate_nonunion_plan(SetOperationStmt *op, PlannerInfo *root,static List
*recurse_union_children(Node*setOp, PlannerInfo *root,                       double tuple_fraction,
 SetOperationStmt *top_union,
 
-                       List *refnames_tlist);
+                       List *refnames_tlist,
+                       List **child_sortclause);static Plan *make_union_unique(SetOperationStmt *op, Plan *plan,
          PlannerInfo *root, double tuple_fraction,                  List **sortClauses);
 
@@ -97,7 +100,8 @@ static List *generate_append_tlist(List *colTypes, List *colCollations,                      bool
flag,                     List *input_plans,                      List *refnames_tlist);
 
-static List *generate_setop_grouplist(SetOperationStmt *op, List *targetlist);
+static List *generate_setop_grouplist(List *groupClauses, List *targetlist,
+                                      List *sortClauses);static void expand_inherited_rtentry(PlannerInfo *root,
RangeTblEntry*rte,                         Index rti);static void make_inh_translation_list(Relation oldrelation,
 
@@ -152,6 +156,15 @@ plan_set_operations(PlannerInfo *root, double tuple_fraction,    Assert(parse->distinctClause ==
NIL);   /*
 
+     * If there's a top-level ORDER BY, assume we have to fetch all the tuples
+     * except for UNION. This might be too simplistic given all the hackery
+     * below to possibly avoid the sort; but the odds of accurate estimates
+     * here are pretty low anyway.
+     */
+    if (parse->sortClause && topop->op != SETOP_UNION)
+        tuple_fraction = 0.0;
+
+    /*     * We'll need to build RelOptInfos for each of the leaf subqueries, which     * are RTE_SUBQUERY rangetable
entriesin this Query.  Prepare the index     * arrays for that.
 
@@ -186,18 +199,49 @@ plan_set_operations(PlannerInfo *root, double tuple_fraction,     */    return
recurse_set_operations((Node*) topop, root, tuple_fraction,                                  topop->colTypes,
topop->colCollations,
+                                  topop->groupClauses,                                  true, -1,
           leftmostQuery->targetList,                                  sortClauses, NULL);}/*
 
+ * save_plannerglobal
+ *
+ * save planner global to allow multiple calls of subquery_planner at the same
+ * global status. This is done apartly from copyObject so as to do medium
+ * shallow copying.
+ */
+static PlannerGlobal *
+save_plannerglobal(const PlannerGlobal *in)
+{
+    PlannerGlobal *save = makeNode(PlannerGlobal);
+
+    save->boundParams        = in->boundParams;
+    save->subplans            = list_copy(in->subplans);
+    save->subroots            = list_copy(in->subroots);
+    save->rewindPlanIDs        = bms_copy(in->rewindPlanIDs);
+    save->finalrtable        = list_copy(in->finalrtable);
+    save->finalrowmarks        = list_copy(in->finalrowmarks); 
+    save->resultRelations    = list_copy(in->resultRelations);
+    save->relationOids        = list_copy(in->relationOids);
+    save->invalItems        = list_copy(in->invalItems);
+    save->nParamExec        = in->nParamExec;
+    save->lastPHId            = in->lastPHId;
+    save->lastRowMarkId        = in->lastRowMarkId;
+    save->transientPlan        = in->transientPlan;
+
+    return save;
+}
+
+/* * recurse_set_operations *      Recursively handle one step in a tree of set operations * * tuple_fraction:
fractionof tuples we expect to retrieve from node * colTypes: OID list of set-op's result column datatypes *
colCollations:OID list of set-op's result column collations
 
+ * groupClauses: parent setop's grouping clause. * junkOK: if true, child resjunk columns may be left in the result *
flag:if >= 0, add a resjunk output column indicating value of flag * refnames_tlist: targetlist to take column names
from
@@ -215,6 +259,7 @@ static Plan *recurse_set_operations(Node *setOp, PlannerInfo *root,                       double
tuple_fraction,                      List *colTypes, List *colCollations,
 
+                       List *groupClauses,                       bool junkOK,                       int flag, List
*refnames_tlist,                      List **sortClauses, double *pNumGroups)
 
@@ -228,9 +273,12 @@ recurse_set_operations(Node *setOp, PlannerInfo *root,        PlannerInfo *subroot;        Plan
  *subplan,                   *plan;
 
+        PlannerGlobal *glob_org;        Assert(subquery != NULL);
+        *sortClauses = NIL;
+        /*         * We need to build a RelOptInfo for each leaf subquery.  This isn't         * used for anything
here,but it carries the subroot data structures
 
@@ -243,12 +291,103 @@ recurse_set_operations(Node *setOp, PlannerInfo *root,        /*         * Generate plan for
primitivesubquery
 
+         *
+         * subquery_planner modifies planner globals. Save current planner
+         * global in order to allow next calling subquery_planner.         */
+        glob_org = save_plannerglobal(root->glob);        subplan = subquery_planner(root->glob, subquery,
                     root,                                   false, tuple_fraction,
&subroot);
+        /*
+         * Consider whether ORDER BY or groupings of the parent set operation
+         * can be pushed down onto this subquery.
+         */
+        if(tuple_fraction >= 1 &&
+           !subquery->limitOffset && !subquery->limitCount &&
+           !subquery->distinctClause &&
+           (!subquery->sortClause ||
+            equal(subquery->sortClause, root->parse->sortClause)))
+        {
+            ListCell *s, *d;
+            bool    try_pushdown = true;
+
+            /*
+             * Index scan is expected to be applied in the subquery by this
+             * pushdown. So it is useless when target list contains non-Var
+             * member or types of exprs do not match because index scan won't
+             * be available. The reason why this check is here is to avoid
+             * unnecessary copyObject for the whole subquery which would be
+             * rather heavier.
+             */
+            forboth(s, root->parse->targetList, d, subquery->targetList)
+            {
+                TargetEntry *ste = (TargetEntry*) lfirst(s);
+                TargetEntry *dte = (TargetEntry*) lfirst(d);
+                Node *sexpr = (Node*)ste->expr;
+                Node *dexpr = (Node*)dte->expr;
+                if (!IsA(sexpr, Var) || !IsA(dexpr, Var) ||
+                    exprType(sexpr) != exprType(dexpr))
+                {
+                    try_pushdown = false;
+                    break;
+                }
+            }
+
+            if (try_pushdown)
+            {
+                Query  *pdquery;           /* 'pd' comes from 'pushed down'.  */
+                PlannerInfo *pdsubroot;
+                Plan   *pdplan;
+
+                pdquery = copyObject(subquery);
+
+                if (!pdquery->sortClause)
+                    pdquery->sortClause = root->parse->sortClause;
+
+                Assert(!pdquery->distinctClause);
+                pdquery->distinctClause =
+                    generate_setop_grouplist(groupClauses,
+                                             pdquery->targetList,
+                                             pdquery->sortClause);
+
+                if (tuple_fraction >= 1)
+                    pdquery->limitCount =
+                        (Node*)makeConst(INT8OID, -1, InvalidOid, sizeof(int64),
+                                         Int64GetDatum(tuple_fraction),
+                                         false, FLOAT8PASSBYVAL);
+
+                pdplan = subquery_planner(glob_org, pdquery,
+                                          root,
+                                          false, tuple_fraction,
+                                          &pdsubroot);
+
+                if (pdplan->total_cost < subplan->total_cost)
+                {
+                    subplan = pdplan;
+                    subroot = pdsubroot;
+                    /*
+                     * Glob cannot be moved because this is referred to from
+                     * many places by its address. So set the address of the
+                     * original glob to subroot, then copy new values there.
+                     */
+                    subroot->glob = root->glob;
+                    *root->glob = *glob_org;
+
+                    /*
+                     * This plan is sorted on sortClause if any, else sorted
+                     * on distinctClause.
+                     */
+                    if (pdquery->sortClause)
+                        *sortClauses = pdquery->sortClause;
+                    else
+                        *sortClauses = pdquery->distinctClause;
+                }
+            }
+        }
+        /* Save subroot and subplan in RelOptInfo for setrefs.c */        rel->subplan = subplan;        rel->subroot
=subroot;
 
@@ -290,12 +429,6 @@ recurse_set_operations(Node *setOp, PlannerInfo *root,                              rtr->rtindex,
                           subplan);
 
-        /*
-         * We don't bother to determine the subquery's output ordering since
-         * it won't be reflected in the set-op result anyhow.
-         */
-        *sortClauses = NIL;
-        return plan;    }    else if (IsA(setOp, SetOperationStmt))
@@ -379,12 +512,14 @@ generate_recursion_plan(SetOperationStmt *setOp, PlannerInfo *root,     */    lplan =
recurse_set_operations(setOp->larg,root, tuple_fraction,                                   setOp->colTypes,
setOp->colCollations,
+                                   setOp->groupClauses,                                   false, -1,
               refnames_tlist, sortClauses, NULL);    /* The right plan will want to look at the left one ... */
root->non_recursive_plan= lplan;    rplan = recurse_set_operations(setOp->rarg, root, tuple_fraction,
               setOp->colTypes, setOp->colCollations,
 
+                                   setOp->groupClauses,                                   false, -1,
               refnames_tlist, sortClauses, NULL);    root->non_recursive_plan = NULL;
 
@@ -409,7 +544,8 @@ generate_recursion_plan(SetOperationStmt *setOp, PlannerInfo *root,        double
dNumGroups;       /* Identify the grouping semantics */
 
-        groupList = generate_setop_grouplist(setOp, tlist);
+        groupList =
+            generate_setop_grouplist(setOp->groupClauses, tlist, NULL);        /* We only support hashing here */
 if (!grouping_is_hashable(groupList))
 
@@ -452,20 +588,7 @@ generate_union_plan(SetOperationStmt *op, PlannerInfo *root,    List       *planlist;    List
*tlist;    Plan       *plan;
 
-
-    /*
-     * If plain UNION, tell children to fetch all tuples.
-     *
-     * Note: in UNION ALL, we pass the top-level tuple_fraction unmodified to
-     * each arm of the UNION ALL.  One could make a case for reducing the
-     * tuple fraction for later arms (discounting by the expected size of the
-     * earlier arms' results) but it seems not worth the trouble. The normal
-     * case where tuple_fraction isn't already zero is a LIMIT at top level,
-     * and passing it down as-is is usually enough to get the desired result
-     * of preferring fast-start plans.
-     */
-    if (!op->all)
-        tuple_fraction = 0.0;
+    List       *lsort, *rsort;    /*     * If any of my children are identical UNION nodes (same op, all-flag, and
@@ -475,34 +598,99 @@ generate_union_plan(SetOperationStmt *op, PlannerInfo *root,     */    planlist =
list_concat(recurse_union_children(op->larg,root,                                                  tuple_fraction,
 
-                                                  op, refnames_tlist),
+                                                  op, refnames_tlist,
+                                                  &lsort),                           recurse_union_children(op->rarg,
root,                                                 tuple_fraction,
 
-                                                  op, refnames_tlist));
+                                                  op, refnames_tlist,
+                                                  &rsort));    /*
-     * Generate tlist for Append plan node.
+     * Generate tlist for Append/MergeAppend plan node.     *
-     * The tlist for an Append plan isn't important as far as the Append is
-     * concerned, but we must make it look real anyway for the benefit of the
-     * next plan level up.
+     * The tlist for an Append/MergeAppend plan isn't important as far as the
+     * they are concerned, but we must make it look real anyway for the
+     * benefit of the next plan level up.     */    tlist = generate_append_tlist(op->colTypes, op->colCollations,
false,                                 planlist, refnames_tlist);
 
-    /*
-     * Append the child results together.
-     */
-    plan = (Plan *) make_append(planlist, tlist);
-
-    /*
-     * For UNION ALL, we just need the Append plan.  For UNION, need to add
-     * node(s) to remove duplicates.
-     */
-    if (op->all)
-        *sortClauses = NIL;        /* result of UNION ALL is always unsorted */
+    if (lsort != NIL && equal(lsort, rsort))
+    {
+        /*
+         * Generate MergeAppend plan if both children are sorted on the same
+         * sort clause or groupClauses.
+         */
+        ListCell *lc, *slc;
+        int i = 0;
+        double total_size;
+        Plan *p;
+        List *distinctClause;
+
+        MergeAppend *node = makeNode(MergeAppend);
+        node->plan.targetlist = tlist;
+        node->plan.qual = NIL;
+        node->plan.lefttree = NULL;
+        node->plan.righttree = NULL;
+        node->mergeplans = planlist;
+        node->numCols = list_length(root->parse->targetList);
+        node->sortColIdx    = (AttrNumber*)palloc(sizeof(AttrNumber) * node->numCols);
+        node->sortOperators = (Oid*)palloc(sizeof(Oid) * node->numCols);
+        node->collations    = (Oid*)palloc(sizeof(Oid) * node->numCols);
+        node->nullsFirst    = (bool*)palloc(sizeof(bool) * node->numCols);
+
+        distinctClause = 
+            generate_setop_grouplist(op->groupClauses,
+                                     node->plan.targetlist,
+                                     root->parse->sortClause);
+        forboth (slc, distinctClause, lc, node->plan.targetlist)
+        {
+            TargetEntry *te = (TargetEntry*) lfirst(lc);
+            SortGroupClause *sgc = (SortGroupClause *) lfirst(slc);
+
+            Assert(sgc->tleSortGroupRef == te->ressortgroupref);
+            node->sortColIdx[i] = i + 1;
+            node->sortOperators[i] = sgc->sortop;
+            node->collations[i] = exprCollation((Node*)te->expr);
+            node->nullsFirst[i] = sgc->nulls_first;
+            i++;
+        }
+        lc = list_head(node->mergeplans);
+        p = (Plan*) lfirst(lc);
+        node->plan.startup_cost = p->startup_cost;
+        node->plan.total_cost   = p->total_cost;
+        node->plan.plan_rows    = p->plan_rows;
+        total_size             = p->plan_rows * p->plan_width;
+        for_each_cell(lc, lnext(lc))
+        {
+            p = (Plan*) lfirst(lc);
+            node->plan.total_cost += p->total_cost;
+            node->plan.plan_rows  += p->plan_rows;
+            total_size            += p->plan_rows * p->plan_width;
+        }
+        node->plan.plan_width = rint(total_size / node->plan.plan_rows);
+        *sortClauses = root->parse->sortClause;
+        plan = (Plan*)node;
+        if (!op->all)
+            plan = make_union_unique(op, plan, root, tuple_fraction,
+                                     &distinctClause);
+    }    else
-        plan = make_union_unique(op, plan, root, tuple_fraction, sortClauses);
+    {
+        /*
+         * Append the child results together.
+         */
+        plan = (Plan *) make_append(planlist, tlist);
+
+        /*
+         * For UNION ALL, we just need the Append plan.  For UNION, need to
+         * add node(s) to remove duplicates.
+         */
+        *sortClauses = NIL;
+
+        if (!op->all)
+            plan = make_union_unique(op, plan, root, tuple_fraction, sortClauses);
+    }    /*     * Estimate number of groups if caller wants it.  For now we just assume
@@ -544,12 +732,14 @@ generate_nonunion_plan(SetOperationStmt *op, PlannerInfo *root,    lplan =
recurse_set_operations(op->larg,root,                                   0.0 /* all tuples needed */ ,
               op->colTypes, op->colCollations,
 
+                                   op->groupClauses,                                   false, 0,
           refnames_tlist,                                   &child_sortclauses, &dLeftGroups);    rplan =
recurse_set_operations(op->rarg,root,                                   0.0 /* all tuples needed */ ,
               op->colTypes, op->colCollations,
 
+                                   op->groupClauses,                                   false, 1,
           refnames_tlist,                                   &child_sortclauses, &dRightGroups);
 
@@ -589,7 +779,7 @@ generate_nonunion_plan(SetOperationStmt *op, PlannerInfo *root,    plan = (Plan *)
make_append(planlist,tlist);    /* Identify the grouping semantics */
 
-    groupList = generate_setop_grouplist(op, tlist);
+    groupList = generate_setop_grouplist(op->groupClauses, tlist, NULL);    /* punt if nothing to group on (can this
happen?)*/    if (groupList == NIL)
 
@@ -678,9 +868,13 @@ static List *recurse_union_children(Node *setOp, PlannerInfo *root,                       double
tuple_fraction,                      SetOperationStmt *top_union,
 
-                       List *refnames_tlist)
+                       List *refnames_tlist,
+                       List **child_sortclauses){
-    List       *child_sortclauses;
+    List       *lplan, *rplan;
+    List       *lsort, *rsort;
+
+    *child_sortclauses = NIL;    if (IsA(setOp, SetOperationStmt))    {
@@ -691,14 +885,23 @@ recurse_union_children(Node *setOp, PlannerInfo *root,            equal(op->colTypes,
top_union->colTypes))       {            /* Same UNION, so fold children into parent's subplan list */
 
-            return list_concat(recurse_union_children(op->larg, root,
-                                                      tuple_fraction,
-                                                      top_union,
-                                                      refnames_tlist),
-                               recurse_union_children(op->rarg, root,
-                                                      tuple_fraction,
-                                                      top_union,
-                                                      refnames_tlist));
+            lplan = recurse_union_children(op->larg, root,
+                                           tuple_fraction,
+                                           top_union,
+                                           refnames_tlist,
+                                           &lsort);
+            rplan = recurse_union_children(op->rarg, root,
+                                           tuple_fraction,
+                                           top_union,
+                                           refnames_tlist,
+                                           &rsort);
+            /*
+             * Propagate whether all the descendents are sorted with same
+             *  sortClause.
+             */
+            if (lsort != NIL && equal(lsort, rsort))
+                *child_sortclauses = lsort;
+            return list_concat(lplan, rplan);        }    }
@@ -715,13 +918,16 @@ recurse_union_children(Node *setOp, PlannerInfo *root,
tuple_fraction,                                             top_union->colTypes,
    top_union->colCollations,
 
+                                             top_union->groupClauses,
false,-1,                                             refnames_tlist,
 
-                                             &child_sortclauses, NULL));
+                                             child_sortclauses, NULL));}/* * Add nodes to the given plan tree to
unique-ifythe result of a UNION.
 
+ *
+ * If the sortClause is given, we assume the plan is already sorted on it. */static Plan
*make_union_unique(SetOperationStmt*op, Plan *plan,
 
@@ -731,9 +937,11 @@ make_union_unique(SetOperationStmt *op, Plan *plan,    List       *groupList;    double
dNumGroups;   long        numGroups;
 
+    bool        sort_needed = true;    /* Identify the grouping semantics */
-    groupList = generate_setop_grouplist(op, plan->targetlist);
+    groupList = generate_setop_grouplist(op->groupClauses,
+                                         plan->targetlist, *sortClauses);    /* punt if nothing to group on (can this
happen?)*/    if (groupList == NIL)
 
@@ -742,6 +950,9 @@ make_union_unique(SetOperationStmt *op, Plan *plan,        return plan;    }
+    if (*sortClauses && equal(*sortClauses, groupList))
+        sort_needed = false;
+        /*     * XXX for the moment, take the number of distinct groups as equal to the     * total input size, ie,
theworst case.  This is too conservative, but we
 
@@ -756,7 +967,8 @@ make_union_unique(SetOperationStmt *op, Plan *plan,    numGroups = (long) Min(dNumGroups, (double)
LONG_MAX);   /* Decide whether to hash or sort */
 
-    if (choose_hashed_setop(root, groupList, plan,
+    if (sort_needed &&
+        choose_hashed_setop(root, groupList, plan,                            dNumGroups, dNumGroups, tuple_fraction,
                         "UNION"))    {
 
@@ -778,7 +990,9 @@ make_union_unique(SetOperationStmt *op, Plan *plan,    else    {        /* Sort and Unique */
-        plan = (Plan *) make_sort_from_sortclauses(root, groupList, plan);
+        if (sort_needed)
+            plan = (Plan *) make_sort_from_sortclauses(root, groupList, plan);
+                plan = (Plan *) make_unique(plan, groupList);        plan->plan_rows = dNumGroups;        /* We know
thesort order of the result */
 
@@ -1145,23 +1359,34 @@ generate_append_tlist(List *colTypes, List *colCollations, * the parser output representation
doesn'tinclude a tlist for each * setop.  So what we need to do here is copy that list and install * proper
sortgrouprefsinto it and into the targetlist.
 
+ *
+ * sortClause is consulted if provided to avoid re-sorting with different
+ * orderings on the same keys later. */static List *
-generate_setop_grouplist(SetOperationStmt *op, List *targetlist)
+generate_setop_grouplist(List  *groupClauses, List *targetlist, 
+                         List *sortClauses){
-    List       *grouplist = (List *) copyObject(op->groupClauses);
+    List       *grouplist = (List *) copyObject(groupClauses);    ListCell   *lg;
+    ListCell   *ls = NULL;    ListCell   *lt;    Index        refno = 1;    lg = list_head(grouplist);
+
+    if (sortClauses)
+        ls = list_head(sortClauses);
+    foreach(lt, targetlist)    {        TargetEntry *tle = (TargetEntry *) lfirst(lt);
-        SortGroupClause *sgc;
+        SortGroupClause *sgc, *sgc_sort = NULL;
-        /* tlist shouldn't have any sortgrouprefs yet */
-        Assert(tle->ressortgroupref == 0);
+        /*
+         * tlist shouldn't have any sortgrouprefs yet, or should be unchanged
+         */
+        Assert(tle->ressortgroupref == 0 || tle->ressortgroupref == refno);        if (tle->resjunk)
continue;           /* ignore resjunk columns */
 
@@ -1174,6 +1399,45 @@ generate_setop_grouplist(SetOperationStmt *op, List *targetlist)        /* we could use
assignSortGroupRefhere, but seems a bit silly */        sgc->tleSortGroupRef = tle->ressortgroupref = refno++;
 
+        
+        if (ls)
+        {
+            /*
+             * If sortClauses is provided, try to adjust the sorting order to
+             * get the chance for omitting sorting for grouping later.
+             */
+            sgc_sort = (SortGroupClause *) lfirst(ls);
+            ls = lnext(ls);
+            if (sgc->sortop != sgc_sort->sortop)
+            {
+                Oid reverse = InvalidOid;
+                Oid opfamily, opcintype;
+                int16 strategy;
+                
+                if (get_ordering_op_properties(sgc->sortop,
+                                &opfamily, &opcintype, &strategy))
+                {
+                    switch (strategy)
+                    {
+                        case BTLessStrategyNumber:
+                            strategy = BTGreaterStrategyNumber; break;
+                        case BTGreaterStrategyNumber:
+                            strategy = BTLessStrategyNumber; break;
+                    }
+
+                    reverse = get_opfamily_member(opfamily,
+                                                  opcintype,
+                                                  opcintype,
+                                                  strategy);
+                    if (sgc_sort->sortop == reverse)
+                    {
+                        sgc->sortop = reverse;
+                        sgc->nulls_first = sgc_sort->nulls_first;
+                    }
+                }
+            }
+        }
+                }    Assert(lg == NULL);    return grouplist;

pgsql-hackers by date:

Previous
From: Mitsumasa KONDO
Date:
Subject: Add accurate option to pgbench
Next
From: Kyotaro HORIGUCHI
Date:
Subject: Get more from indices.