Thread: UNION ALL on partitioned tables won't use indices.
Hello, Sorry that it's been a while.. 1. Observed symptom As you know, UNION ALL accompanied with ORDER BY uses indexes if available. > uniontest=# EXPLAIN SELECT * FROM c11 UNION ALL SELECT * FROM c12 ORDER BY a; > QUERY PLAN > --------------------------------------------------------------------------- > Merge Append (cost=0.59..10214.10 rows=200000 width=16) > Sort Key: c11.a > -> Index Scan using ... on c11 (cost=0.29..3857.04 rows=100000 width=16) > -> Index Scan using ... on c12 (cost=0.29..3857.04 rows=100000 width=16) So do simple queries on partitioned (inheritance) tables. > uniontest=# EXPLAIN SELECT * FROM p1 ORDER BY a; > QUERY PLAN > ------------------------------------------------------------------------------ > Merge Append (cost=0.73..11392.19 rows=200001 width=16) > Sort Key: p1.a > -> Index Scan using ... on p1 (cost=0.12..8.14 rows=1 width=44) > -> Index Scan using ... on c11 (cost=0.29..3857.04 rows=100000 width=16) > -> Index Scan using ... on c12 (cost=0.29..3857.04 rows=100000 width=16) Nevertheless, UNION ALL on partitioned tables doesn't. This is quite unfavourable behavior especially having LIMIT. >uniontest=# EXPLAIN ANALYZE SELECT * FROM p1 > UNION ALL SELECT * FROM p2 ORDER BY a LIMIT 10; > QUERY PLAN >------------------------------------------------------------------------ > Limit (actual time=182.732..182.735 rows=10 loops=1) > -> Sort (actual time=182.729..182.730 rows=10 loops=1) > Sort Key: p1.a > Sort Method: top-N heapsort Memory: 25kB > -> Append (actual time=0.029..108.109 rows=400000 loops=1) > -> Seq Scan on p1 (actual time=0.001..0.001 rows=0 loops=1) > -> Seq Scan on c11 (actual time=0.027..19.074 rows=100000 loops=1) > -> Seq Scan on c12 (actual time=0.014..16.653 rows=100000 loops=1) > -> Seq Scan on p2 (actual time=0.000..0.000 rows=0 loops=1) > -> Seq Scan on c21 (actual time=0.012..16.677 rows=100000 loops=1) > -> Seq Scan on c22 (actual time=0.012..16.794 rows=100000 loops=1) > Total runtime: 182.857 ms 2. The cause In grouping_planner, flattern_simple_union_all creates appendrelinfos for each subqueries then expand_inherited_tables furthur expands the parent tables in each subqueries. Finally, this sequence leaves following structure. Where rte[2] and [3] are subqueries abandoned on the way pulling up rte[4] and [5]. rte[1] Subquery "SELECT*1", inh = 1 +- appinfo[0] - rte[4] Relation "p1", inh = 1 | +- appinfo[2] - rte[6] Relation "p1", inh = 0 | +- appinfo[3] - rte[7] Relation "c11", inh = 0 | +- appinfo[4]- rte[8] Relation "c12", inh = 0 +- appinfo[1] - rte[5] Relation "p2", inh = 1 +- appinfo[5]- rte[9] Relation "p1", inh = 0 +- appinfo[6] - rte[10] Relation "c11", inh = 0 +- appinfo[7] - rte[11] Relation "c12", inh = 0 On the other hand, ec member finally has exprs only for varno = 1, 4 and 5, in other words, it lacks the members for the most descendant RTEs. This is because add_child_rel_equivalences() always inhibits add_eq_member from registering the child_rel's relid on EC. Conequently these grandchild relations does not find index pathkeys for themselves. 3. Measures I could thought up three approaches for the problem. One is to simplly modifying here to give child flag in the parameters of add_eq_member accordig to whether the child_rel is inh or not. This seems to work fine although leaves two levels of MergeAppends (PATCH-1). So the additional patch is attached to collapse these MergeAppends (PATCH-2). This gives the same plan as PATCH-3. > uniontest=# explain analyze select * from p1 union all > select * from p2 order by a limit 10; > QUERY PLAN > ------------------------------------------------------------------------ > Limit (actual time=0.208..0.224 rows=10 loops=1) > -> Merge Append (actual time=0.205..0.218 rows=10 loops=1) > Sort Key: p1.a > -> Merge Append (actual time=0.110..0.120 rows=10 loops=1) > Sort Key: p1.a > -> Index Scan .. on p1 (actual time=0.006..0.006 rows=0 loops=1) > -> Index Scan .. on c11 (actual time=0.054..0.060 rows=10 loops=1) > -> Index Scan .. on c12 (actual time=0.046..0.046 rows=1 loops=1) > -> Merge Append (actual time=0.093..0.093 rows=1 loops=1) > Sort Key: p2.a > -> Index Scan .. on p2 (actual time=0.002..0.002 rows=0 loops=1) > -> Index Scan .. on c21 (actual time=0.047..0.047 rows=1 loops=1) > -> Index Scan .. on c22 (actual time=0.043..0.043 rows=1 loops=1) > Total runtime: 0.448 ms The second is to collapse the appendrel structure shown above to have only single level inheritance. This also seems to work fine. This makes the plan with single level MergeAppend. (PATCH-3) > uniontest=# EXPLAIN ANALYZE SELECT * FROM p1 UNION ALL > SELECT * FROM p2 ORDER BY a LIMIT 10; > QUERY PLAN > ------------------------------------------------------------------- > Limit (actual time=0.095..0.107 rows=10 loops=1) > -> Merge Append (actual time=0.092..0.102 rows=10 loops=1) > Sort Key: p1.a > -> Index Scan ... on p1 (actual time=0.005..0.005 rows=0 loops=1) > -> Index Scan ... on c11 (actual time=0.023..0.030 rows=10 loops=1) > -> Index Scan ... on c12 (actual time=0.020..0.020 rows=1 lops=1) > -> Index Scan ... on p2 (actual time=0.001..0.001 rows=0 loops=1) > -> Index Scan ... on c21 (actual time=0.019..0.019 rows=1 loops=1) > -> Index Scan ... on c22 (actual time=0.019..0.019 rows=1 loops=1) > Total runtime: 0.241 ms The last one is most strait-forward in some sense, and conversely is most ad hoc measure. Modifying expand_inherited_rtentry() to pull up adding appendrels if needed, using the similar manner to PATCH-3. This is added as PATCH-4. what do you think about this? Four patches following are attached to this message. 1. union_inh_idx_typ1_v1_20131024.patch .. PATCH-1 described above. 2. union_inh_idx_typ1_add_v1_20131024.patch .. PATCH-2 described above. 3. union_inh_idx_typ2_v1_20131024.patch .. PATCH-3 described above. 4. union_inh_idx_typ3_v1_20131024.patch .. PATCH-4 described above. regards, -- Kyotaro Horiguchi NTT Open Source Software Center diff --git a/src/backend/optimizer/path/equivclass.c b/src/backend/optimizer/path/equivclass.c index 711b161..734ed47 100644 --- a/src/backend/optimizer/path/equivclass.c +++ b/src/backend/optimizer/path/equivclass.c @@ -1940,6 +1940,7 @@ add_child_rel_equivalences(PlannerInfo *root, Expr *child_expr; Relids new_relids; Relids new_nullable_relids; + bool has_children = false; child_expr = (Expr *) adjust_appendrel_attrs(root, @@ -1969,9 +1970,15 @@ add_child_rel_equivalences(PlannerInfo *root, child_rel->relids); } + /* + * Does this child_rel have children? If and only if so, tell + * add_eq_member to register new_relids to cur_ec. + */ + has_children = + root->simple_rte_array[child_rel->relid]->inh; (void) add_eq_member(cur_ec, child_expr, new_relids, new_nullable_relids, - true, cur_em->em_datatype); + !has_children, cur_em->em_datatype); } } } diff --git a/src/backend/optimizer/util/pathnode.c b/src/backend/optimizer/util/pathnode.c index 64b1705..0e3cf4b 100644 --- a/src/backend/optimizer/util/pathnode.c +++ b/src/backend/optimizer/util/pathnode.c @@ -944,7 +944,8 @@ create_merge_append_path(PlannerInfo *root, MergeAppendPath *pathnode = makeNode(MergeAppendPath); Cost input_startup_cost; Cost input_total_cost; - ListCell *l; + ListCell *l, *lm; + bool collapse_subpaths = false; pathnode->path.pathtype = T_MergeAppend; pathnode->path.parent = rel; @@ -953,6 +954,47 @@ create_merge_append_path(PlannerInfo *root, pathnode->path.pathkeys = pathkeys; pathnode->subpaths= subpaths; + + /* + * If subpaths containes MergeAppendPaths already ordered on the pathkeys + * of the creating node, they can be expanded onto this node. + */ + foreach (lm, subpaths) + { + Path *spath = (Path *) lfirst(lm); + + if (IsA(spath, MergeAppendPath) && + pathkeys_contained_in(pathkeys, spath->pathkeys)) + { + collapse_subpaths = true; + break; + } + } + + if (collapse_subpaths) + { + subpaths = NIL; + + foreach (lm, pathnode->subpaths) + { + MergeAppendPath *mpath = (MergeAppendPath *) lfirst(lm); + ListCell *lcm; + + if (IsA(mpath, MergeAppendPath) && + pathkeys_contained_in(pathkeys, mpath->path.pathkeys)) + { + foreach (lcm, mpath->subpaths) + { + Path *smpath = (Path*) lfirst (lcm); + subpaths = lappend(subpaths, smpath); + } + } + else + subpaths = lappend(subpaths, subpaths); + } + pathnode->subpaths = subpaths; + } + /* * Apply query-wide LIMIT if known and path is for sole base relation. * (Handling this at this low levelis a bit klugy.) diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c index d8aa35d..8167583 100644 --- a/src/backend/optimizer/plan/planner.c +++ b/src/backend/optimizer/plan/planner.c @@ -404,6 +404,15 @@ subquery_planner(PlannerGlobal *glob, Query *parse, expand_inherited_tables(root); /* + * Collapse multilevel inheritances into fewer levels. + * + * UNION ALL containing subselects on inherited tables leaves multilevel + * inheritance after calling expand_inherited_tables(). Fold them in + * order to make MergeAppend plan available for such queries. + */ + collapse_appendrels(root); + + /* * Set hasHavingQual to remember if HAVING clause is present. Needed * because preprocess_expression willreduce a constant-true condition to * an empty qual list ... but "HAVING TRUE" is not a semantic no-op. diff --git a/src/backend/optimizer/prep/prepunion.c b/src/backend/optimizer/prep/prepunion.c index e249628..c7933b5 100644 --- a/src/backend/optimizer/prep/prepunion.c +++ b/src/backend/optimizer/prep/prepunion.c @@ -57,6 +57,11 @@ typedef struct AppendRelInfo *appinfo;} adjust_appendrel_attrs_context; +typedef struct { + AppendRelInfo *child_appinfo; + Index target_rti; +} transvars_merge_context; +static Plan *recurse_set_operations(Node *setOp, PlannerInfo *root, double tuple_fraction, List *colTypes, List *colCollations, @@ -98,6 +103,8 @@ static List *generate_append_tlist(List *colTypes, List *colCollations, List *input_plans, List *refnames_tlist);static List *generate_setop_grouplist(SetOperationStmt *op, List*targetlist); +static Node *transvars_merge_mutator(Node *node, + transvars_merge_context *ctx);static void expand_inherited_rtentry(PlannerInfo *root,RangeTblEntry *rte, Index rti);static void make_inh_translation_list(Relation oldrelation, @@ -1211,6 +1218,131 @@ expand_inherited_tables(PlannerInfo *root)}/* + * collapse_appendrels + * Pulling up the descendents by recursively combining successive + * translations. + * + * Although the same thing could be done in adjust_appendrel_attrs(), + * doing it here all through at once is more efficient than individually + * (and maybe repetitively) done there. + */ +void +collapse_appendrels(PlannerInfo *root) +{ + Index last_parent_relid = 1; + AppendRelInfo *last_parent = NULL; + ListCell *lcchild; + ListCell *lcparent; + bool full_search = false; + + foreach(lcchild, root->append_rel_list) + { + AppendRelInfo *ch_appinfo = (AppendRelInfo *)lfirst(lcchild); + Index ch_parent_relid = ch_appinfo->parent_relid; + + if (last_parent_relid != ch_parent_relid) + { + /* + * Find the parent for the current child appinfo if parent relid + * is different from that of preveous child. + */ + do + { + /* + * For most cases, the relations are referred to as the parent + * in their apperarence order in rtable from + * append_rel_list. So start searching for the parent appinfos + * from just after the last parent. If the incremental search + * was failed, retry searching the entire list and exit on + * failure. + */ + if (!last_parent) + { + /* + * If this is the first time or the preveous search was + * failed, set up for full search. + */ + lcparent = list_head(root->append_rel_list); + last_parent_relid = 1; + full_search = true; + } + + last_parent = NULL; + for_each_cell(lcparent, lcparent) + { + /* + * Children's and parents' apprelinfos are bonded via + * rtable relations. So two apprelinfos are in + * parent-child relationship when the child_relid of the + * parent is equal to the parent_relid of the child. + */ + AppendRelInfo *papp = (AppendRelInfo*)lfirst(lcparent); + if (papp->child_relid == ch_parent_relid) + { + last_parent = papp; + last_parent_relid = ch_parent_relid; + + /* Search from the next cell next time. */ + lcparent = lnext(lcparent); + full_search = false; + break; /* Parent found */ + } + } + /* Retry only when incremental search was failed. */ + } while (!full_search && !last_parent); + } + + /* + * Replace child translated_vars so as to be a direct children of the + * topmost relation. + */ + if (last_parent) + { + transvars_merge_context ctx; + + ctx.child_appinfo = ch_appinfo; + ctx.target_rti = last_parent->child_relid; + + ch_appinfo->translated_vars = + (List*)expression_tree_mutator( + (Node*)last_parent->translated_vars, + transvars_merge_mutator, &ctx); + ch_appinfo->parent_relid = last_parent->parent_relid; + ch_appinfo->parent_reltype = last_parent->parent_reltype; + } + } +} + +/* + * Replace Var nodes with corresponding child nodes. + */ +static Node * +transvars_merge_mutator(Node *node, transvars_merge_context *ctx) +{ + if (node == NULL) + return NULL; + + if (IsA(node, Var)) + { + Var *oldv = (Var*)node; + + if (!oldv->varlevelsup && oldv->varno == ctx->target_rti) + { + if (oldv->varattno > + list_length(ctx->child_appinfo->translated_vars)) + elog(ERROR, + "attribute %d of relation \"%s\" does not exist", + oldv->varattno, + get_rel_name(ctx->child_appinfo->parent_reloid)); + return (Node*)copyObject(list_nth(ctx->child_appinfo->translated_vars, + oldv->varattno - 1)); + } + } + return expression_tree_mutator(node, + transvars_merge_mutator, (void*)ctx); +} + +/* * expand_inherited_rtentry * Check whether a rangetable entry represents an inheritance set. * If so, addentries for all the child tables to the query's diff --git a/src/include/optimizer/prep.h b/src/include/optimizer/prep.h index 0934e63..7a83938 100644 --- a/src/include/optimizer/prep.h +++ b/src/include/optimizer/prep.h @@ -49,6 +49,7 @@ extern Plan *plan_set_operations(PlannerInfo *root, double tuple_fraction, List **sortClauses);externvoid expand_inherited_tables(PlannerInfo *root); +extern void collapse_appendrels(PlannerInfo *root);extern Node *adjust_appendrel_attrs(PlannerInfo *root, Node *node, AppendRelInfo *appinfo); diff --git a/src/backend/optimizer/prep/prepunion.c b/src/backend/optimizer/prep/prepunion.c index e249628..3cba2a1 100644 --- a/src/backend/optimizer/prep/prepunion.c +++ b/src/backend/optimizer/prep/prepunion.c @@ -57,6 +57,11 @@ typedef struct AppendRelInfo *appinfo;} adjust_appendrel_attrs_context; +typedef struct { + AppendRelInfo *child_appinfo; + Index target_rti; +} transvars_merge_context; +static Plan *recurse_set_operations(Node *setOp, PlannerInfo *root, double tuple_fraction, List *colTypes, List *colCollations, @@ -98,6 +103,8 @@ static List *generate_append_tlist(List *colTypes, List *colCollations, List *input_plans, List *refnames_tlist);static List *generate_setop_grouplist(SetOperationStmt *op, List*targetlist); +static Node *transvars_merge_mutator(Node *node, + transvars_merge_context *ctx);static void expand_inherited_rtentry(PlannerInfo *root,RangeTblEntry *rte, Index rti);static void make_inh_translation_list(Relation oldrelation, @@ -1210,6 +1217,34 @@ expand_inherited_tables(PlannerInfo *root) }} +static Node * +transvars_merge_mutator(Node *node, transvars_merge_context *ctx) +{ + if (node == NULL) + return NULL; + + if (IsA(node, Var)) + { + Var *oldv = (Var*)node; + + if (!oldv->varlevelsup && oldv->varno == ctx->target_rti) + { + if (oldv->varattno > + list_length(ctx->child_appinfo->translated_vars)) + elog(ERROR, + "attribute %d of relation \"%s\" does not exist", + oldv->varattno, + get_rel_name(ctx->child_appinfo->parent_reloid)); + + return (Node*)copyObject( + list_nth(ctx->child_appinfo->translated_vars, + oldv->varattno - 1)); + } + } + return expression_tree_mutator(node, + transvars_merge_mutator, (void*)ctx); +} +/* * expand_inherited_rtentry * Check whether a rangetable entry represents an inheritance set. @@ -1237,6 +1272,7 @@ expand_inherited_rtentry(PlannerInfo *root, RangeTblEntry *rte, Index rti) List *inhOIDs; List *appinfos; ListCell *l; + AppendRelInfo *parent_appinfo = NULL; /* Does RT entry allow inheritance? */ if (!rte->inh) @@ -1301,6 +1337,21 @@ expand_inherited_rtentry(PlannerInfo *root, RangeTblEntry *rte, Index rti) oldrc->isParent= true; /* + * If parent relation is appearing in a subselect of UNION ALL, it has + * furthur parent appendrelinfo. Save it to pull up inheritance children + * later. + */ + foreach(l, root->append_rel_list) + { + AppendRelInfo *appinfo = (AppendRelInfo *)lfirst(l); + if(appinfo->child_relid == rti) + { + parent_appinfo = appinfo; + break; + } + } + + /* * Must open the parent relation to examine its tupdesc. We need not lock * it; we assume the rewriter alreadydid. */ @@ -1378,6 +1429,28 @@ expand_inherited_rtentry(PlannerInfo *root, RangeTblEntry *rte, Index rti) } /* + * Pull up this appinfo onto just above of the parent. The parent + * relation has its own parent when it appears as a subquery of UNION + * ALL. Pulling up these children gives a chance to consider + * MergeAppend on whole the UNION ALL tree. + */ + + if (parent_appinfo) + { + transvars_merge_context ctx; + + ctx.child_appinfo = appinfo; + ctx.target_rti = rti; + + appinfo->parent_relid = parent_appinfo->parent_relid; + appinfo->parent_reltype = parent_appinfo->parent_reltype; + appinfo->parent_reloid = parent_appinfo->parent_reloid; + appinfo->translated_vars = (List*)expression_tree_mutator( + parent_appinfo->translated_vars, + transvars_merge_mutator, &ctx); + } + + /* * Build a PlanRowMark if parent is marked FOR UPDATE/SHARE. */ if (oldrc)
On Thu, Oct 24, 2013 at 6:39 AM, Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote: > Hello, Sorry that it's been a while.. > > 1. Observed symptom > > As you know, UNION ALL accompanied with ORDER BY uses indexes if > available. > >> uniontest=# EXPLAIN SELECT * FROM c11 UNION ALL SELECT * FROM c12 ORDER BY a; >> QUERY PLAN >> --------------------------------------------------------------------------- >> Merge Append (cost=0.59..10214.10 rows=200000 width=16) >> Sort Key: c11.a >> -> Index Scan using ... on c11 (cost=0.29..3857.04 rows=100000 width=16) >> -> Index Scan using ... on c12 (cost=0.29..3857.04 rows=100000 width=16) > > So do simple queries on partitioned (inheritance) tables. > >> uniontest=# EXPLAIN SELECT * FROM p1 ORDER BY a; >> QUERY PLAN >> ------------------------------------------------------------------------------ >> Merge Append (cost=0.73..11392.19 rows=200001 width=16) >> Sort Key: p1.a >> -> Index Scan using ... on p1 (cost=0.12..8.14 rows=1 width=44) >> -> Index Scan using ... on c11 (cost=0.29..3857.04 rows=100000 width=16) >> -> Index Scan using ... on c12 (cost=0.29..3857.04 rows=100000 width=16) > > Nevertheless, UNION ALL on partitioned tables doesn't. This is > quite unfavourable behavior especially having LIMIT. > >>uniontest=# EXPLAIN ANALYZE SELECT * FROM p1 >> UNION ALL SELECT * FROM p2 ORDER BY a LIMIT 10; >> QUERY PLAN >>------------------------------------------------------------------------ >> Limit (actual time=182.732..182.735 rows=10 loops=1) >> -> Sort (actual time=182.729..182.730 rows=10 loops=1) >> Sort Key: p1.a >> Sort Method: top-N heapsort Memory: 25kB >> -> Append (actual time=0.029..108.109 rows=400000 loops=1) >> -> Seq Scan on p1 (actual time=0.001..0.001 rows=0 loops=1) >> -> Seq Scan on c11 (actual time=0.027..19.074 rows=100000 loops=1) >> -> Seq Scan on c12 (actual time=0.014..16.653 rows=100000 loops=1) >> -> Seq Scan on p2 (actual time=0.000..0.000 rows=0 loops=1) >> -> Seq Scan on c21 (actual time=0.012..16.677 rows=100000 loops=1) >> -> Seq Scan on c22 (actual time=0.012..16.794 rows=100000 loops=1) >> Total runtime: 182.857 ms > > > 2. The cause > > In grouping_planner, flattern_simple_union_all creates > appendrelinfos for each subqueries then expand_inherited_tables > furthur expands the parent tables in each subqueries. Finally, > this sequence leaves following structure. Where rte[2] and [3] > are subqueries abandoned on the way pulling up rte[4] and [5]. > > rte[1] Subquery "SELECT*1", inh = 1 > +- appinfo[0] - rte[4] Relation "p1", inh = 1 > | +- appinfo[2] - rte[6] Relation "p1", inh = 0 > | +- appinfo[3] - rte[7] Relation "c11", inh = 0 > | +- appinfo[4] - rte[8] Relation "c12", inh = 0 > +- appinfo[1] - rte[5] Relation "p2", inh = 1 > +- appinfo[5] - rte[9] Relation "p1", inh = 0 > +- appinfo[6] - rte[10] Relation "c11", inh = 0 > +- appinfo[7] - rte[11] Relation "c12", inh = 0 > > On the other hand, ec member finally has exprs only for varno = > 1, 4 and 5, in other words, it lacks the members for the most > descendant RTEs. This is because add_child_rel_equivalences() > always inhibits add_eq_member from registering the child_rel's > relid on EC. Conequently these grandchild relations does not find > index pathkeys for themselves. > > > 3. Measures > > I could thought up three approaches for the problem. > > One is to simplly modifying here to give child flag in the > parameters of add_eq_member accordig to whether the child_rel is > inh or not. This seems to work fine although leaves two levels of > MergeAppends (PATCH-1). So the additional patch is attached to > collapse these MergeAppends (PATCH-2). This gives the same plan > as PATCH-3. > >> uniontest=# explain analyze select * from p1 union all >> select * from p2 order by a limit 10; >> QUERY PLAN >> ------------------------------------------------------------------------ >> Limit (actual time=0.208..0.224 rows=10 loops=1) >> -> Merge Append (actual time=0.205..0.218 rows=10 loops=1) >> Sort Key: p1.a >> -> Merge Append (actual time=0.110..0.120 rows=10 loops=1) >> Sort Key: p1.a >> -> Index Scan .. on p1 (actual time=0.006..0.006 rows=0 loops=1) >> -> Index Scan .. on c11 (actual time=0.054..0.060 rows=10 loops=1) >> -> Index Scan .. on c12 (actual time=0.046..0.046 rows=1 loops=1) >> -> Merge Append (actual time=0.093..0.093 rows=1 loops=1) >> Sort Key: p2.a >> -> Index Scan .. on p2 (actual time=0.002..0.002 rows=0 loops=1) >> -> Index Scan .. on c21 (actual time=0.047..0.047 rows=1 loops=1) >> -> Index Scan .. on c22 (actual time=0.043..0.043 rows=1 loops=1) >> Total runtime: 0.448 ms > > > The second is to collapse the appendrel structure shown above to > have only single level inheritance. This also seems to work > fine. This makes the plan with single level > MergeAppend. (PATCH-3) > >> uniontest=# EXPLAIN ANALYZE SELECT * FROM p1 UNION ALL >> SELECT * FROM p2 ORDER BY a LIMIT 10; >> QUERY PLAN >> ------------------------------------------------------------------- >> Limit (actual time=0.095..0.107 rows=10 loops=1) >> -> Merge Append (actual time=0.092..0.102 rows=10 loops=1) >> Sort Key: p1.a >> -> Index Scan ... on p1 (actual time=0.005..0.005 rows=0 loops=1) >> -> Index Scan ... on c11 (actual time=0.023..0.030 rows=10 loops=1) >> -> Index Scan ... on c12 (actual time=0.020..0.020 rows=1 lops=1) >> -> Index Scan ... on p2 (actual time=0.001..0.001 rows=0 loops=1) >> -> Index Scan ... on c21 (actual time=0.019..0.019 rows=1 loops=1) >> -> Index Scan ... on c22 (actual time=0.019..0.019 rows=1 loops=1) >> Total runtime: 0.241 ms > > > > The last one is most strait-forward in some sense, and conversely > is most ad hoc measure. Modifying expand_inherited_rtentry() to > pull up adding appendrels if needed, using the similar manner to > PATCH-3. This is added as PATCH-4. > > > what do you think about this? > > > Four patches following are attached to this message. > > 1. union_inh_idx_typ1_v1_20131024.patch .. PATCH-1 described above. > 2. union_inh_idx_typ1_add_v1_20131024.patch .. PATCH-2 described above. > 3. union_inh_idx_typ2_v1_20131024.patch .. PATCH-3 described above. > 4. union_inh_idx_typ3_v1_20131024.patch .. PATCH-4 described above. Please add your patches to the currently-open CommitFest so that we don't lose track of them. https://commitfest.postgresql.org/action/commitfest_view/open I'm not sure which approach to this problem is best, but I agree that it is worth solving. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Hello, > Please add your patches to the currently-open CommitFest so that we > don't lose track of them. > > https://commitfest.postgresql.org/action/commitfest_view/open > > I'm not sure which approach to this problem is best, but I agree that > it is worth solving. Thank you, I've regsitered this on CF3. -- Kyotaro Horiguchi NTT Open Source Software Center
On 10/29/13, 11:05 PM, Kyotaro HORIGUCHI wrote: > Hello, > >> Please add your patches to the currently-open CommitFest so that we >> don't lose track of them. >> >> https://commitfest.postgresql.org/action/commitfest_view/open >> >> I'm not sure which approach to this problem is best, but I agree that >> it is worth solving. > > Thank you, I've regsitered this on CF3. Your patch doesn't apply anymore. Please rebase it.
Umm. I might be working on a bit unstable place.. > Your patch doesn't apply anymore. Please rebase it. Thank you. I rebased all patches to current master. regards, -- Kyotaro Horiguchi NTT Open Source Software Center diff --git a/src/backend/optimizer/path/equivclass.c b/src/backend/optimizer/path/equivclass.c index 711b161..734ed47 100644 --- a/src/backend/optimizer/path/equivclass.c +++ b/src/backend/optimizer/path/equivclass.c @@ -1940,6 +1940,7 @@ add_child_rel_equivalences(PlannerInfo *root, Expr *child_expr; Relids new_relids; Relids new_nullable_relids; + bool has_children = false; child_expr = (Expr *) adjust_appendrel_attrs(root, @@ -1969,9 +1970,15 @@ add_child_rel_equivalences(PlannerInfo *root, child_rel->relids); } + /* + * Does this child_rel have children? If and only if so, tell + * add_eq_member to register new_relids to cur_ec. + */ + has_children = + root->simple_rte_array[child_rel->relid]->inh; (void) add_eq_member(cur_ec, child_expr, new_relids, new_nullable_relids, - true, cur_em->em_datatype); + !has_children, cur_em->em_datatype); } } } diff --git a/src/backend/optimizer/util/pathnode.c b/src/backend/optimizer/util/pathnode.c index 64b1705..0e3cf4b 100644 --- a/src/backend/optimizer/util/pathnode.c +++ b/src/backend/optimizer/util/pathnode.c @@ -944,7 +944,8 @@ create_merge_append_path(PlannerInfo *root, MergeAppendPath *pathnode = makeNode(MergeAppendPath); Cost input_startup_cost; Cost input_total_cost; - ListCell *l; + ListCell *l, *lm; + bool collapse_subpaths = false; pathnode->path.pathtype = T_MergeAppend; pathnode->path.parent = rel; @@ -953,6 +954,47 @@ create_merge_append_path(PlannerInfo *root, pathnode->path.pathkeys = pathkeys; pathnode->subpaths= subpaths; + + /* + * If subpaths containes MergeAppendPaths already ordered on the pathkeys + * of the creating node, they can be expanded onto this node. + */ + foreach (lm, subpaths) + { + Path *spath = (Path *) lfirst(lm); + + if (IsA(spath, MergeAppendPath) && + pathkeys_contained_in(pathkeys, spath->pathkeys)) + { + collapse_subpaths = true; + break; + } + } + + if (collapse_subpaths) + { + subpaths = NIL; + + foreach (lm, pathnode->subpaths) + { + MergeAppendPath *mpath = (MergeAppendPath *) lfirst(lm); + ListCell *lcm; + + if (IsA(mpath, MergeAppendPath) && + pathkeys_contained_in(pathkeys, mpath->path.pathkeys)) + { + foreach (lcm, mpath->subpaths) + { + Path *smpath = (Path*) lfirst (lcm); + subpaths = lappend(subpaths, smpath); + } + } + else + subpaths = lappend(subpaths, subpaths); + } + pathnode->subpaths = subpaths; + } + /* * Apply query-wide LIMIT if known and path is for sole base relation. * (Handling this at this low levelis a bit klugy.) diff --git a/src/backend/optimizer/util/pathnode.c b/src/backend/optimizer/util/pathnode.c index 64b1705..0e3cf4b 100644 --- a/src/backend/optimizer/util/pathnode.c +++ b/src/backend/optimizer/util/pathnode.c @@ -944,7 +944,8 @@ create_merge_append_path(PlannerInfo *root, MergeAppendPath *pathnode = makeNode(MergeAppendPath); Cost input_startup_cost; Cost input_total_cost; - ListCell *l; + ListCell *l, *lm; + bool collapse_subpaths = false; pathnode->path.pathtype = T_MergeAppend; pathnode->path.parent = rel; @@ -953,6 +954,47 @@ create_merge_append_path(PlannerInfo *root, pathnode->path.pathkeys = pathkeys; pathnode->subpaths= subpaths; + + /* + * If subpaths containes MergeAppendPaths already ordered on the pathkeys + * of the creating node, they can be expanded onto this node. + */ + foreach (lm, subpaths) + { + Path *spath = (Path *) lfirst(lm); + + if (IsA(spath, MergeAppendPath) && + pathkeys_contained_in(pathkeys, spath->pathkeys)) + { + collapse_subpaths = true; + break; + } + } + + if (collapse_subpaths) + { + subpaths = NIL; + + foreach (lm, pathnode->subpaths) + { + MergeAppendPath *mpath = (MergeAppendPath *) lfirst(lm); + ListCell *lcm; + + if (IsA(mpath, MergeAppendPath) && + pathkeys_contained_in(pathkeys, mpath->path.pathkeys)) + { + foreach (lcm, mpath->subpaths) + { + Path *smpath = (Path*) lfirst (lcm); + subpaths = lappend(subpaths, smpath); + } + } + else + subpaths = lappend(subpaths, subpaths); + } + pathnode->subpaths = subpaths; + } + /* * Apply query-wide LIMIT if known and path is for sole base relation. * (Handling this at this low levelis a bit klugy.) diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c index d8aa35d..8167583 100644 --- a/src/backend/optimizer/plan/planner.c +++ b/src/backend/optimizer/plan/planner.c @@ -404,6 +404,15 @@ subquery_planner(PlannerGlobal *glob, Query *parse, expand_inherited_tables(root); /* + * Collapse multilevel inheritances into fewer levels. + * + * UNION ALL containing subselects on inherited tables leaves multilevel + * inheritance after calling expand_inherited_tables(). Fold them in + * order to make MergeAppend plan available for such queries. + */ + collapse_appendrels(root); + + /* * Set hasHavingQual to remember if HAVING clause is present. Needed * because preprocess_expression willreduce a constant-true condition to * an empty qual list ... but "HAVING TRUE" is not a semantic no-op. diff --git a/src/backend/optimizer/prep/prepunion.c b/src/backend/optimizer/prep/prepunion.c index e249628..c7933b5 100644 --- a/src/backend/optimizer/prep/prepunion.c +++ b/src/backend/optimizer/prep/prepunion.c @@ -57,6 +57,11 @@ typedef struct AppendRelInfo *appinfo;} adjust_appendrel_attrs_context; +typedef struct { + AppendRelInfo *child_appinfo; + Index target_rti; +} transvars_merge_context; +static Plan *recurse_set_operations(Node *setOp, PlannerInfo *root, double tuple_fraction, List *colTypes, List *colCollations, @@ -98,6 +103,8 @@ static List *generate_append_tlist(List *colTypes, List *colCollations, List *input_plans, List *refnames_tlist);static List *generate_setop_grouplist(SetOperationStmt *op, List*targetlist); +static Node *transvars_merge_mutator(Node *node, + transvars_merge_context *ctx);static void expand_inherited_rtentry(PlannerInfo *root,RangeTblEntry *rte, Index rti);static void make_inh_translation_list(Relation oldrelation, @@ -1211,6 +1218,131 @@ expand_inherited_tables(PlannerInfo *root)}/* + * collapse_appendrels + * Pulling up the descendents by recursively combining successive + * translations. + * + * Although the same thing could be done in adjust_appendrel_attrs(), + * doing it here all through at once is more efficient than individually + * (and maybe repetitively) done there. + */ +void +collapse_appendrels(PlannerInfo *root) +{ + Index last_parent_relid = 1; + AppendRelInfo *last_parent = NULL; + ListCell *lcchild; + ListCell *lcparent; + bool full_search = false; + + foreach(lcchild, root->append_rel_list) + { + AppendRelInfo *ch_appinfo = (AppendRelInfo *)lfirst(lcchild); + Index ch_parent_relid = ch_appinfo->parent_relid; + + if (last_parent_relid != ch_parent_relid) + { + /* + * Find the parent for the current child appinfo if parent relid + * is different from that of preveous child. + */ + do + { + /* + * For most cases, the relations are referred to as the parent + * in their apperarence order in rtable from + * append_rel_list. So start searching for the parent appinfos + * from just after the last parent. If the incremental search + * was failed, retry searching the entire list and exit on + * failure. + */ + if (!last_parent) + { + /* + * If this is the first time or the preveous search was + * failed, set up for full search. + */ + lcparent = list_head(root->append_rel_list); + last_parent_relid = 1; + full_search = true; + } + + last_parent = NULL; + for_each_cell(lcparent, lcparent) + { + /* + * Children's and parents' apprelinfos are bonded via + * rtable relations. So two apprelinfos are in + * parent-child relationship when the child_relid of the + * parent is equal to the parent_relid of the child. + */ + AppendRelInfo *papp = (AppendRelInfo*)lfirst(lcparent); + if (papp->child_relid == ch_parent_relid) + { + last_parent = papp; + last_parent_relid = ch_parent_relid; + + /* Search from the next cell next time. */ + lcparent = lnext(lcparent); + full_search = false; + break; /* Parent found */ + } + } + /* Retry only when incremental search was failed. */ + } while (!full_search && !last_parent); + } + + /* + * Replace child translated_vars so as to be a direct children of the + * topmost relation. + */ + if (last_parent) + { + transvars_merge_context ctx; + + ctx.child_appinfo = ch_appinfo; + ctx.target_rti = last_parent->child_relid; + + ch_appinfo->translated_vars = + (List*)expression_tree_mutator( + (Node*)last_parent->translated_vars, + transvars_merge_mutator, &ctx); + ch_appinfo->parent_relid = last_parent->parent_relid; + ch_appinfo->parent_reltype = last_parent->parent_reltype; + } + } +} + +/* + * Replace Var nodes with corresponding child nodes. + */ +static Node * +transvars_merge_mutator(Node *node, transvars_merge_context *ctx) +{ + if (node == NULL) + return NULL; + + if (IsA(node, Var)) + { + Var *oldv = (Var*)node; + + if (!oldv->varlevelsup && oldv->varno == ctx->target_rti) + { + if (oldv->varattno > + list_length(ctx->child_appinfo->translated_vars)) + elog(ERROR, + "attribute %d of relation \"%s\" does not exist", + oldv->varattno, + get_rel_name(ctx->child_appinfo->parent_reloid)); + return (Node*)copyObject(list_nth(ctx->child_appinfo->translated_vars, + oldv->varattno - 1)); + } + } + return expression_tree_mutator(node, + transvars_merge_mutator, (void*)ctx); +} + +/* * expand_inherited_rtentry * Check whether a rangetable entry represents an inheritance set. * If so, addentries for all the child tables to the query's diff --git a/src/include/optimizer/prep.h b/src/include/optimizer/prep.h index 0934e63..7a83938 100644 --- a/src/include/optimizer/prep.h +++ b/src/include/optimizer/prep.h @@ -49,6 +49,7 @@ extern Plan *plan_set_operations(PlannerInfo *root, double tuple_fraction, List **sortClauses);externvoid expand_inherited_tables(PlannerInfo *root); +extern void collapse_appendrels(PlannerInfo *root);extern Node *adjust_appendrel_attrs(PlannerInfo *root, Node *node, AppendRelInfo *appinfo); diff --git a/src/backend/optimizer/prep/prepunion.c b/src/backend/optimizer/prep/prepunion.c index e249628..3cba2a1 100644 --- a/src/backend/optimizer/prep/prepunion.c +++ b/src/backend/optimizer/prep/prepunion.c @@ -57,6 +57,11 @@ typedef struct AppendRelInfo *appinfo;} adjust_appendrel_attrs_context; +typedef struct { + AppendRelInfo *child_appinfo; + Index target_rti; +} transvars_merge_context; +static Plan *recurse_set_operations(Node *setOp, PlannerInfo *root, double tuple_fraction, List *colTypes, List *colCollations, @@ -98,6 +103,8 @@ static List *generate_append_tlist(List *colTypes, List *colCollations, List *input_plans, List *refnames_tlist);static List *generate_setop_grouplist(SetOperationStmt *op, List*targetlist); +static Node *transvars_merge_mutator(Node *node, + transvars_merge_context *ctx);static void expand_inherited_rtentry(PlannerInfo *root,RangeTblEntry *rte, Index rti);static void make_inh_translation_list(Relation oldrelation, @@ -1210,6 +1217,34 @@ expand_inherited_tables(PlannerInfo *root) }} +static Node * +transvars_merge_mutator(Node *node, transvars_merge_context *ctx) +{ + if (node == NULL) + return NULL; + + if (IsA(node, Var)) + { + Var *oldv = (Var*)node; + + if (!oldv->varlevelsup && oldv->varno == ctx->target_rti) + { + if (oldv->varattno > + list_length(ctx->child_appinfo->translated_vars)) + elog(ERROR, + "attribute %d of relation \"%s\" does not exist", + oldv->varattno, + get_rel_name(ctx->child_appinfo->parent_reloid)); + + return (Node*)copyObject( + list_nth(ctx->child_appinfo->translated_vars, + oldv->varattno - 1)); + } + } + return expression_tree_mutator(node, + transvars_merge_mutator, (void*)ctx); +} +/* * expand_inherited_rtentry * Check whether a rangetable entry represents an inheritance set. @@ -1237,6 +1272,7 @@ expand_inherited_rtentry(PlannerInfo *root, RangeTblEntry *rte, Index rti) List *inhOIDs; List *appinfos; ListCell *l; + AppendRelInfo *parent_appinfo = NULL; /* Does RT entry allow inheritance? */ if (!rte->inh) @@ -1301,6 +1337,21 @@ expand_inherited_rtentry(PlannerInfo *root, RangeTblEntry *rte, Index rti) oldrc->isParent= true; /* + * If parent relation is appearing in a subselect of UNION ALL, it has + * furthur parent appendrelinfo. Save it to pull up inheritance children + * later. + */ + foreach(l, root->append_rel_list) + { + AppendRelInfo *appinfo = (AppendRelInfo *)lfirst(l); + if(appinfo->child_relid == rti) + { + parent_appinfo = appinfo; + break; + } + } + + /* * Must open the parent relation to examine its tupdesc. We need not lock * it; we assume the rewriter alreadydid. */ @@ -1378,6 +1429,28 @@ expand_inherited_rtentry(PlannerInfo *root, RangeTblEntry *rte, Index rti) } /* + * Pull up this appinfo onto just above of the parent. The parent + * relation has its own parent when it appears as a subquery of UNION + * ALL. Pulling up these children gives a chance to consider + * MergeAppend on whole the UNION ALL tree. + */ + + if (parent_appinfo) + { + transvars_merge_context ctx; + + ctx.child_appinfo = appinfo; + ctx.target_rti = rti; + + appinfo->parent_relid = parent_appinfo->parent_relid; + appinfo->parent_reltype = parent_appinfo->parent_reltype; + appinfo->parent_reloid = parent_appinfo->parent_reloid; + appinfo->translated_vars = (List*)expression_tree_mutator( + parent_appinfo->translated_vars, + transvars_merge_mutator, &ctx); + } + + /* * Build a PlanRowMark if parent is marked FOR UPDATE/SHARE. */ if (oldrc)
On Thu, Oct 24, 2013 at 07:39:53PM +0900, Kyotaro HORIGUCHI wrote: > 1. Observed symptom > > As you know, UNION ALL accompanied with ORDER BY uses indexes if > available. > So do simple queries on partitioned (inheritance) tables. > Nevertheless, UNION ALL on partitioned tables doesn't. This is > quite unfavourable behavior especially having LIMIT. > > >uniontest=# EXPLAIN ANALYZE SELECT * FROM p1 > > UNION ALL SELECT * FROM p2 ORDER BY a LIMIT 10; > > QUERY PLAN > >------------------------------------------------------------------------ > > Limit (actual time=182.732..182.735 rows=10 loops=1) > > -> Sort (actual time=182.729..182.730 rows=10 loops=1) > > Sort Key: p1.a > > Sort Method: top-N heapsort Memory: 25kB > > -> Append (actual time=0.029..108.109 rows=400000 loops=1) > > -> Seq Scan on p1 (actual time=0.001..0.001 rows=0 loops=1) > > -> Seq Scan on c11 (actual time=0.027..19.074 rows=100000 loops=1) > > -> Seq Scan on c12 (actual time=0.014..16.653 rows=100000 loops=1) > > -> Seq Scan on p2 (actual time=0.000..0.000 rows=0 loops=1) > > -> Seq Scan on c21 (actual time=0.012..16.677 rows=100000 loops=1) > > -> Seq Scan on c22 (actual time=0.012..16.794 rows=100000 loops=1) > > Total runtime: 182.857 ms That is indeed surprising. > 2. The cause > > In grouping_planner, flattern_simple_union_all creates > appendrelinfos for each subqueries then expand_inherited_tables > furthur expands the parent tables in each subqueries. Finally, > this sequence leaves following structure. Where rte[2] and [3] > are subqueries abandoned on the way pulling up rte[4] and [5]. > > rte[1] Subquery "SELECT*1", inh = 1 > +- appinfo[0] - rte[4] Relation "p1", inh = 1 > | +- appinfo[2] - rte[6] Relation "p1", inh = 0 > | +- appinfo[3] - rte[7] Relation "c11", inh = 0 > | +- appinfo[4] - rte[8] Relation "c12", inh = 0 > +- appinfo[1] - rte[5] Relation "p2", inh = 1 > +- appinfo[5] - rte[9] Relation "p1", inh = 0 > +- appinfo[6] - rte[10] Relation "c11", inh = 0 > +- appinfo[7] - rte[11] Relation "c12", inh = 0 > > On the other hand, ec member finally has exprs only for varno = > 1, 4 and 5, in other words, it lacks the members for the most > descendant RTEs. This is because add_child_rel_equivalences() > always inhibits add_eq_member from registering the child_rel's > relid on EC. Conequently these grandchild relations does not find > index pathkeys for themselves. I'm unclear on the key ideas behind distinguishing em_is_child members from ordinary EC members. src/backend/optimizer/README says "These members are *not* full-fledged members of the EquivalenceClass and do not affect the class's overall properties at all." Is that an optimization to avoid futile planner work, or is it necessary for correctness? If the latter, what sort of query might give wrong answers if an EquivalenceMember were incorrectly added as em_is_child = false? > 3. Measures > > I could thought up three approaches for the problem. Thanks for writing up multiple options. My tentative preference is for (3), then (2), then (1): > One is to simplly modifying here to give child flag in the > parameters of add_eq_member accordig to whether the child_rel is > inh or not. This seems to work fine although leaves two levels of > MergeAppends (PATCH-1). So the additional patch is attached to > collapse these MergeAppends (PATCH-2). This gives the same plan > as PATCH-3. Revising the append graph this late in planning (create_merge_append_path()) seems like the wrong location. Changing add_child_rel_equivalences() in this way adds further subtlety to the meaning of em_is_child, and I haven't comprehended the full implications. Neither of those objections are fatal, but let's focus on the other two approaches, which avoid those objections. > The second is to collapse the appendrel structure shown above to > have only single level inheritance. This also seems to work > fine. This makes the plan with single level > MergeAppend. (PATCH-3) > The last one is most strait-forward in some sense, and conversely > is most ad hoc measure. Modifying expand_inherited_rtentry() to > pull up adding appendrels if needed, using the similar manner to > PATCH-3. This is added as PATCH-4. The choice between (2) and (3) should depend on the extent of ways in which we can end up with a nested AppendRelInfo graph. If expand_inherited_rtentry() is the only place that can introduce nesting, modifying it to stop doing that makes sense. Otherwise, adding a generic collapse_appendrels() step might help more situations. expand_inherited_rtentry() is the only culprit I can identify, so I lean toward approach-3/PATCH-4. For a data point corroborating that hypothesis, "make check" issues no queries that trigger this warning: --- a/src/backend/optimizer/path/equivclass.c +++ b/src/backend/optimizer/path/equivclass.c @@ -1924,6 +1924,9 @@ add_child_rel_equivalences(PlannerInfo *root,{ ListCell *lc1; + if (root->simple_rte_array[child_rel->relid]->inh) + elog(WARNING, "unexpected nested append"); + foreach(lc1, root->eq_classes) { EquivalenceClass *cur_ec = (EquivalenceClass *) lfirst(lc1); On Wed, Nov 13, 2013 at 05:38:11PM +0900, Kyotaro HORIGUCHI wrote: [Approach (3): union_inh_idx_typ3_v2_20131113.patch] > @@ -1210,6 +1217,34 @@ expand_inherited_tables(PlannerInfo *root) > } > } > > +static Node * > +transvars_merge_mutator(Node *node, transvars_merge_context *ctx) > +{ > + if (node == NULL) > + return NULL; > + > + if (IsA(node, Var)) > + { > + Var *oldv = (Var*)node; > + > + if (!oldv->varlevelsup && oldv->varno == ctx->target_rti) > + { > + if (oldv->varattno > > + list_length(ctx->child_appinfo->translated_vars)) > + elog(ERROR, > + "attribute %d of relation \"%s\" does not exist", > + oldv->varattno, > + get_rel_name(ctx->child_appinfo->parent_reloid)); > + > + return (Node*)copyObject( > + list_nth(ctx->child_appinfo->translated_vars, > + oldv->varattno - 1)); > + } > + } > + return expression_tree_mutator(node, > + transvars_merge_mutator, (void*)ctx); > +} This has roughly the same purpose as adjust_appendrel_attrs_mutator(), but it propagates the change to far fewer node types. Why can it be so much simpler? The translated_vars of parent_appinfo may bear mostly-arbitrary expressions. > + appinfo->translated_vars = (List*)expression_tree_mutator( > + parent_appinfo->translated_vars, > + transvars_merge_mutator, &ctx); I get this warning: prepunion.c: In function `expand_inherited_rtentry': prepunion.c:1450: warning: passing argument 1 of `expression_tree_mutator' from incompatible pointer type I filled out your test case as follows to reproduce your results: create table p1 (a int, b int); create table c11 () inherits (p1); create table c12 () inherits (p1); create table p2 (a int, b int); create table c21 () inherits (p2); create table c22 () inherits (p2); alter table p1 add primary key (a); alter table c11 add primary key (a); alter table c12 add primary key (a); alter table p2 add primary key (a); alter table c21 add primary key (a); alter table c22 add primary key (a); EXPLAIN ANALYZE SELECT * FROM p1 UNION ALL SELECT * FROM p2 ORDER BY a LIMIT 10; However, adding a qual to one of the inheritance queries once again defeated MergeAppend with the patches for approaches (2) and (3). The approach-1 patch gives a segfault. Test queries: -- uses MergeAppend for p1 EXPLAIN ANALYZE SELECT * FROM p1 WHERE b <> 0 ORDER BY a; -- no MergeAppend for p1 EXPLAIN ANALYZE SELECT * FROM p1 WHERE b <> 0 UNION ALL SELECT * FROM p2 ORDER BY a; I did not study this further to identify the cause. However, I am suspicious that we're missing much of the work in set_append_rel_size(), notably the transfer of baserestrictinfo to the child. set_append_rel_size() identifies the need to do that work based on AppendRelId parent_relid connectivity, but we can now flatten that graph in an earlier step. Approaches (2) and (3) leave the inheritance parent with rte->inh == true despite no AppendRelInfo pointing to it as their parent. Until now, expand_inherited_rtentry() has been careful to clear rte->inh in such cases. My gut feeling is that we should retain that property; what do you think? Finally, the patch should add test cases. Thanks, nm -- Noah Misch EnterpriseDB http://www.enterprisedb.com
Noah Misch <noah@leadboat.com> writes: > I'm unclear on the key ideas behind distinguishing em_is_child members from > ordinary EC members. src/backend/optimizer/README says "These members are > *not* full-fledged members of the EquivalenceClass and do not affect the > class's overall properties at all." Is that an optimization to avoid futile > planner work, or is it necessary for correctness? If the latter, what sort of > query might give wrong answers if an EquivalenceMember were incorrectly added > as em_is_child = false? See commit dd4134ea, which added that text. IIRC, the key point is that among "real" EC members, there will never be duplicates: a Var "x.y" for instance cannot appear in two different ECs, because we'd have merged the two ECs into one instead. However, this property does *not* hold for child EC members. The problem is that two completely unrelated UNION ALL child subselects might have, say, constant "1" in their tlists. This should not lead to concluding that we can merge ECs that mention their UNION ALL parent variables. I've not looked at these patches yet, but if they're touching equivclass.c at all, I'd guess that it's wrong. I think what we want is for appendrel formation to take care of flattening nested appendrels. The core of the problem is that pull_up_simple_union_all handles that for nested UNION ALLs, and expand_inherited_rtentry sees to it by letting find_all_inheritors do the recursion, but the two cases don't know about each other. My gut feeling after two minutes' thought is that the best fix is for expand_inherited_rtentry to notice when the parent table is already an appendrel member, and enlarge that appendrel instead of making a new one. (This depends on the assumption that pull_up_simple_union_all always happens first, but that seems OK.) I'm not sure if that concept exactly corresponds to either PATCH-3 or PATCH-4, but that's the way I'd think about it. > However, adding a qual to one of the inheritance queries once again defeated > MergeAppend with the patches for approaches (2) and (3). That's an independent limitation, see is_safe_append_member: * Also, the child can't have any WHERE quals because there's no place to * put them in an appendrel. (This is a bitannoying...) It'd be nice to fix that, but it's not going to be easy, and it should be a separate patch IMO. It's pretty much unrelated to the question at hand here. > Approaches (2) and (3) leave the inheritance parent with rte->inh == true > despite no AppendRelInfo pointing to it as their parent. Until now, > expand_inherited_rtentry() has been careful to clear rte->inh in such cases. > My gut feeling is that we should retain that property; what do you think? Yes. IIRC, failing to clear rte->inh doesn't actually break anything, but it will lead to wasted work later in the planner. > Finally, the patch should add test cases. Agreed, at least one example showing that flattening does happen. regards, tom lane
On Sat, Nov 23, 2013 at 01:35:32PM -0500, Tom Lane wrote: > Noah Misch <noah@leadboat.com> writes: > > I'm unclear on the key ideas behind distinguishing em_is_child members from > > ordinary EC members. src/backend/optimizer/README says "These members are > > *not* full-fledged members of the EquivalenceClass and do not affect the > > class's overall properties at all." Is that an optimization to avoid futile > > planner work, or is it necessary for correctness? If the latter, what sort of > > query might give wrong answers if an EquivalenceMember were incorrectly added > > as em_is_child = false? > > See commit dd4134ea, which added that text. IIRC, the key point is that > among "real" EC members, there will never be duplicates: a Var "x.y" > for instance cannot appear in two different ECs, because we'd have merged > the two ECs into one instead. However, this property does *not* hold for > child EC members. The problem is that two completely unrelated UNION ALL > child subselects might have, say, constant "1" in their tlists. This > should not lead to concluding that we can merge ECs that mention their > UNION ALL parent variables. Thanks for the pointer; I found things clearer after reviewing the ECs from the test case queries from commits dd4134ea and 57664ed2. > I've not looked at these patches yet, but if they're touching equivclass.c > at all, I'd guess that it's wrong. Only PATCH-1 touched equivclass.c. > My gut feeling after two minutes' thought is that the best fix is for > expand_inherited_rtentry to notice when the parent table is already an > appendrel member, and enlarge that appendrel instead of making a new one. > (This depends on the assumption that pull_up_simple_union_all always > happens first, but that seems OK.) I'm not sure if that concept exactly > corresponds to either PATCH-3 or PATCH-4, but that's the way I'd think > about it. PATCH-4 does approximately that. I agree that's the right general direction. > > However, adding a qual to one of the inheritance queries once again defeated > > MergeAppend with the patches for approaches (2) and (3). > > That's an independent limitation, see is_safe_append_member: > > * Also, the child can't have any WHERE quals because there's no place to > * put them in an appendrel. (This is a bit annoying...) > > It'd be nice to fix that, but it's not going to be easy, and it should > be a separate patch IMO. It's pretty much unrelated to the question at > hand here. After further study, I agree. It would still be good to understand why that test case crashed PATCH-1, then ensure that the other patches don't have a similar lurking bug. An alternative to extending our ability to pull up UNION ALL subqueries, having perhaps broader applicability, would be to push down the possibly-useful sort order to subqueries we can't pull up. We'd sometimes have two levels of MergeAppend, but that could still handily beat the explicit-sort plan. In any case, it is indeed a separate topic. For the sake of the archives, you can get such plans today by manually adding the ORDER BY to the relevant UNION ALL branch: EXPLAIN (ANALYZE) (SELECT oid, * FROM pg_proc WHERE protransform = 0 ORDER BY oid) UNION ALL SELECT oid, * FROM pg_proc ORDERBY 1 LIMIT 5; QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------------------------------Limit (cost=0.57..1.31 rows=5 width=544) (actual time=0.102..0.155 rows=5 loops=1) -> Merge Append (cost=0.57..748.25 rows=5084width=544) (actual time=0.095..0.130 rows=5 loops=1) Sort Key: pg_proc_1.oid -> Index Scan usingpg_proc_oid_index on pg_proc pg_proc_1 (cost=0.28..332.83 rows=2538 width=544) (actual time=0.058..0.069 rows=3 loops=1) Filter: ((protransform)::oid = 0::oid) -> Index Scan using pg_proc_oid_index on pg_proc (cost=0.28..326.47rows=2546 width=544) (actual time=0.029..0.036 rows=3 loops=1) -- Noah Misch EnterpriseDB http://www.enterprisedb.com
Thank you. I'll soon come up with regress for the patch 3, 4. ===== > > See commit dd4134ea, which added that text. IIRC, the key point is that > > among "real" EC members, there will never be duplicates: a Var "x.y" > > for instance cannot appear in two different ECs, because we'd have merged > > the two ECs into one instead. However, this property does *not* hold for > > child EC members. The problem is that two completely unrelated UNION ALL > > child subselects might have, say, constant "1" in their tlists. This > > should not lead to concluding that we can merge ECs that mention their > > UNION ALL parent variables. > Thanks for the pointer; I found things clearer after reviewing the ECs from > the test case queries from commits dd4134ea and 57664ed2. Is it correct that ec-members are parted into two groups one of which is 'full-fledged' and never duplicate among different ECs so as to be useful for forming pathkeys (parent) and another is immatured and might be duplicate which could not form pathkeys. Thus the new-in-patch-1 third group which is not duplicate (because they should be always be Var "x.y" (for this case)) but immatured (child) currently can not be adequately classified? My PATCH-1 which make them in the third group forcibly classified as 'parent' instead of 'child' as before therefore makes 'broken' ec member? > > I've not looked at these patches yet, but if they're touching equivclass.c > > at all, I'd guess that it's wrong. > > Only PATCH-1 touched equivclass.c. Surely. > > My gut feeling after two minutes' thought is that the best fix is for > > expand_inherited_rtentry to notice when the parent table is already an > > appendrel member, and enlarge that appendrel instead of making a new one. > > (This depends on the assumption that pull_up_simple_union_all always > > happens first, but that seems OK.) I'm not sure if that concept exactly > > corresponds to either PATCH-3 or PATCH-4, but that's the way I'd think > > about it. > > PATCH-4 does approximately that. I agree that's the right > general direction. Ok, I'll provide regress for the patches 3 and 4. > An alternative to extending our ability to pull up UNION ALL subqueries, > having perhaps broader applicability, would be to push down the > possibly-useful sort order to subqueries we can't pull up. We'd sometimes > have two levels of MergeAppend, but that could still handily beat the > explicit-sort plan. In any case, it is indeed a separate topic. For the sake > of the archives, you can get such plans today by manually adding the ORDER BY > to the relevant UNION ALL branch: Yes, but this seems could be done in path-generation phase or earlier (that's mere a smattering). -- Kyotaro Horiguchi NTT Open Source Software Center
Hello, > I'll soon come up with regress for the patch 3, 4. New version patches added with regression tests are attached. I'll show you pulling-up-in-expand_inherited_rtentry version later. regards, -- Kyotaro Horiguchi NTT Open Source Software Center diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c index 6670794..5d1cf83 100644 --- a/src/backend/optimizer/plan/planner.c +++ b/src/backend/optimizer/plan/planner.c @@ -404,6 +404,15 @@ subquery_planner(PlannerGlobal *glob, Query *parse, expand_inherited_tables(root); /* + * Collapse multilevel inheritances into fewer levels. + * + * UNION ALL containing subselects on inherited tables leaves multilevel + * inheritance after calling expand_inherited_tables(). Fold them in + * order to make MergeAppend plan available for such queries. + */ + collapse_appendrels(root); + + /* * Set hasHavingQual to remember if HAVING clause is present. Needed * because preprocess_expression willreduce a constant-true condition to * an empty qual list ... but "HAVING TRUE" is not a semantic no-op. diff --git a/src/backend/optimizer/prep/prepunion.c b/src/backend/optimizer/prep/prepunion.c index e249628..c7933b5 100644 --- a/src/backend/optimizer/prep/prepunion.c +++ b/src/backend/optimizer/prep/prepunion.c @@ -57,6 +57,11 @@ typedef struct AppendRelInfo *appinfo;} adjust_appendrel_attrs_context; +typedef struct { + AppendRelInfo *child_appinfo; + Index target_rti; +} transvars_merge_context; +static Plan *recurse_set_operations(Node *setOp, PlannerInfo *root, double tuple_fraction, List *colTypes, List *colCollations, @@ -98,6 +103,8 @@ static List *generate_append_tlist(List *colTypes, List *colCollations, List *input_plans, List *refnames_tlist);static List *generate_setop_grouplist(SetOperationStmt *op, List*targetlist); +static Node *transvars_merge_mutator(Node *node, + transvars_merge_context *ctx);static void expand_inherited_rtentry(PlannerInfo *root,RangeTblEntry *rte, Index rti);static void make_inh_translation_list(Relation oldrelation, @@ -1211,6 +1218,131 @@ expand_inherited_tables(PlannerInfo *root)}/* + * collapse_appendrels + * Pulling up the descendents by recursively combining successive + * translations. + * + * Although the same thing could be done in adjust_appendrel_attrs(), + * doing it here all through at once is more efficient than individually + * (and maybe repetitively) done there. + */ +void +collapse_appendrels(PlannerInfo *root) +{ + Index last_parent_relid = 1; + AppendRelInfo *last_parent = NULL; + ListCell *lcchild; + ListCell *lcparent; + bool full_search = false; + + foreach(lcchild, root->append_rel_list) + { + AppendRelInfo *ch_appinfo = (AppendRelInfo *)lfirst(lcchild); + Index ch_parent_relid = ch_appinfo->parent_relid; + + if (last_parent_relid != ch_parent_relid) + { + /* + * Find the parent for the current child appinfo if parent relid + * is different from that of preveous child. + */ + do + { + /* + * For most cases, the relations are referred to as the parent + * in their apperarence order in rtable from + * append_rel_list. So start searching for the parent appinfos + * from just after the last parent. If the incremental search + * was failed, retry searching the entire list and exit on + * failure. + */ + if (!last_parent) + { + /* + * If this is the first time or the preveous search was + * failed, set up for full search. + */ + lcparent = list_head(root->append_rel_list); + last_parent_relid = 1; + full_search = true; + } + + last_parent = NULL; + for_each_cell(lcparent, lcparent) + { + /* + * Children's and parents' apprelinfos are bonded via + * rtable relations. So two apprelinfos are in + * parent-child relationship when the child_relid of the + * parent is equal to the parent_relid of the child. + */ + AppendRelInfo *papp = (AppendRelInfo*)lfirst(lcparent); + if (papp->child_relid == ch_parent_relid) + { + last_parent = papp; + last_parent_relid = ch_parent_relid; + + /* Search from the next cell next time. */ + lcparent = lnext(lcparent); + full_search = false; + break; /* Parent found */ + } + } + /* Retry only when incremental search was failed. */ + } while (!full_search && !last_parent); + } + + /* + * Replace child translated_vars so as to be a direct children of the + * topmost relation. + */ + if (last_parent) + { + transvars_merge_context ctx; + + ctx.child_appinfo = ch_appinfo; + ctx.target_rti = last_parent->child_relid; + + ch_appinfo->translated_vars = + (List*)expression_tree_mutator( + (Node*)last_parent->translated_vars, + transvars_merge_mutator, &ctx); + ch_appinfo->parent_relid = last_parent->parent_relid; + ch_appinfo->parent_reltype = last_parent->parent_reltype; + } + } +} + +/* + * Replace Var nodes with corresponding child nodes. + */ +static Node * +transvars_merge_mutator(Node *node, transvars_merge_context *ctx) +{ + if (node == NULL) + return NULL; + + if (IsA(node, Var)) + { + Var *oldv = (Var*)node; + + if (!oldv->varlevelsup && oldv->varno == ctx->target_rti) + { + if (oldv->varattno > + list_length(ctx->child_appinfo->translated_vars)) + elog(ERROR, + "attribute %d of relation \"%s\" does not exist", + oldv->varattno, + get_rel_name(ctx->child_appinfo->parent_reloid)); + return (Node*)copyObject(list_nth(ctx->child_appinfo->translated_vars, + oldv->varattno - 1)); + } + } + return expression_tree_mutator(node, + transvars_merge_mutator, (void*)ctx); +} + +/* * expand_inherited_rtentry * Check whether a rangetable entry represents an inheritance set. * If so, addentries for all the child tables to the query's diff --git a/src/include/optimizer/prep.h b/src/include/optimizer/prep.h index 0934e63..7a83938 100644 --- a/src/include/optimizer/prep.h +++ b/src/include/optimizer/prep.h @@ -49,6 +49,7 @@ extern Plan *plan_set_operations(PlannerInfo *root, double tuple_fraction, List **sortClauses);externvoid expand_inherited_tables(PlannerInfo *root); +extern void collapse_appendrels(PlannerInfo *root);extern Node *adjust_appendrel_attrs(PlannerInfo *root, Node *node, AppendRelInfo *appinfo); diff --git a/src/test/regress/expected/union.out b/src/test/regress/expected/union.out index ae690cf..8b9f9e2 100644 --- a/src/test/regress/expected/union.out +++ b/src/test/regress/expected/union.out @@ -501,9 +501,40 @@ explain (costs off) Index Cond: (ab = 'ab'::text)(6 rows) +-- +-- Test that ORDER BY for UNION ALL can be pushed down on inheritance +-- tables. +-- +CREATE TEMP TABLE t1c (like t1 including indexes) inherits (t1); +NOTICE: merging column "a" with inherited definition +NOTICE: merging column "b" with inherited definition +CREATE TEMP TABLE t2c (like t2 including indexes) inherits (t2); +NOTICE: merging column "ab" with inherited definition +INSERT INTO t1c VALUES ('v', 'w'), ('c', 'd'); +INSERT INTO t2c VALUES ('vw'), ('cd'); +set enable_seqscan = on; +set enable_indexonlyscan = off; +explain (costs off) + SELECT * FROM + (SELECT a || b AS ab FROM t1 + UNION ALL + SELECT * FROM t2) t + ORDER BY ab LIMIT 1; + QUERY PLAN +-------------------------------------------------- + Limit + -> Merge Append + Sort Key: ((t1.a || t1.b)) + -> Index Scan using t1_ab_idx on t1 + -> Index Scan using t1c_expr_idx on t1c + -> Index Scan using t2_pkey on t2 + -> Index Scan using t2c_pkey on t2c +(7 rows) +reset enable_seqscan;reset enable_indexscan;reset enable_bitmapscan; +reset enable_indexonlyscan;-- Test constraint exclusion of UNION ALL subqueriesexplain (costs off) SELECT * FROM diff --git a/src/test/regress/sql/union.sql b/src/test/regress/sql/union.sql index d567cf1..7f9a9b1 100644 --- a/src/test/regress/sql/union.sql +++ b/src/test/regress/sql/union.sql @@ -198,9 +198,29 @@ explain (costs off) SELECT * FROM t2) t WHERE ab = 'ab'; +-- +-- Test that ORDER BY for UNION ALL can be pushed down on inheritance +-- tables. +-- + +CREATE TEMP TABLE t1c (like t1 including indexes) inherits (t1); +CREATE TEMP TABLE t2c (like t2 including indexes) inherits (t2); +INSERT INTO t1c VALUES ('v', 'w'), ('c', 'd'); +INSERT INTO t2c VALUES ('vw'), ('cd'); +set enable_seqscan = on; +set enable_indexonlyscan = off; + +explain (costs off) + SELECT * FROM + (SELECT a || b AS ab FROM t1 + UNION ALL + SELECT * FROM t2) t + ORDER BY ab LIMIT 1; +reset enable_seqscan;reset enable_indexscan;reset enable_bitmapscan; +reset enable_indexonlyscan;-- Test constraint exclusion of UNION ALL subqueriesexplain (costs off) diff --git a/src/backend/optimizer/prep/prepunion.c b/src/backend/optimizer/prep/prepunion.c index e249628..3cba2a1 100644 --- a/src/backend/optimizer/prep/prepunion.c +++ b/src/backend/optimizer/prep/prepunion.c @@ -57,6 +57,11 @@ typedef struct AppendRelInfo *appinfo;} adjust_appendrel_attrs_context; +typedef struct { + AppendRelInfo *child_appinfo; + Index target_rti; +} transvars_merge_context; +static Plan *recurse_set_operations(Node *setOp, PlannerInfo *root, double tuple_fraction, List *colTypes, List *colCollations, @@ -98,6 +103,8 @@ static List *generate_append_tlist(List *colTypes, List *colCollations, List *input_plans, List *refnames_tlist);static List *generate_setop_grouplist(SetOperationStmt *op, List*targetlist); +static Node *transvars_merge_mutator(Node *node, + transvars_merge_context *ctx);static void expand_inherited_rtentry(PlannerInfo *root,RangeTblEntry *rte, Index rti);static void make_inh_translation_list(Relation oldrelation, @@ -1210,6 +1217,34 @@ expand_inherited_tables(PlannerInfo *root) }} +static Node * +transvars_merge_mutator(Node *node, transvars_merge_context *ctx) +{ + if (node == NULL) + return NULL; + + if (IsA(node, Var)) + { + Var *oldv = (Var*)node; + + if (!oldv->varlevelsup && oldv->varno == ctx->target_rti) + { + if (oldv->varattno > + list_length(ctx->child_appinfo->translated_vars)) + elog(ERROR, + "attribute %d of relation \"%s\" does not exist", + oldv->varattno, + get_rel_name(ctx->child_appinfo->parent_reloid)); + + return (Node*)copyObject( + list_nth(ctx->child_appinfo->translated_vars, + oldv->varattno - 1)); + } + } + return expression_tree_mutator(node, + transvars_merge_mutator, (void*)ctx); +} +/* * expand_inherited_rtentry * Check whether a rangetable entry represents an inheritance set. @@ -1237,6 +1272,7 @@ expand_inherited_rtentry(PlannerInfo *root, RangeTblEntry *rte, Index rti) List *inhOIDs; List *appinfos; ListCell *l; + AppendRelInfo *parent_appinfo = NULL; /* Does RT entry allow inheritance? */ if (!rte->inh) @@ -1301,6 +1337,21 @@ expand_inherited_rtentry(PlannerInfo *root, RangeTblEntry *rte, Index rti) oldrc->isParent= true; /* + * If parent relation is appearing in a subselect of UNION ALL, it has + * furthur parent appendrelinfo. Save it to pull up inheritance children + * later. + */ + foreach(l, root->append_rel_list) + { + AppendRelInfo *appinfo = (AppendRelInfo *)lfirst(l); + if(appinfo->child_relid == rti) + { + parent_appinfo = appinfo; + break; + } + } + + /* * Must open the parent relation to examine its tupdesc. We need not lock * it; we assume the rewriter alreadydid. */ @@ -1378,6 +1429,28 @@ expand_inherited_rtentry(PlannerInfo *root, RangeTblEntry *rte, Index rti) } /* + * Pull up this appinfo onto just above of the parent. The parent + * relation has its own parent when it appears as a subquery of UNION + * ALL. Pulling up these children gives a chance to consider + * MergeAppend on whole the UNION ALL tree. + */ + + if (parent_appinfo) + { + transvars_merge_context ctx; + + ctx.child_appinfo = appinfo; + ctx.target_rti = rti; + + appinfo->parent_relid = parent_appinfo->parent_relid; + appinfo->parent_reltype = parent_appinfo->parent_reltype; + appinfo->parent_reloid = parent_appinfo->parent_reloid; + appinfo->translated_vars = (List*)expression_tree_mutator( + parent_appinfo->translated_vars, + transvars_merge_mutator, &ctx); + } + + /* * Build a PlanRowMark if parent is marked FOR UPDATE/SHARE. */ if (oldrc) diff --git a/src/test/regress/expected/union.out b/src/test/regress/expected/union.out index ae690cf..8b9f9e2 100644 --- a/src/test/regress/expected/union.out +++ b/src/test/regress/expected/union.out @@ -501,9 +501,40 @@ explain (costs off) Index Cond: (ab = 'ab'::text)(6 rows) +-- +-- Test that ORDER BY for UNION ALL can be pushed down on inheritance +-- tables. +-- +CREATE TEMP TABLE t1c (like t1 including indexes) inherits (t1); +NOTICE: merging column "a" with inherited definition +NOTICE: merging column "b" with inherited definition +CREATE TEMP TABLE t2c (like t2 including indexes) inherits (t2); +NOTICE: merging column "ab" with inherited definition +INSERT INTO t1c VALUES ('v', 'w'), ('c', 'd'); +INSERT INTO t2c VALUES ('vw'), ('cd'); +set enable_seqscan = on; +set enable_indexonlyscan = off; +explain (costs off) + SELECT * FROM + (SELECT a || b AS ab FROM t1 + UNION ALL + SELECT * FROM t2) t + ORDER BY ab LIMIT 1; + QUERY PLAN +-------------------------------------------------- + Limit + -> Merge Append + Sort Key: ((t1.a || t1.b)) + -> Index Scan using t1_ab_idx on t1 + -> Index Scan using t1c_expr_idx on t1c + -> Index Scan using t2_pkey on t2 + -> Index Scan using t2c_pkey on t2c +(7 rows) +reset enable_seqscan;reset enable_indexscan;reset enable_bitmapscan; +reset enable_indexonlyscan;-- Test constraint exclusion of UNION ALL subqueriesexplain (costs off) SELECT * FROM diff --git a/src/test/regress/sql/union.sql b/src/test/regress/sql/union.sql index d567cf1..7f9a9b1 100644 --- a/src/test/regress/sql/union.sql +++ b/src/test/regress/sql/union.sql @@ -198,9 +198,29 @@ explain (costs off) SELECT * FROM t2) t WHERE ab = 'ab'; +-- +-- Test that ORDER BY for UNION ALL can be pushed down on inheritance +-- tables. +-- + +CREATE TEMP TABLE t1c (like t1 including indexes) inherits (t1); +CREATE TEMP TABLE t2c (like t2 including indexes) inherits (t2); +INSERT INTO t1c VALUES ('v', 'w'), ('c', 'd'); +INSERT INTO t2c VALUES ('vw'), ('cd'); +set enable_seqscan = on; +set enable_indexonlyscan = off; + +explain (costs off) + SELECT * FROM + (SELECT a || b AS ab FROM t1 + UNION ALL + SELECT * FROM t2) t + ORDER BY ab LIMIT 1; +reset enable_seqscan;reset enable_indexscan;reset enable_bitmapscan; +reset enable_indexonlyscan;-- Test constraint exclusion of UNION ALL subqueriesexplain (costs off)
Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> writes: > My PATCH-1 which make them in the third group forcibly > classified as 'parent' instead of 'child' as before therefore > makes 'broken' ec member? If you marked child entries as non-child, then yes, it's broken. The definition of a regular EC member is that it's always equal to any other regular member of that EC, in any row that's had all available WHERE filter constraints applied to it. This cannot be true for child members because those are only valid in a subset of rows. Consider SELECT * FROM t1, (SELECT y FROM t2 UNION ALL SELECT 0 AS y FROM t3) ss WHERE t1.x = ss.y We will put t1.x and ss.y into an EC, where ss.y is a variable of the appendrel representing the union. Then we'll put t2.y and constant 0 into the EC as child members. It would be incorrect to deduce "t1.x = 0" from this --- while that *is* a valid constraint for join rows arising from rows out of t3, it's incorrect for rows arising from rows out of t2, so it's not universally valid. Basically the point here is that deductions involving child EC members are one-way. You can use them to generate constraints to apply to a scan of the associated appendrel member, but not to generate constraints that would apply to other relations in the query. In a situation where you're considering "grandchild" elements, both those and the original children have to be marked as children, because all of them represent only subsets of rows of the appendrel. regards, tom lane
This is cont'd from previous CF3. You'll see the overview and the discussion since in the thread begins from there. The issue ramains as of current 9.4dev head. http://www.postgresql.org/message-id/20131024.193953.233464126.horiguchi.kyotaro@lab.ntt.co.jp The issue in brief is that UNION ALL on inheritance tables does not make use of indices in contrast to that on bare tables. This is because of appendrels with the depth more than 2 levels brought by inheritance tables. Some of UNION ALL operation - especially using ORDERBY and LIMIT - will be accelerated by this patch. Details in the message above. I proposed 3 types of solution but the one of them - tweaking Equivalence Class (Type 1)- was dismissed because of inappropriate manipulation on Equivalence Class. The rest do essentially the same thing - flattening appendrels - at different timings. In type 2, the process is implemented as a generic appendrel flattening function and applied after expand_inherited_tables() in subquery_planner. In type 3, the equivelant process is combined in expand_inherited_rtentry(). Type 2 loops once for whole appendrel list (in most cases) so it might be more effective than type 3 which searches the parent appendrel for each appended ones. Type 3 is more approprieate design assuming that appendrel tree deeper than 2 levels will be generated only by expand_inherited_tables(). The attached two patches are rebased to current 9.4dev HEAD and make check at the topmost directory and src/test/isolation are passed without error. One bug was found and fixed on the way. It was an assertion failure caused by probably unexpected type conversion introduced by collapse_appendrels() which leads implicit whole-row cast from some valid reltype to invalid reltype (0) into adjust_appendrel_attrs_mutator(). Attached files are the two versions of patches mentioned above. - unionall_inh_idx_typ2_v4_20140114.patch- unionall_inh_idx_typ3_v4_20140114.patch regards, -- Kyotaro Horiguchi NTT Open Source Software Center diff --git a/src/backend/optimizer/prep/prepunion.c b/src/backend/optimizer/prep/prepunion.c index e249628..582e8c3 100644 --- a/src/backend/optimizer/prep/prepunion.c +++ b/src/backend/optimizer/prep/prepunion.c @@ -57,6 +57,11 @@ typedef struct AppendRelInfo *appinfo;} adjust_appendrel_attrs_context; +typedef struct { + AppendRelInfo *child_appinfo; + Index target_rti; +} transvars_merge_context; +static Plan *recurse_set_operations(Node *setOp, PlannerInfo *root, double tuple_fraction, List *colTypes, List *colCollations, @@ -98,6 +103,8 @@ static List *generate_append_tlist(List *colTypes, List *colCollations, List *input_plans, List *refnames_tlist);static List *generate_setop_grouplist(SetOperationStmt *op, List*targetlist); +static Node *transvars_merge_mutator(Node *node, + transvars_merge_context *ctx);static void expand_inherited_rtentry(PlannerInfo *root,RangeTblEntry *rte, Index rti);static void make_inh_translation_list(Relation oldrelation, @@ -1210,6 +1217,34 @@ expand_inherited_tables(PlannerInfo *root) }} +static Node * +transvars_merge_mutator(Node *node, transvars_merge_context *ctx) +{ + if (node == NULL) + return NULL; + + if (IsA(node, Var)) + { + Var *oldv = (Var*)node; + + if (!oldv->varlevelsup && oldv->varno == ctx->target_rti) + { + if (oldv->varattno > + list_length(ctx->child_appinfo->translated_vars)) + elog(ERROR, + "attribute %d of relation \"%s\" does not exist", + oldv->varattno, + get_rel_name(ctx->child_appinfo->parent_reloid)); + + return (Node*)copyObject( + list_nth(ctx->child_appinfo->translated_vars, + oldv->varattno - 1)); + } + } + return expression_tree_mutator(node, + transvars_merge_mutator, (void*)ctx); +} +/* * expand_inherited_rtentry * Check whether a rangetable entry represents an inheritance set. @@ -1237,6 +1272,7 @@ expand_inherited_rtentry(PlannerInfo *root, RangeTblEntry *rte, Index rti) List *inhOIDs; List *appinfos; ListCell *l; + AppendRelInfo *parent_appinfo = NULL; /* Does RT entry allow inheritance? */ if (!rte->inh) @@ -1301,6 +1337,21 @@ expand_inherited_rtentry(PlannerInfo *root, RangeTblEntry *rte, Index rti) oldrc->isParent= true; /* + * If parent relation is appearing in a subselect of UNION ALL, it has + * furthur parent appendrelinfo. Save it to pull up inheritance children + * later. + */ + foreach(l, root->append_rel_list) + { + AppendRelInfo *appinfo = (AppendRelInfo *)lfirst(l); + if(appinfo->child_relid == rti) + { + parent_appinfo = appinfo; + break; + } + } + + /* * Must open the parent relation to examine its tupdesc. We need not lock * it; we assume the rewriter alreadydid. */ @@ -1378,6 +1429,28 @@ expand_inherited_rtentry(PlannerInfo *root, RangeTblEntry *rte, Index rti) } /* + * Pull up this appinfo onto just above of the parent. The parent + * relation has its own parent when it appears as a subquery of UNION + * ALL. Pulling up these children gives a chance to consider + * MergeAppend on whole the UNION ALL tree. + */ + + if (parent_appinfo) + { + transvars_merge_context ctx; + + ctx.child_appinfo = appinfo; + ctx.target_rti = rti; + + appinfo->parent_relid = parent_appinfo->parent_relid; + appinfo->parent_reltype = parent_appinfo->parent_reltype; + appinfo->parent_reloid = parent_appinfo->parent_reloid; + appinfo->translated_vars = (List*)expression_tree_mutator( + parent_appinfo->translated_vars, + transvars_merge_mutator, &ctx); + } + + /* * Build a PlanRowMark if parent is marked FOR UPDATE/SHARE. */ if (oldrc) @@ -1662,7 +1735,8 @@ adjust_appendrel_attrs_mutator(Node *node, * step to convert the tuple layout to theparent's rowtype. * Otherwise we have to generate a RowExpr. */ - if (OidIsValid(appinfo->child_reltype)) + if (OidIsValid(appinfo->child_reltype) && + OidIsValid(appinfo->parent_reltype)) { Assert(var->vartype == appinfo->parent_reltype); if (appinfo->parent_reltype != appinfo->child_reltype) diff --git a/src/test/regress/expected/union.out b/src/test/regress/expected/union.out index 6f9ee5e..d254ff3 100644 --- a/src/test/regress/expected/union.out +++ b/src/test/regress/expected/union.out @@ -502,9 +502,40 @@ explain (costs off) Index Cond: (ab = 'ab'::text)(7 rows) +-- +-- Test that ORDER BY for UNION ALL can be pushed down on inheritance +-- tables. +-- +CREATE TEMP TABLE t1c (like t1 including indexes) inherits (t1); +NOTICE: merging column "a" with inherited definition +NOTICE: merging column "b" with inherited definition +CREATE TEMP TABLE t2c (like t2 including indexes) inherits (t2); +NOTICE: merging column "ab" with inherited definition +INSERT INTO t1c VALUES ('v', 'w'), ('c', 'd'); +INSERT INTO t2c VALUES ('vw'), ('cd'); +set enable_seqscan = on; +set enable_indexonlyscan = off; +explain (costs off) + SELECT * FROM + (SELECT a || b AS ab FROM t1 + UNION ALL + SELECT * FROM t2) t + ORDER BY ab LIMIT 1; + QUERY PLAN +-------------------------------------------------- + Limit + -> Merge Append + Sort Key: ((t1.a || t1.b)) + -> Index Scan using t1_ab_idx on t1 + -> Index Scan using t1c_expr_idx on t1c + -> Index Scan using t2_pkey on t2 + -> Index Scan using t2c_pkey on t2c +(7 rows) +reset enable_seqscan;reset enable_indexscan;reset enable_bitmapscan; +reset enable_indexonlyscan;-- Test constraint exclusion of UNION ALL subqueriesexplain (costs off) SELECT * FROM diff --git a/src/test/regress/sql/union.sql b/src/test/regress/sql/union.sql index d567cf1..7f9a9b1 100644 --- a/src/test/regress/sql/union.sql +++ b/src/test/regress/sql/union.sql @@ -198,9 +198,29 @@ explain (costs off) SELECT * FROM t2) t WHERE ab = 'ab'; +-- +-- Test that ORDER BY for UNION ALL can be pushed down on inheritance +-- tables. +-- + +CREATE TEMP TABLE t1c (like t1 including indexes) inherits (t1); +CREATE TEMP TABLE t2c (like t2 including indexes) inherits (t2); +INSERT INTO t1c VALUES ('v', 'w'), ('c', 'd'); +INSERT INTO t2c VALUES ('vw'), ('cd'); +set enable_seqscan = on; +set enable_indexonlyscan = off; + +explain (costs off) + SELECT * FROM + (SELECT a || b AS ab FROM t1 + UNION ALL + SELECT * FROM t2) t + ORDER BY ab LIMIT 1; +reset enable_seqscan;reset enable_indexscan;reset enable_bitmapscan; +reset enable_indexonlyscan;-- Test constraint exclusion of UNION ALL subqueriesexplain (costs off) diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c index 1da4b2f..f7873a9 100644 --- a/src/backend/optimizer/plan/planner.c +++ b/src/backend/optimizer/plan/planner.c @@ -404,6 +404,15 @@ subquery_planner(PlannerGlobal *glob, Query *parse, expand_inherited_tables(root); /* + * Collapse multilevel inheritances into fewer levels. + * + * UNION ALL containing subselects on inherited tables leaves multilevel + * inheritance after calling expand_inherited_tables(). Fold them in + * order to make MergeAppend plan available for such queries. + */ + collapse_appendrels(root); + + /* * Set hasHavingQual to remember if HAVING clause is present. Needed * because preprocess_expression willreduce a constant-true condition to * an empty qual list ... but "HAVING TRUE" is not a semantic no-op. diff --git a/src/backend/optimizer/prep/prepunion.c b/src/backend/optimizer/prep/prepunion.c index e249628..165c010 100644 --- a/src/backend/optimizer/prep/prepunion.c +++ b/src/backend/optimizer/prep/prepunion.c @@ -57,6 +57,11 @@ typedef struct AppendRelInfo *appinfo;} adjust_appendrel_attrs_context; +typedef struct { + AppendRelInfo *child_appinfo; + Index target_rti; +} transvars_merge_context; +static Plan *recurse_set_operations(Node *setOp, PlannerInfo *root, double tuple_fraction, List *colTypes, List *colCollations, @@ -98,6 +103,8 @@ static List *generate_append_tlist(List *colTypes, List *colCollations, List *input_plans, List *refnames_tlist);static List *generate_setop_grouplist(SetOperationStmt *op, List*targetlist); +static Node *transvars_merge_mutator(Node *node, + transvars_merge_context *ctx);static void expand_inherited_rtentry(PlannerInfo *root,RangeTblEntry *rte, Index rti);static void make_inh_translation_list(Relation oldrelation, @@ -1211,6 +1218,131 @@ expand_inherited_tables(PlannerInfo *root)}/* + * collapse_appendrels + * Pulling up the descendents by recursively combining successive + * translations. + * + * Although the same thing could be done in adjust_appendrel_attrs(), + * doing it here all through at once is more efficient than individually + * (and maybe repetitively) done there. + */ +void +collapse_appendrels(PlannerInfo *root) +{ + Index last_parent_relid = 1; + AppendRelInfo *last_parent = NULL; + ListCell *lcchild; + ListCell *lcparent; + bool full_search = false; + + foreach(lcchild, root->append_rel_list) + { + AppendRelInfo *ch_appinfo = (AppendRelInfo *)lfirst(lcchild); + Index ch_parent_relid = ch_appinfo->parent_relid; + + if (last_parent_relid != ch_parent_relid) + { + /* + * Find the parent for the current child appinfo if parent relid + * is different from that of preveous child. + */ + do + { + /* + * For most cases, the relations are referred to as the parent + * in their apperarence order in rtable from + * append_rel_list. So start searching for the parent appinfos + * from just after the last parent. If the incremental search + * was failed, retry searching the entire list and exit on + * failure. + */ + if (!last_parent) + { + /* + * If this is the first time or the preveous search was + * failed, set up for full search. + */ + lcparent = list_head(root->append_rel_list); + last_parent_relid = 1; + full_search = true; + } + + last_parent = NULL; + for_each_cell(lcparent, lcparent) + { + /* + * Children's and parents' apprelinfos are bonded via + * rtable relations. So two apprelinfos are in + * parent-child relationship when the child_relid of the + * parent is equal to the parent_relid of the child. + */ + AppendRelInfo *papp = (AppendRelInfo*)lfirst(lcparent); + if (papp->child_relid == ch_parent_relid) + { + last_parent = papp; + last_parent_relid = ch_parent_relid; + + /* Search from the next cell next time. */ + lcparent = lnext(lcparent); + full_search = false; + break; /* Parent found */ + } + } + /* Retry only when incremental search was failed. */ + } while (!full_search && !last_parent); + } + + /* + * Replace child translated_vars so as to be a direct children of the + * topmost relation. + */ + if (last_parent) + { + transvars_merge_context ctx; + + ctx.child_appinfo = ch_appinfo; + ctx.target_rti = last_parent->child_relid; + + ch_appinfo->translated_vars = + (List*)expression_tree_mutator( + (Node*)last_parent->translated_vars, + transvars_merge_mutator, &ctx); + ch_appinfo->parent_relid = last_parent->parent_relid; + ch_appinfo->parent_reltype = last_parent->parent_reltype; + } + } +} + +/* + * Replace Var nodes with corresponding child nodes. + */ +static Node * +transvars_merge_mutator(Node *node, transvars_merge_context *ctx) +{ + if (node == NULL) + return NULL; + + if (IsA(node, Var)) + { + Var *oldv = (Var*)node; + + if (!oldv->varlevelsup && oldv->varno == ctx->target_rti) + { + if (oldv->varattno > + list_length(ctx->child_appinfo->translated_vars)) + elog(ERROR, + "attribute %d of relation \"%s\" does not exist", + oldv->varattno, + get_rel_name(ctx->child_appinfo->parent_reloid)); + return (Node*)copyObject(list_nth(ctx->child_appinfo->translated_vars, + oldv->varattno - 1)); + } + } + return expression_tree_mutator(node, + transvars_merge_mutator, (void*)ctx); +} + +/* * expand_inherited_rtentry * Check whether a rangetable entry represents an inheritance set. * If so, addentries for all the child tables to the query's @@ -1662,7 +1794,8 @@ adjust_appendrel_attrs_mutator(Node *node, * step to convert the tuple layout to theparent's rowtype. * Otherwise we have to generate a RowExpr. */ - if (OidIsValid(appinfo->child_reltype)) + if (OidIsValid(appinfo->child_reltype) && + OidIsValid(appinfo->parent_reltype)) { Assert(var->vartype == appinfo->parent_reltype); if (appinfo->parent_reltype != appinfo->child_reltype) diff --git a/src/include/optimizer/prep.h b/src/include/optimizer/prep.h index 0934e63..7a83938 100644 --- a/src/include/optimizer/prep.h +++ b/src/include/optimizer/prep.h @@ -49,6 +49,7 @@ extern Plan *plan_set_operations(PlannerInfo *root, double tuple_fraction, List **sortClauses);externvoid expand_inherited_tables(PlannerInfo *root); +extern void collapse_appendrels(PlannerInfo *root);extern Node *adjust_appendrel_attrs(PlannerInfo *root, Node *node, AppendRelInfo *appinfo); diff --git a/src/test/regress/expected/union.out b/src/test/regress/expected/union.out index 6f9ee5e..d254ff3 100644 --- a/src/test/regress/expected/union.out +++ b/src/test/regress/expected/union.out @@ -502,9 +502,40 @@ explain (costs off) Index Cond: (ab = 'ab'::text)(7 rows) +-- +-- Test that ORDER BY for UNION ALL can be pushed down on inheritance +-- tables. +-- +CREATE TEMP TABLE t1c (like t1 including indexes) inherits (t1); +NOTICE: merging column "a" with inherited definition +NOTICE: merging column "b" with inherited definition +CREATE TEMP TABLE t2c (like t2 including indexes) inherits (t2); +NOTICE: merging column "ab" with inherited definition +INSERT INTO t1c VALUES ('v', 'w'), ('c', 'd'); +INSERT INTO t2c VALUES ('vw'), ('cd'); +set enable_seqscan = on; +set enable_indexonlyscan = off; +explain (costs off) + SELECT * FROM + (SELECT a || b AS ab FROM t1 + UNION ALL + SELECT * FROM t2) t + ORDER BY ab LIMIT 1; + QUERY PLAN +-------------------------------------------------- + Limit + -> Merge Append + Sort Key: ((t1.a || t1.b)) + -> Index Scan using t1_ab_idx on t1 + -> Index Scan using t1c_expr_idx on t1c + -> Index Scan using t2_pkey on t2 + -> Index Scan using t2c_pkey on t2c +(7 rows) +reset enable_seqscan;reset enable_indexscan;reset enable_bitmapscan; +reset enable_indexonlyscan;-- Test constraint exclusion of UNION ALL subqueriesexplain (costs off) SELECT * FROM diff --git a/src/test/regress/sql/union.sql b/src/test/regress/sql/union.sql index d567cf1..7f9a9b1 100644 --- a/src/test/regress/sql/union.sql +++ b/src/test/regress/sql/union.sql @@ -198,9 +198,29 @@ explain (costs off) SELECT * FROM t2) t WHERE ab = 'ab'; +-- +-- Test that ORDER BY for UNION ALL can be pushed down on inheritance +-- tables. +-- + +CREATE TEMP TABLE t1c (like t1 including indexes) inherits (t1); +CREATE TEMP TABLE t2c (like t2 including indexes) inherits (t2); +INSERT INTO t1c VALUES ('v', 'w'), ('c', 'd'); +INSERT INTO t2c VALUES ('vw'), ('cd'); +set enable_seqscan = on; +set enable_indexonlyscan = off; + +explain (costs off) + SELECT * FROM + (SELECT a || b AS ab FROM t1 + UNION ALL + SELECT * FROM t2) t + ORDER BY ab LIMIT 1; +reset enable_seqscan;reset enable_indexscan;reset enable_bitmapscan; +reset enable_indexonlyscan;-- Test constraint exclusion of UNION ALL subqueriesexplain (costs off)
On Tue, Jan 14, 2014 at 06:04:47PM +0900, Kyotaro HORIGUCHI wrote: > I proposed 3 types of solution but the one of them - tweaking > Equivalence Class (Type 1)- was dismissed because of > inappropriate manipulation on Equivalence Class. The rest do > essentially the same thing - flattening appendrels - at different > timings. > > In type 2, the process is implemented as a generic appendrel > flattening function and applied after expand_inherited_tables() > in subquery_planner. > > In type 3, the equivelant process is combined in > expand_inherited_rtentry(). > > Type 2 loops once for whole appendrel list (in most cases) so it > might be more effective than type 3 which searches the parent > appendrel for each appended ones. Type 3 is more approprieate > design assuming that appendrel tree deeper than 2 levels will be > generated only by expand_inherited_tables(). Let's focus on type 3; Tom and I both found it most promising. > The attached two patches are rebased to current 9.4dev HEAD and > make check at the topmost directory and src/test/isolation are > passed without error. One bug was found and fixed on the way. It > was an assertion failure caused by probably unexpected type > conversion introduced by collapse_appendrels() which leads > implicit whole-row cast from some valid reltype to invalid > reltype (0) into adjust_appendrel_attrs_mutator(). What query demonstrated this bug in the previous type 2/3 patches? > - unionall_inh_idx_typ3_v4_20140114.patch You have not addressed my review comments from November: http://www.postgresql.org/message-id/20131123073913.GA1008138@tornado.leadboat.com Specifically, these: [transvar_merge_mutator()] has roughly the same purpose as adjust_appendrel_attrs_mutator(), but it propagates the change to far fewer node types. Why this case so much simpler? The parent translated_vars of parent_appinfo may bear mostly-arbitrary expressions. ... I get this warning: prepunion.c: In function `expand_inherited_rtentry': prepunion.c:1450: warning: passing argument 1 of `expression_tree_mutator' from incompatible pointer type ... Approaches (2) and (3) leave the inheritance parent with rte->inh == true despite no AppendRelInfo pointing to it as their parent. Until now, expand_inherited_rtentry() has been careful to clear rte->inh in such cases. -- Noah Misch EnterpriseDB http://www.enterprisedb.com
Hello, Thank you, and I' sorry to have kept you waiting. > Let's focus on type 3; Tom and I both found it most promising. Agreed. > > The attached two patches are rebased to current 9.4dev HEAD and > > make check at the topmost directory and src/test/isolation are > > passed without error. One bug was found and fixed on the way. It > > was an assertion failure caused by probably unexpected type > > conversion introduced by collapse_appendrels() which leads > > implicit whole-row cast from some valid reltype to invalid > > reltype (0) into adjust_appendrel_attrs_mutator(). > > What query demonstrated this bug in the previous type 2/3 patches? > > > - unionall_inh_idx_typ3_v4_20140114.patch > > You have not addressed my review comments from November: > http://www.postgresql.org/message-id/20131123073913.GA1008138@tornado.leadboat.com mmm. Sorry, I've overlooked most of it.. em_is_child is no more a issue, and the rest seem to cited below, thanks. > Specifically, these: > > [transvar_merge_mutator()] has roughly the same purpose as > adjust_appendrel_attrs_mutator(), but it propagates the change to far fewer > node types. Why this case so much simpler? The parent translated_vars of > parent_appinfo may bear mostly-arbitrary expressions. There are only two places where AppendRelInfo node is generated, expand_inherited_rtentry and pull_up_union_leaf_queries.(_copyAppendrelInfo is irrelevant to this discussion.) The core part generating translated_vars for them are make_inh_translation_list and make_setop_translation_list respectively. That's all. And they are essentially works on the same way, making a Var for every referred target entry of children like following. | makeVar(varno, | tle->resno, | exprType((Node *) tle->expr), | exprTypmod((Node *) tle->expr), | exprCollation((Node *) tle->expr), | 0); So all we should do to collapse nested appendrels is simplly connecting each RTEs directly to the root (most ancient?) RTE in the relationship, resolving Vars like above, as seen in patched expand_inherited_rtentry. # If translated_vars are generated always in the way shown above, # using tree walker might be no use.. This can be done apart from all other stuffs compensating translation skew done in adjust_appendrel_attrs. I believe they are in orthogonal relationship. > Approaches (2) and (3) leave the inheritance parent with rte->inh == true > despite no AppendRelInfo pointing to it as their parent. Until now, > expand_inherited_rtentry() has been careful to clear rte->inh in such cases. Thank you. I missed that point. rte->inh at first works as a trigger to try expand inheritance and create append_rel_list entries, and later works to say "dig me through appinfos". From that point of view, inh of the RTEs whose children took away should be 0. The two meanings of inh are now become different from each other so I suppose it'd better rename it, but I don't come up with good alternatives.. Anyway this is corrected in the patch attached and works as follows, BEFORE:rte[1] Subquery "SELECT*1", inh = 1 +- appinfo[0] - rte[4] Relation "p1", inh = 1 | +- appinfo[2]- rte[6] Relation "p1", inh = 0 | +- appinfo[3] - rte[7] Relation "c11", inh = 0 | +- appinfo[4] - rte[8] Relation "c12", inh = 0 +- appinfo[1] - rte[5] Relation "p2", inh = 1 +-appinfo[5] - rte[9] Relation "p1", inh = 0 +- appinfo[6] - rte[10] Relation "c11", inh = 0 +- appinfo[7] - rte[11] Relation "c12", inh = 0 COLLAPSED:rte[1] Subquery "SELECT*1", inh = 1 +- appinfo[0] - rte[4] Relation "p1", inh = 1 => 0 +- appinfo[2] - rte[6] Relation "p1", inh = 0 +- appinfo[3] - rte[7] Relation "c11", inh = 0 +- appinfo[4] - rte[8] Relation "c12",inh = 0 +- appinfo[1] - rte[5] Relation "p2", inh = 1 => 0 +- appinfo[5] - rte[9] Relation "p1", inh = 0 +- appinfo[6] - rte[10] Relation "c11", inh = 0 +- appinfo[7] - rte[11] Relation "c12", inh = 0 > I get this warning: > > prepunion.c: In function `expand_inherited_rtentry': > prepunion.c:1450: warning: passing argument 1 of `expression_tree_mutator' from incompatible pointer type Sorry, I forgot to put a casting to generic type. It is fixed in the attached version. regards, -- Kyotaro Horiguchi NTT Open Source Software Center diff --git a/src/backend/optimizer/prep/prepunion.c b/src/backend/optimizer/prep/prepunion.c index 52dcc72..6ef82d7 100644 --- a/src/backend/optimizer/prep/prepunion.c +++ b/src/backend/optimizer/prep/prepunion.c @@ -57,6 +57,11 @@ typedef struct AppendRelInfo *appinfo;} adjust_appendrel_attrs_context; +typedef struct { + AppendRelInfo *child_appinfo; + Index target_rti; +} transvars_merge_context; +static Plan *recurse_set_operations(Node *setOp, PlannerInfo *root, double tuple_fraction, List *colTypes, List *colCollations, @@ -98,6 +103,8 @@ static List *generate_append_tlist(List *colTypes, List *colCollations, List *input_plans, List *refnames_tlist);static List *generate_setop_grouplist(SetOperationStmt *op, List*targetlist); +static Node *transvars_merge_mutator(Node *node, + transvars_merge_context *ctx);static void expand_inherited_rtentry(PlannerInfo *root,RangeTblEntry *rte, Index rti);static void make_inh_translation_list(Relation oldrelation, @@ -1210,6 +1217,34 @@ expand_inherited_tables(PlannerInfo *root) }} +static Node * +transvars_merge_mutator(Node *node, transvars_merge_context *ctx) +{ + if (node == NULL) + return NULL; + + if (IsA(node, Var)) + { + Var *oldv = (Var*)node; + + if (!oldv->varlevelsup && oldv->varno == ctx->target_rti) + { + if (oldv->varattno > + list_length(ctx->child_appinfo->translated_vars)) + elog(ERROR, + "attribute %d of relation \"%s\" does not exist", + oldv->varattno, + get_rel_name(ctx->child_appinfo->parent_reloid)); + + return (Node*)copyObject( + list_nth(ctx->child_appinfo->translated_vars, + oldv->varattno - 1)); + } + } + return expression_tree_mutator(node, + transvars_merge_mutator, (void*)ctx); +} +/* * expand_inherited_rtentry * Check whether a rangetable entry represents an inheritance set. @@ -1237,6 +1272,7 @@ expand_inherited_rtentry(PlannerInfo *root, RangeTblEntry *rte, Index rti) List *inhOIDs; List *appinfos; ListCell *l; + AppendRelInfo *parent_appinfo = NULL; /* Does RT entry allow inheritance? */ if (!rte->inh) @@ -1301,6 +1337,21 @@ expand_inherited_rtentry(PlannerInfo *root, RangeTblEntry *rte, Index rti) oldrc->isParent= true; /* + * If parent relation is appearing in a subselect of UNION ALL, it has + * further parent appendrelinfo. Save it to pull up inheritance children + * later. + */ + foreach(l, root->append_rel_list) + { + AppendRelInfo *appinfo = (AppendRelInfo *)lfirst(l); + if(appinfo->child_relid == rti) + { + parent_appinfo = appinfo; + break; + } + } + + /* * Must open the parent relation to examine its tupdesc. We need not lock * it; we assume the rewriter alreadydid. */ @@ -1308,6 +1359,14 @@ expand_inherited_rtentry(PlannerInfo *root, RangeTblEntry *rte, Index rti) /* Scan the inheritanceset and expand it */ appinfos = NIL; + + /* + * This rte will lose all children being pulled up later if it has further + * parent + */ + if (parent_appinfo) + rte->inh = false; + foreach(l, inhOIDs) { Oid childOID = lfirst_oid(l); @@ -1378,6 +1437,28 @@ expand_inherited_rtentry(PlannerInfo *root, RangeTblEntry *rte, Index rti) } /* + * Pull up this appinfo onto just above of the parent. The parent + * relation has its own parent when it appears as a subquery of UNION + * ALL. Pulling up these children gives a chance to consider + * MergeAppend on whole the UNION ALL tree. + */ + + if (parent_appinfo) + { + transvars_merge_context ctx; + + ctx.child_appinfo = appinfo; + ctx.target_rti = rti; + + appinfo->parent_relid = parent_appinfo->parent_relid; + appinfo->parent_reltype = parent_appinfo->parent_reltype; + appinfo->parent_reloid = parent_appinfo->parent_reloid; + appinfo->translated_vars = (List*)expression_tree_mutator( + (Node*)parent_appinfo->translated_vars, + transvars_merge_mutator, &ctx); + } + + /* * Build a PlanRowMark if parent is marked FOR UPDATE/SHARE. */ if (oldrc) @@ -1662,7 +1743,8 @@ adjust_appendrel_attrs_mutator(Node *node, * step to convert the tuple layout to theparent's rowtype. * Otherwise we have to generate a RowExpr. */ - if (OidIsValid(appinfo->child_reltype)) + if (OidIsValid(appinfo->child_reltype) && + OidIsValid(appinfo->parent_reltype)) { Assert(var->vartype == appinfo->parent_reltype); if (appinfo->parent_reltype != appinfo->child_reltype) diff --git a/src/test/regress/expected/union.out b/src/test/regress/expected/union.out index 6f9ee5e..f69e92b 100644 --- a/src/test/regress/expected/union.out +++ b/src/test/regress/expected/union.out @@ -502,9 +502,42 @@ explain (costs off) Index Cond: (ab = 'ab'::text)(7 rows) +-- +-- Test that ORDER BY for UNION ALL can be pushed down on inheritance +-- tables. +-- +CREATE TEMP TABLE t1c (like t1 including indexes) inherits (t1); +NOTICE: merging column "a" with inherited definition +NOTICE: merging column "b" with inherited definition +CREATE TEMP TABLE t2c (like t2 including indexes) inherits (t2); +NOTICE: merging column "ab" with inherited definition +INSERT INTO t1c VALUES ('v', 'w'), ('c', 'd'); +INSERT INTO t2c VALUES ('vw'), ('cd'); +set enable_seqscan = on; +set enable_indexonlyscan = off; +explain (costs off) + SELECT * FROM + (SELECT a || b AS ab FROM t1 + UNION ALL + SELECT * FROM t2) t + ORDER BY ab LIMIT 1; + QUERY PLAN +--------------------------------------------------- + Limit + -> Merge Append + Sort Key: ((t1.a || t1.b)) + -> Index Scan using t1_ab_idx on t1 + -> Index Scan using t2_pkey on t2 + -> Index Scan using t1_ab_idx on t1 t1_1 + -> Index Scan using t1c_expr_idx on t1c + -> Index Scan using t2_pkey on t2 t2_1 + -> Index Scan using t2c_pkey on t2c +(9 rows) +reset enable_seqscan;reset enable_indexscan;reset enable_bitmapscan; +reset enable_indexonlyscan;-- Test constraint exclusion of UNION ALL subqueriesexplain (costs off) SELECT * FROM diff --git a/src/test/regress/sql/union.sql b/src/test/regress/sql/union.sql index d567cf1..7f9a9b1 100644 --- a/src/test/regress/sql/union.sql +++ b/src/test/regress/sql/union.sql @@ -198,9 +198,29 @@ explain (costs off) SELECT * FROM t2) t WHERE ab = 'ab'; +-- +-- Test that ORDER BY for UNION ALL can be pushed down on inheritance +-- tables. +-- + +CREATE TEMP TABLE t1c (like t1 including indexes) inherits (t1); +CREATE TEMP TABLE t2c (like t2 including indexes) inherits (t2); +INSERT INTO t1c VALUES ('v', 'w'), ('c', 'd'); +INSERT INTO t2c VALUES ('vw'), ('cd'); +set enable_seqscan = on; +set enable_indexonlyscan = off; + +explain (costs off) + SELECT * FROM + (SELECT a || b AS ab FROM t1 + UNION ALL + SELECT * FROM t2) t + ORDER BY ab LIMIT 1; +reset enable_seqscan;reset enable_indexscan;reset enable_bitmapscan; +reset enable_indexonlyscan;-- Test constraint exclusion of UNION ALL subqueriesexplain (costs off)
On Tue, Jan 28, 2014 at 06:53:32PM +0900, Kyotaro HORIGUCHI wrote: > Hello, Thank you, and I' sorry to have kept you waiting. No hurry. > > > The attached two patches are rebased to current 9.4dev HEAD and > > > make check at the topmost directory and src/test/isolation are > > > passed without error. One bug was found and fixed on the way. It > > > was an assertion failure caused by probably unexpected type > > > conversion introduced by collapse_appendrels() which leads > > > implicit whole-row cast from some valid reltype to invalid > > > reltype (0) into adjust_appendrel_attrs_mutator(). > > > > What query demonstrated this bug in the previous type 2/3 patches? I would still like to know the answer to the above question. > > [transvar_merge_mutator()] has roughly the same purpose as > > adjust_appendrel_attrs_mutator(), but it propagates the change to far fewer > > node types. Why this case so much simpler? The parent translated_vars of > > parent_appinfo may bear mostly-arbitrary expressions. > > There are only two places where AppendRelInfo node is generated, > expand_inherited_rtentry and > pull_up_union_leaf_queries.(_copyAppendrelInfo is irrelevant to > this discussion.) The core part generating translated_vars for > them are make_inh_translation_list and > make_setop_translation_list respectively. That's all. And they > are essentially works on the same way, making a Var for every > referred target entry of children like following. > > | makeVar(varno, > | tle->resno, > | exprType((Node *) tle->expr), > | exprTypmod((Node *) tle->expr), > | exprCollation((Node *) tle->expr), > | 0); It's true that each AppendRelInfo is initially created that way, but they need not remain so simple. You can assume ctx->child_appinfo->translated_vars contains only these Var nodes, but parent_appinfo->translated_vars may contain arbitrary expressions. (I think the pullup_replace_vars() call at prepjointree.c:954 is where an AppendRelInfo can get non-Var expressions.) > So all we should do to collapse nested appendrels is simplly > connecting each RTEs directly to the root (most ancient?) RTE in > the relationship, resolving Vars like above, as seen in patched > expand_inherited_rtentry. > > # If translated_vars are generated always in the way shown above, > # using tree walker might be no use.. > > This can be done apart from all other stuffs compensating > translation skew done in adjust_appendrel_attrs. I believe they > are in orthogonal relationship. Here is a test case that provokes an assertion failure under the patch; I believe the cause is oversimplification in transvars_merge_mutator(): create table parent (a int, b int); create table child () inherits (parent); select parent from parent union all select parent from parent; While poking at that test case, I noticed that the following test emits three rows on HEAD and wrongly emits four rows with the patch: create table parent (a int, b int); create table child () inherits (parent); insert into parent values (1,10); insert into child values (2,20); select a, b from parent union all select a, b from child; -- Noah Misch EnterpriseDB http://www.enterprisedb.com
Hello, > No hurry. Thanks. > > > > The attached two patches are rebased to current 9.4dev HEAD and > > > > make check at the topmost directory and src/test/isolation are > > > > passed without error. One bug was found and fixed on the way. It > > > > was an assertion failure caused by probably unexpected type > > > > conversion introduced by collapse_appendrels() which leads > > > > implicit whole-row cast from some valid reltype to invalid > > > > reltype (0) into adjust_appendrel_attrs_mutator(). > > > > > > What query demonstrated this bug in the previous type 2/3 patches? > > I would still like to know the answer to the above question. Ouch! Sorry but I couldn't replay the bug just now, please wait for a while. > It's true that each AppendRelInfo is initially created that way, but they need > not remain so simple. You can assume ctx->child_appinfo->translated_vars > contains only these Var nodes, but parent_appinfo->translated_vars may contain > arbitrary expressions. (I think the pullup_replace_vars() call at > prepjointree.c:954 is where an AppendRelInfo can get non-Var expressions.) Itself is not a problem. My patch doesn't replace parent's sole Var at top-level with the corresponding node in child, it repaces any Var node in parent's expressions in translated_vars with the corresponding node in child. So expressions in FROM clauses of union's-operand queries can adequately modifies expressions made in prepjointree. The query like follows returns correct result with this patch. As I mentioned before, I think the other stuffs than Var pullup would be done in adjust_appendrel_attrs separately. | SELECT (a + 1) as x, (b * 3) as y FROM p1 | UNION ALL | SELECT (a * 2), (b / 5) FROM p2 ORDER BY x LIMIT 10; > > So all we should do to collapse nested appendrels is simplly > > connecting each RTEs directly to the root (most ancient?) RTE in > > the relationship, resolving Vars like above, as seen in patched > > expand_inherited_rtentry. > > > > # If translated_vars are generated always in the way shown above, > > # using tree walker might be no use.. > > > > This can be done apart from all other stuffs compensating > > translation skew done in adjust_appendrel_attrs. I believe they > > are in orthogonal relationship. > > Here is a test case that provokes an assertion failure under the patch; I > believe the cause is oversimplification in transvars_merge_mutator(): > > create table parent (a int, b int); > create table child () inherits (parent); > select parent from parent union all select parent from parent; Wow. Ok, I'm pretty convinced that you're right. I'll dig it on. > While poking at that test case, I noticed that the following test emits three > rows on HEAD and wrongly emits four rows with the patch: > > create table parent (a int, b int); > create table child () inherits (parent); > insert into parent values (1,10); > insert into child values (2,20); > select a, b from parent union all select a, b from child; Mmm. I had the same result. Please let me have a bit more time. regards, -- Kyotaro Horiguchi NTT Open Source Software Center
Hello, The attached file "unionall_inh_idx_typ3_v6_20140203.patch" is the new version of this patch fixed the 'whole-row-var' bug. First of all, on second thought about this, > > create table parent (a int, b int); > > create table child () inherits (parent); > > insert into parent values (1,10); > > insert into child values (2,20); > > select a, b from parent union all select a, b from child; > > Mmm. I had the same result. Please let me have a bit more time. This turned out to be a correct result. The two tables have following records after the two INSERTs. | =# select * from only parent; | 1 | 10 | (1 row) | | =# select * from child; | 2 | 20 | (1 row) Then it is natural that the parent-side in the UNION ALL returns following results. | =# select * from parent; | a | b | ---+---- | 1 | 10 | 2 | 20 | (2 rows) Finally, the result we got has proved not to be a problem. ==== Second, about the crash in this sql, > select parent from parent union all select parent from parent; It is ignored whole-row reference (=0) which makes the index of child translated-vars list invalid (-1). I completely ignored it in spite that myself referred to before. Unfortunately ConvertRowtypeExpr prevents appendrels from being removed currently, and such a case don't seem to take place so often, so I decided to exclude the case. In addition, I found that only setting "rte->inh = false" causes duplicate scanning on the same relations through abandoned appendrels so I set parent/child_relid to InvalidOid so as no more to referred to from the ex-parent (and ex-children). The following queries seems to work correctly (but with no optimization) after these fixes. create table parent (a int, b int); create table child () inherits (parent); insert into parent values (1,10); insert into child values (2,20); select parent from parent union all select parent from parent;parent --------(1,10)(2,20)(1,10)(2,20) (4 rows) select a, parent from parent union all select a,parent from parent;a | parent ---+--------1 | (1,10)2 | (2,20)1 | (1,10)2 | (2,20) (4 rows) select a, b from parent union all select a, b from parent;a | b ---+----1 | 102 | 201 | 102 | 20 (4 rows) select a, b from parent union all select a, b from child;a | b ---+----2 | 201 | 102 | 20 (3 rows) > > > > > The attached two patches are rebased to current 9.4dev HEAD and > > > > > make check at the topmost directory and src/test/isolation are > > > > > passed without error. One bug was found and fixed on the way. It > > > > > was an assertion failure caused by probably unexpected type > > > > > conversion introduced by collapse_appendrels() which leads > > > > > implicit whole-row cast from some valid reltype to invalid > > > > > reltype (0) into adjust_appendrel_attrs_mutator(). > > > > > > > > What query demonstrated this bug in the previous type 2/3 patches? > > > > I would still like to know the answer to the above question. I rememberd after some struggles. It failed during 'make check', on the following query in inherit.sql. | update bar set f2 = f2 + 100 | from | ( select f1 from foo union all select f1+3 from foo ) ss | where bar.f1 = ss.f1; The following SQLs causes the same type of crash. create temp table foo(f1 int, f2 int); create temp table foo2(f3 int) inherits (foo); create temp table bar(f1 int, f2 int); update bar set f2 = 1 from ( select f1 from foo union all select f1+3 from foo ) ss where bar.f1 = ss.f1; The tipped stone was "wholerow1" for the subquery created in preprocess_targetlist. | /* Not a table, so we need the whole row as a junk var */ | var = makeWholeRowVar(rt_fetch(rc->rti, range_table), .. | snprintf(resname, sizeof(resname), "wholerow%u", rc->rowmarkId); Then the finishing blow was a appendRelInfo corresponding to the var above, whose parent_reltype = 0 and child_reltype != 0, and of course introduced by my patch. Since such a situation takes place even for this patch, the modification is left alone. regards, -- Kyotaro Horiguchi NTT Open Source Software Center diff --git a/src/backend/optimizer/prep/prepunion.c b/src/backend/optimizer/prep/prepunion.c index 52dcc72..ece9318 100644 --- a/src/backend/optimizer/prep/prepunion.c +++ b/src/backend/optimizer/prep/prepunion.c @@ -57,6 +57,12 @@ typedef struct AppendRelInfo *appinfo;} adjust_appendrel_attrs_context; +typedef struct { + AppendRelInfo *child_appinfo; + Index target_rti; + bool failed; +} transvars_merge_context; +static Plan *recurse_set_operations(Node *setOp, PlannerInfo *root, double tuple_fraction, List *colTypes, List *colCollations, @@ -98,6 +104,8 @@ static List *generate_append_tlist(List *colTypes, List *colCollations, List *input_plans, List *refnames_tlist);static List *generate_setop_grouplist(SetOperationStmt *op, List*targetlist); +static Node *transvars_merge_mutator(Node *node, + transvars_merge_context *ctx);static void expand_inherited_rtentry(PlannerInfo *root,RangeTblEntry *rte, Index rti);static void make_inh_translation_list(Relation oldrelation, @@ -1210,6 +1218,48 @@ expand_inherited_tables(PlannerInfo *root) }} +static Node * +transvars_merge_mutator(Node *node, transvars_merge_context *ctx) +{ + if (node == NULL) + return NULL; + + /* fast path after failure */ + if (ctx->failed) + return node; + + if (IsA(node, Var)) + { + Var *oldv = (Var*)node; + + if (!oldv->varlevelsup && oldv->varno == ctx->target_rti) + { + if (oldv->varattno == 0) + { + /* + * Appendrels which does whole-row-var conversion cannot be + * removed. ConvertRowtypeExpr can convert only RELs which can + * be referred to using relid. + */ + ctx->failed = true; + return node; + } + if (oldv->varattno > + list_length(ctx->child_appinfo->translated_vars)) + elog(ERROR, + "attribute %d of relation \"%s\" does not exist", + oldv->varattno, + get_rel_name(ctx->child_appinfo->parent_reloid)); + + return (Node*)copyObject( + list_nth(ctx->child_appinfo->translated_vars, + oldv->varattno - 1)); + } + } + return expression_tree_mutator(node, + transvars_merge_mutator, (void*)ctx); +} +/* * expand_inherited_rtentry * Check whether a rangetable entry represents an inheritance set. @@ -1237,6 +1287,8 @@ expand_inherited_rtentry(PlannerInfo *root, RangeTblEntry *rte, Index rti) List *inhOIDs; List *appinfos; ListCell *l; + AppendRelInfo *parent_appinfo = NULL; + bool detach_parent = false; /* Does RT entry allow inheritance? */ if (!rte->inh) @@ -1301,6 +1353,22 @@ expand_inherited_rtentry(PlannerInfo *root, RangeTblEntry *rte, Index rti) oldrc->isParent= true; /* + * If parent relation is appearing in a subselect of UNION ALL, it has + * further parent appendrelinfo. Save it to pull up inheritance children + * later. + */ + foreach(l, root->append_rel_list) + { + AppendRelInfo *appinfo = (AppendRelInfo *)lfirst(l); + if(appinfo->child_relid == rti) + { + parent_appinfo = appinfo; + detach_parent = true; + break; + } + } + + /* * Must open the parent relation to examine its tupdesc. We need not lock * it; we assume the rewriter alreadydid. */ @@ -1308,6 +1376,7 @@ expand_inherited_rtentry(PlannerInfo *root, RangeTblEntry *rte, Index rti) /* Scan the inheritanceset and expand it */ appinfos = NIL; + foreach(l, inhOIDs) { Oid childOID = lfirst_oid(l); @@ -1378,6 +1447,46 @@ expand_inherited_rtentry(PlannerInfo *root, RangeTblEntry *rte, Index rti) } /* + * Pull up this appinfo onto just above of the parent. The parent + * relation has its own parent when it appears as a subquery of UNION + * ALL. Pulling up these children gives a chance to consider + * MergeAppend on whole the UNION ALL tree. + */ + + if (parent_appinfo) + { + transvars_merge_context ctx; + + ctx.child_appinfo = appinfo; + ctx.target_rti = rti; + + if (parent_appinfo->parent_reltype == 0 && + parent_appinfo->child_reltype == 0) + { + List *new_transvars; + + /* + * Connect this appinfo up to the parent RTE of + * parent_appinfo. + */ + ctx.failed = false; + new_transvars = (List*)expression_tree_mutator( + (Node*)parent_appinfo->translated_vars, + transvars_merge_mutator, &ctx); + + if (ctx.failed) + /* Some children remain so this parent cannot detach */ + detach_parent = false; + else + { + appinfo->parent_relid = parent_appinfo->parent_relid; + appinfo->parent_reltype = parent_appinfo->parent_reltype; + appinfo->translated_vars = new_transvars; + } + } + } + + /* * Build a PlanRowMark if parent is marked FOR UPDATE/SHARE. */ if (oldrc) @@ -1399,6 +1508,14 @@ expand_inherited_rtentry(PlannerInfo *root, RangeTblEntry *rte, Index rti) heap_close(newrelation,NoLock); } + /* Remove childless appinfo from inheritance tree */ + if (detach_parent) + { + rte->inh = false; + parent_appinfo->parent_relid = InvalidOid; + parent_appinfo->child_relid = InvalidOid; + } + heap_close(oldrelation, NoLock); /* @@ -1662,7 +1779,8 @@ adjust_appendrel_attrs_mutator(Node *node, * step to convert the tuple layout to theparent's rowtype. * Otherwise we have to generate a RowExpr. */ - if (OidIsValid(appinfo->child_reltype)) + if (OidIsValid(appinfo->child_reltype) && + OidIsValid(appinfo->parent_reltype)) { Assert(var->vartype == appinfo->parent_reltype); if (appinfo->parent_reltype != appinfo->child_reltype) diff --git a/src/test/regress/expected/union.out b/src/test/regress/expected/union.out index 6f9ee5e..d254ff3 100644 --- a/src/test/regress/expected/union.out +++ b/src/test/regress/expected/union.out @@ -502,9 +502,40 @@ explain (costs off) Index Cond: (ab = 'ab'::text)(7 rows) +-- +-- Test that ORDER BY for UNION ALL can be pushed down on inheritance +-- tables. +-- +CREATE TEMP TABLE t1c (like t1 including indexes) inherits (t1); +NOTICE: merging column "a" with inherited definition +NOTICE: merging column "b" with inherited definition +CREATE TEMP TABLE t2c (like t2 including indexes) inherits (t2); +NOTICE: merging column "ab" with inherited definition +INSERT INTO t1c VALUES ('v', 'w'), ('c', 'd'); +INSERT INTO t2c VALUES ('vw'), ('cd'); +set enable_seqscan = on; +set enable_indexonlyscan = off; +explain (costs off) + SELECT * FROM + (SELECT a || b AS ab FROM t1 + UNION ALL + SELECT * FROM t2) t + ORDER BY ab LIMIT 1; + QUERY PLAN +-------------------------------------------------- + Limit + -> Merge Append + Sort Key: ((t1.a || t1.b)) + -> Index Scan using t1_ab_idx on t1 + -> Index Scan using t1c_expr_idx on t1c + -> Index Scan using t2_pkey on t2 + -> Index Scan using t2c_pkey on t2c +(7 rows) +reset enable_seqscan;reset enable_indexscan;reset enable_bitmapscan; +reset enable_indexonlyscan;-- Test constraint exclusion of UNION ALL subqueriesexplain (costs off) SELECT * FROM diff --git a/src/test/regress/sql/union.sql b/src/test/regress/sql/union.sql index d567cf1..7f9a9b1 100644 --- a/src/test/regress/sql/union.sql +++ b/src/test/regress/sql/union.sql @@ -198,9 +198,29 @@ explain (costs off) SELECT * FROM t2) t WHERE ab = 'ab'; +-- +-- Test that ORDER BY for UNION ALL can be pushed down on inheritance +-- tables. +-- + +CREATE TEMP TABLE t1c (like t1 including indexes) inherits (t1); +CREATE TEMP TABLE t2c (like t2 including indexes) inherits (t2); +INSERT INTO t1c VALUES ('v', 'w'), ('c', 'd'); +INSERT INTO t2c VALUES ('vw'), ('cd'); +set enable_seqscan = on; +set enable_indexonlyscan = off; + +explain (costs off) + SELECT * FROM + (SELECT a || b AS ab FROM t1 + UNION ALL + SELECT * FROM t2) t + ORDER BY ab LIMIT 1; +reset enable_seqscan;reset enable_indexscan;reset enable_bitmapscan; +reset enable_indexonlyscan;-- Test constraint exclusion of UNION ALL subqueriesexplain (costs off)
On Mon, Feb 03, 2014 at 07:36:22PM +0900, Kyotaro HORIGUCHI wrote: > > > create table parent (a int, b int); > > > create table child () inherits (parent); > > > insert into parent values (1,10); > > > insert into child values (2,20); > > > select a, b from parent union all select a, b from child; > > > > Mmm. I had the same result. Please let me have a bit more time. > > This turned out to be a correct result. The two tables have > following records after the two INSERTs. > > | =# select * from only parent; > | 1 | 10 > | (1 row) > | > | =# select * from child; > | 2 | 20 > | (1 row) > > Then it is natural that the parent-side in the UNION ALL returns > following results. > > | =# select * from parent; > | a | b > | ---+---- > | 1 | 10 > | 2 | 20 > | (2 rows) > > Finally, the result we got has proved not to be a problem. The first union branch should return two rows, and the second union branch should return one row, for a total of three. In any case, I see later in your mail that you fixed this. The larger point is that this patch has no business changing the output rows of *any* query. Its goal is to pick a more-efficient plan for arriving at the same answer. If there's a bug in our current output for some query, that's a separate discussion from the topic of this thread. > Second, about the crash in this sql, > > > select parent from parent union all select parent from parent; > > It is ignored whole-row reference (=0) which makes the index of > child translated-vars list invalid (-1). I completely ignored it > in spite that myself referred to before. > > Unfortunately ConvertRowtypeExpr prevents appendrels from being > removed currently, and such a case don't seem to take place so > often, so I decided to exclude the case. > + /* > + * Appendrels which does whole-row-var conversion cannot be > + * removed. ConvertRowtypeExpr can convert only RELs which can > + * be referred to using relid. We have parent and child relids, so it is not clear to me how imposing that restriction helps us. I replaced transvars_merge_mutator() with a call to adjust_appendrel_attrs(). This reduces code duplication, and it handles whole-row references. (I don't think the other nodes adjust_appendrel_attrs() can handle matter to this caller. translated_vars will never contain join tree nodes, and I doubt it could contain a PlaceHolderVar with phrels requiring translation.) The central design question for this patch seems to be how to represent, in the range table, the fact that we expanded an inheritance parent despite its children ending up as appendrel children of a freestanding UNION ALL. The v6 patch detaches the original RTE from the join tree and clears its "inh" flag. This breaks sepgsql_dml_privileges(), which looks for RTE_RELATION with inh = true and consults selinux concerning every child table. We could certainly change the way sepgsql discovers inheritance hierarchies, but nothing clearly better comes to mind. I selected the approach of preserving the RTE's "inh" flag, removing the AppendRelInfo connecting that RTE to its enclosing UNION ALL, and creating no AppendRelInfo children for that RTE. An alternative was to introduce a new RTE flag, say "append". An inheritance parent under a UNION ALL would have append = false, inh = true; other inheritance parent RTEs would have append = true, inh = true; an RTE for UNION ALL itself would have append = true, inh = false. > > > > > > The attached two patches are rebased to current 9.4dev HEAD and > > > > > > make check at the topmost directory and src/test/isolation are > > > > > > passed without error. One bug was found and fixed on the way. It > > > > > > was an assertion failure caused by probably unexpected type > > > > > > conversion introduced by collapse_appendrels() which leads > > > > > > implicit whole-row cast from some valid reltype to invalid > > > > > > reltype (0) into adjust_appendrel_attrs_mutator(). > > > > > > > > > > What query demonstrated this bug in the previous type 2/3 patches? > > > > > > I would still like to know the answer to the above question. > > I rememberd after some struggles. It failed during 'make check', > on the following query in inherit.sql. > [details] Interesting. Today, the parent_reltype and child_reltype fields of AppendRelInfo are either both valid or both invalid. Your v6 patch allowed us to have a valid child_reltype with an invalid parent_reltype. At the moment, we can't benefit from just one valid reltype. I restored the old invariant. If the attached patch version looks reasonable, I will commit it. Incidentally, I tried adding an assertion that append_rel_list does not show one appendrel as a direct child of another. The following query, off-topic for the patch at hand, triggered that assertion: SELECT 0 FROM (SELECT 0 UNION ALL SELECT 0) t0 UNION ALL SELECT 0 FROM (SELECT 0 UNION ALL SELECT 0) t0; -- Noah Misch EnterpriseDB http://www.enterprisedb.com
Attachment
Thank you for the labor for polishing this patch. I have no obvious objection at a glance on this new patch. I agree to commit this if you this is pertinent to commit except for the issue newly revealed by this patch. Though could you let me have some more time to examine this by myself and fully understand the changes what you made? At Wed, 26 Feb 2014 23:29:35 -0500, Noah Misch wrote > > Finally, the result we got has proved not to be a problem. > > The first union branch should return two rows, and the second union branch > should return one row, for a total of three. In any case, I see later in your > mail that you fixed this. I got it. I confirmed that *after* fixing duplicate rows, then misunderstood your point. > The larger point is that this patch has no business > changing the output rows of *any* query. Its goal is to pick a more-efficient > plan for arriving at the same answer. If there's a bug in our current output > for some query, that's a separate discussion from the topic of this thread. Totally agreed. > > Second, about the crash in this sql, > > > > > select parent from parent union all select parent from parent; > > > > It is ignored whole-row reference (=0) which makes the index of > > child translated-vars list invalid (-1). I completely ignored it > > in spite that myself referred to before. > > > > Unfortunately ConvertRowtypeExpr prevents appendrels from being > > removed currently, and such a case don't seem to take place so > > often, so I decided to exclude the case. > > > + /* > > + * Appendrels which does whole-row-var conversion cannot be > > + * removed. ConvertRowtypeExpr can convert only RELs which can > > + * be referred to using relid. > > We have parent and child relids, so it is not clear to me how imposing that > restriction helps us. I replaced transvars_merge_mutator() with a call to > adjust_appendrel_attrs(). This reduces code duplication, and it handles > whole-row references. Thank you, I didn't grasp them as a whole.. > (I don't think the other nodes adjust_appendrel_attrs() > can handle matter to this caller. translated_vars will never contain join > tree nodes, and I doubt it could contain a PlaceHolderVar with phrels > requiring translation.) > > The central design question for this patch seems to be how to represent, in > the range table, the fact that we expanded an inheritance parent despite its > children ending up as appendrel children of a freestanding UNION ALL. The v6 > patch detaches the original RTE from the join tree and clears its "inh" flag. > This breaks sepgsql_dml_privileges(), which looks for RTE_RELATION with inh = > true and consults selinux concerning every child table. Mmm. It was not in my sight. A bit more time is needed to understand this. > We could certainly > change the way sepgsql discovers inheritance hierarchies, but nothing clearly > better comes to mind. I selected the approach of preserving the RTE's "inh" > flag, removing the AppendRelInfo connecting that RTE to its enclosing UNION > ALL, and creating no AppendRelInfo children for that RTE. Ok, I did that because I was not certain whether removing "detached" AppendRelInfos are safe or not. It is far smarter than mine. > An alternative was > to introduce a new RTE flag, say "append". An inheritance parent under a > UNION ALL would have append = false, inh = true; other inheritance parent RTEs > would have append = true, inh = true; an RTE for UNION ALL itself would have > append = true, inh = false. I think that kind of complexity is not necessary for this issue. > > > > > > > The attached two patches are rebased to current 9.4dev HEAD and > > > > > > > make check at the topmost directory and src/test/isolation are > > > > > > > passed without error. One bug was found and fixed on the way. It > > > > > > > was an assertion failure caused by probably unexpected type > > > > > > > conversion introduced by collapse_appendrels() which leads > > > > > > > implicit whole-row cast from some valid reltype to invalid > > > > > > > reltype (0) into adjust_appendrel_attrs_mutator(). > > > > > > > > > > > > What query demonstrated this bug in the previous type 2/3 patches? > > > > > > > > I would still like to know the answer to the above question. > > > > I rememberd after some struggles. It failed during 'make check', > > on the following query in inherit.sql. > > > [details] > > Interesting. Today, the parent_reltype and child_reltype fields of > AppendRelInfo are either both valid or both invalid. Your v6 patch allowed us > to have a valid child_reltype with an invalid parent_reltype. At the moment, > we can't benefit from just one valid reltype. I restored the old invariant. Ok, I understood it. > If the attached patch version looks reasonable, I will commit it. > > > Incidentally, I tried adding an assertion that append_rel_list does not show > one appendrel as a direct child of another. The following query, off-topic > for the patch at hand, triggered that assertion: > > SELECT 0 FROM (SELECT 0 UNION ALL SELECT 0) t0 > UNION ALL > SELECT 0 FROM (SELECT 0 UNION ALL SELECT 0) t0; This seems not to crash unless this patch is applied, but itself doesn't seem to be a bug. I think it should be cured along with this patch even if it is not the issue of this patch. regards, -- Kyotaro Horiguchi NTT Open Source Software Center
On Thu, Feb 27, 2014 at 07:30:57PM +0900, Kyotaro HORIGUCHI wrote: > Thank you for the labor for polishing this patch. > > I have no obvious objection at a glance on this new patch. > > I agree to commit this if you this is pertinent to commit except > for the issue newly revealed by this patch. Though could you let > me have some more time to examine this by myself and fully > understand the changes what you made? Yes; waiting several days is no problem. > At Wed, 26 Feb 2014 23:29:35 -0500, Noah Misch wrote > > An alternative was > > to introduce a new RTE flag, say "append". An inheritance parent under a > > UNION ALL would have append = false, inh = true; other inheritance parent RTEs > > would have append = true, inh = true; an RTE for UNION ALL itself would have > > append = true, inh = false. > > I think that kind of complexity is not necessary for this issue. Agreed. > > Incidentally, I tried adding an assertion that append_rel_list does not show > > one appendrel as a direct child of another. The following query, off-topic > > for the patch at hand, triggered that assertion: > > > > SELECT 0 FROM (SELECT 0 UNION ALL SELECT 0) t0 > > UNION ALL > > SELECT 0 FROM (SELECT 0 UNION ALL SELECT 0) t0; > > This seems not to crash unless this patch is applied, but itself > doesn't seem to be a bug. To my knowledge, the query does not crash the server under any patch provided on this thread. > I think it should be cured along with > this patch even if it is not the issue of this patch. That would require changing rather different code, probably something in the vicinity of pull_up_subqueries(). I'll leave it for another patch. -- Noah Misch EnterpriseDB http://www.enterprisedb.com
Noah Misch <noah@leadboat.com> writes: > If the attached patch version looks reasonable, I will commit it. The test case is completely bogus, as the query explained is significantly different from the query executed. I'm not sure whether you can just remove the extra ORDER BY column without getting machine-dependent results, though. More generally, I'm afraid the whole approach is probably wrong, or at least not solving all problems in this area, because of this: > Incidentally, I tried adding an assertion that append_rel_list does not show > one appendrel as a direct child of another. The following query, off-topic > for the patch at hand, triggered that assertion: > SELECT 0 FROM (SELECT 0 UNION ALL SELECT 0) t0 > UNION ALL > SELECT 0 FROM (SELECT 0 UNION ALL SELECT 0) t0; That's not "off topic" at all; it shows that there's not been any effort to date to flatten appendrel membership, and therefore this partial implementation is going to miss some cases. It doesn't help to merge an inheritance-based appendrel into its parent if the query ORDER BY is still a level or two above that due to UNION ALLs. I wonder whether we should consider adding a pass to flatten any nested appendrels after we're done creating them all. Or alternatively, perhaps rather than changing the representation invariant, we need to take a harder look at why ordering info isn't getting pushed down through appendrels. regards, tom lane
BTW, isn't the proposed change to the comments for adjust_appendrel_attrs just wrong? If it's correct, why doesn't the Assert(!IsA(node, SubLink)) therein fire? (AFAICS, the existing comment is correct: we don't use this function until after expression preprocessing is complete.) regards, tom lane
On Thu, Feb 27, 2014 at 05:47:16PM -0500, Tom Lane wrote: > BTW, isn't the proposed change to the comments for adjust_appendrel_attrs > just wrong? If it's correct, why doesn't the Assert(!IsA(node, SubLink)) > therein fire? (AFAICS, the existing comment is correct: we don't use > this function until after expression preprocessing is complete.) The comment change is accurate; expand_inherited_tables(), which precedes subplan conversion, now calls adjust_appendrel_attrs(). I neglected to remove the SubLink assert or test a scenario actually entailing a SubLink in translated_vars. Thanks for spotting the problem. -- Noah Misch EnterpriseDB http://www.enterprisedb.com
On Thu, Feb 27, 2014 at 05:33:47PM -0500, Tom Lane wrote: > Noah Misch <noah@leadboat.com> writes: > > If the attached patch version looks reasonable, I will commit it. > > The test case is completely bogus, as the query explained is significantly > different from the query executed. I'm not sure whether you can just > remove the extra ORDER BY column without getting machine-dependent > results, though. Each query tests something slightly different. The EXPLAIN verifies that we can MergeAppend given this query structure, and the plain SELECT verifies that any join tree contortions we made to achieve that do not change the answer. > More generally, I'm afraid the whole approach is probably wrong, or at > least not solving all problems in this area, because of this: > > > Incidentally, I tried adding an assertion that append_rel_list does not show > > one appendrel as a direct child of another. The following query, off-topic > > for the patch at hand, triggered that assertion: > > SELECT 0 FROM (SELECT 0 UNION ALL SELECT 0) t0 > > UNION ALL > > SELECT 0 FROM (SELECT 0 UNION ALL SELECT 0) t0; > > That's not "off topic" at all; it shows that there's not been any effort > to date to flatten appendrel membership, and therefore this partial > implementation is going to miss some cases. It doesn't help to merge an > inheritance-based appendrel into its parent if the query ORDER BY is still > a level or two above that due to UNION ALLs. Nonetheless, I find it reasonable to accept a patch that makes expand_inherited_tables() avoid introducing nested appendrels. Making pull_up_subqueries() do the same can be a separate effort. There will still be a pile of ways to stifle MergeAppend, including everything that makes is_simple_union_all() return false. Having said that, both you and Kyotaro sound keen to achieve an appendrel flatness invariant in a single patch. If that's the consensus, I'm fine with bouncing this back so it can happen. > I wonder whether we should consider adding a pass to flatten any nested > appendrels after we're done creating them all. We did consider that upthread. It's a valid option, but I remain more inclined to teach pull_up_subqueries() to preserve flatness just like expand_inherited_tables() will. > Or alternatively, perhaps > rather than changing the representation invariant, we need to take a > harder look at why ordering info isn't getting pushed down through > appendrels. That could facilitate MergeAppend in considerably more places, such as UNION ALL containing non-simple branches. I don't have much of a sense concerning what such a patch would entail. -- Noah Misch EnterpriseDB http://www.enterprisedb.com
Hello, At Thu, 27 Feb 2014 21:53:52 -0500, Noah Misch wrote > On Thu, Feb 27, 2014 at 05:33:47PM -0500, Tom Lane wrote: > > Noah Misch <noah@leadboat.com> writes: > > > If the attached patch version looks reasonable, I will commit it. > > > > The test case is completely bogus, as the query explained is significantly > > different from the query executed. I'm not sure whether you can just > > remove the extra ORDER BY column without getting machine-dependent > > results, though. > > Each query tests something slightly different. The EXPLAIN verifies that we > can MergeAppend given this query structure, and the plain SELECT verifies that > any join tree contortions we made to achieve that do not change the answer. I think Tom said that the second query can yield the same result even if the 2nd column in orderby is removed so the pertinence of it seems doubtful. Actually the altered query gave me the same result on my workset (CentOS6.5/x86-64). > > More generally, I'm afraid the whole approach is probably wrong, or at > > least not solving all problems in this area, because of this: > > > > > Incidentally, I tried adding an assertion that append_rel_list does not show > > > one appendrel as a direct child of another. The following query, off-topic > > > for the patch at hand, triggered that assertion: > > > SELECT 0 FROM (SELECT 0 UNION ALL SELECT 0) t0 > > > UNION ALL > > > SELECT 0 FROM (SELECT 0 UNION ALL SELECT 0) t0; > > > > That's not "off topic" at all; it shows that there's not been any effort > > to date to flatten appendrel membership, and therefore this partial > > implementation is going to miss some cases. It doesn't help to merge an > > inheritance-based appendrel into its parent if the query ORDER BY is still > > a level or two above that due to UNION ALLs. > > Nonetheless, I find it reasonable to accept a patch that makes > expand_inherited_tables() avoid introducing nested appendrels. Making > pull_up_subqueries() do the same can be a separate effort. There will still > be a pile of ways to stifle MergeAppend, including everything that makes > is_simple_union_all() return false. Having said that, both you and Kyotaro > sound keen to achieve an appendrel flatness invariant in a single patch. If > that's the consensus, I'm fine with bouncing this back so it can happen. I understood what that query causes and still don't have definite opinion on which is better between fixing them individually and in one go. Nontheless, I'd feel more at ease flattening them altogether regardless of their origin. > > I wonder whether we should consider adding a pass to flatten any nested > > appendrels after we're done creating them all. > > We did consider that upthread. It's a valid option, but I remain more > inclined to teach pull_up_subqueries() to preserve flatness just like > expand_inherited_tables() will. Yes, the old dumped version of typ2 patch did so. It flattened appendrel tree for the query prpoerly. Let me hear the reson you prefer to do so. > > Or alternatively, perhaps > > rather than changing the representation invariant, we need to take a > > harder look at why ordering info isn't getting pushed down through > > appendrels. > > That could facilitate MergeAppend in considerably more places, such as UNION > ALL containing non-simple branches. I don't have much of a sense concerning > what such a patch would entail. regards, -- Kyotaro Horiguchi NTT Open Source Software Center
Sorry, I did wrong test. > > > Noah Misch <noah@leadboat.com> writes: > > > > If the attached patch version looks reasonable, I will commit it. > > > > > > The test case is completely bogus, as the query explained is significantly > > > different from the query executed. I'm not sure whether you can just > > > remove the extra ORDER BY column without getting machine-dependent > > > results, though. > > > > Each query tests something slightly different. The EXPLAIN verifies that we > > can MergeAppend given this query structure, and the plain SELECT verifies that > > any join tree contortions we made to achieve that do not change the answer. > > I think Tom said that the second query can yield the same result > even if the 2nd column in orderby is removed so the pertinence of > it seems doubtful. Actually the altered query gave me the same > result on my workset (CentOS6.5/x86-64). No, no! It was actually returned a *different* result. I accidentially(?) added 'desc' to the '2'. Please forget it. regards, -- Kyotaro Horiguchi NTT Open Source Software Center
On Fri, Feb 28, 2014 at 02:35:50PM +0900, Kyotaro HORIGUCHI wrote: > At Thu, 27 Feb 2014 21:53:52 -0500, Noah Misch wrote > > On Thu, Feb 27, 2014 at 05:33:47PM -0500, Tom Lane wrote: > > > I wonder whether we should consider adding a pass to flatten any nested > > > appendrels after we're done creating them all. > > > > We did consider that upthread. It's a valid option, but I remain more > > inclined to teach pull_up_subqueries() to preserve flatness just like > > expand_inherited_tables() will. > > Yes, the old dumped version of typ2 patch did so. It flattened > appendrel tree for the query prpoerly. Let me hear the reson you > prefer to do so. Having reviewed my upthread reasoning for preferring one of those two approaches over the other, it's a weak preference. They have similar runtime costs. Putting the logic with the code that creates appendrels reduces the number of code sites one must examine to reason about possible plan structures. We might not flatten RTE_RELATION appendrel parents exactly the same way we flatten RTE_SUBQUERY appendrel parents. I would tend to leave inh=true for the former, for reasons explained in my notes on v7, but set inh=false for the latter to save needless work. On the other hand, a flattening pass is less code overall and brings an attractive uniformity-by-default to the area. -- Noah Misch EnterpriseDB http://www.enterprisedb.com
Hello, > > Yes, the old dumped version of typ2 patch did so. It flattened > > appendrel tree for the query prpoerly. Let me hear the reson you > > prefer to do so. > > Having reviewed my upthread reasoning for preferring one of those two > approaches over the other, it's a weak preference. They have similar runtime > costs. Putting the logic with the code that creates appendrels reduces the > number of code sites one must examine to reason about possible plan > structures. We might not flatten RTE_RELATION appendrel parents exactly the > same way we flatten RTE_SUBQUERY appendrel parents. I would tend to leave > inh=true for the former, for reasons explained in my notes on v7, but set > inh=false for the latter to save needless work. Unfortunately, RTE_SUBQUERY-es with their inh flag cleard might cause something inconvenient in preprocess_minmax_aggregates() and/or add_rtes_to_flat_rtable().. # I haven't found that related to sepgsql, though :-( I understood that your concern is to deal parent RTEs defferently according to their relkinds. But I think the inh flags could not be modified at all for the reason mentioned above. Finally the seemingly most safe way to exclude removed intermediate RTEs in subsequent processing is simplly remove apprelinfos directly linked to them (as did in v7), for both of the parents, RTE_RELATION and RTE_SUBQUERY. The difference has disappeared in this story. At least as of now, inheritance tables define the bottom bound of a appendrel tree and on the other hand complex(?) union_alls define the upper bound, and both multilevel (simple)union_alls and inheritances are flattened individually so all possible inheritance tree to be collapsed by this patch is only, I think, Subquery(inh=1)[Relation-inhparent [Relation-child, child, child]] > On the other hand, a flattening pass is less code overall and > brings an attractive uniformity-by-default to the area. Focusing only on the above structure, what we should do to collapse this tree is only connect the childs to the Subquery directly, then remove all appendrelinfos connecting to the Relation-inhparent. inh flag need not to be modified. # Umm. I strongly suspect that I overlooked something.. Then even widening to general case, the case doesn't seem to change. All we need to do is, link a child to its grandparent and isolate the parent removing apprelinfos. Thoughts? regards, -- Kyotaro Horiguchi NTT Open Source Software Center
On Mon, Mar 03, 2014 at 07:01:09PM +0900, Kyotaro HORIGUCHI wrote: > > > Yes, the old dumped version of typ2 patch did so. It flattened > > > appendrel tree for the query prpoerly. Let me hear the reson you > > > prefer to do so. > > > > Having reviewed my upthread reasoning for preferring one of those two > > approaches over the other, it's a weak preference. They have similar runtime > > costs. Putting the logic with the code that creates appendrels reduces the > > number of code sites one must examine to reason about possible plan > > structures. We might not flatten RTE_RELATION appendrel parents exactly the > > same way we flatten RTE_SUBQUERY appendrel parents. I would tend to leave > > inh=true for the former, for reasons explained in my notes on v7, but set > > inh=false for the latter to save needless work. > > Unfortunately, RTE_SUBQUERY-es with their inh flag cleard might > cause something inconvenient in preprocess_minmax_aggregates() > and/or add_rtes_to_flat_rtable().. preprocess_minmax_aggregates() only cares about RTEs reachable from the join tree, and the appendrel parents made obsolete by flattening would not be reachable from the join tree. add_rtes_to_flat_rtable() might be unhappy. > # I haven't found that related to sepgsql, though :-( > > I understood that your concern is to deal parent RTEs defferently > according to their relkinds. But I think the inh flags could not > be modified at all for the reason mentioned above. That's fine, then. It was a minor point. If you are convinced that a separate flattening pass is best, that suffices for me at this stage. Please submit the patch you have in mind, incorporating any improvements from the v7 patch that are relevant to both approaches. > At least as of now, inheritance tables define the bottom bound of > a appendrel tree and on the other hand complex(?) union_alls > define the upper bound, and both multilevel (simple)union_alls > and inheritances are flattened individually so all possible > inheritance tree to be collapsed by this patch is only, I think, > > Subquery(inh=1)[Relation-inhparent [Relation-child, child, child]] > > > On the other hand, a flattening pass is less code overall and > > brings an attractive uniformity-by-default to the area. > > Focusing only on the above structure, what we should do to > collapse this tree is only connect the childs to the Subquery > directly, then remove all appendrelinfos connecting to the > Relation-inhparent. inh flag need not to be modified. > > # Umm. I strongly suspect that I overlooked something.. > > Then even widening to general case, the case doesn't seem to > change. All we need to do is, link a child to its grandparent and > isolate the parent removing apprelinfos. I barely follow what you're saying here. Do you speak of union_inh_idx_typ2_v2_20131113.patch, unionall_inh_idx_typ3_v7.patch, or some future design? If we use a separate flattening pass, there's no small limit on how many layers of appendrel we may need to flatten. pull_up_subqueries() can create many nested RTE_SUBQUERY appendrel layers; there may be more than just child, parent and grandparent. There's never more than one layer of RTE_RELATION appendrel, though. Thanks, nm -- Noah Misch EnterpriseDB http://www.enterprisedb.com
Noah Misch <noah@leadboat.com> writes: > If you are convinced that a separate flattening pass is best, that suffices > for me at this stage. Please submit the patch you have in mind, incorporating > any improvements from the v7 patch that are relevant to both approaches. I went back and re-read the original message, and this time it struck me that really the issue here is that add_child_rel_equivalences() doesn't think it has to deal with the case of a "parent" rel being itself a child. That's not inherent though; it's just that it didn't occur to me at the time that such a situation could arise. It takes only very small changes to allow that to happen. If you do that, as in the attached changes to equivclass.c, you get "could not find pathkey item to sort" errors from createplan.c; but that's just because create_merge_append_plan() is likewise not expecting the mergeappend's parent rel to be itself a child. Fix for that is a one-liner, ie, pass down relids. That gets you to a point where the code generates a valid plan, but it's got nested MergeAppends, which are unnecessary expense. Kyotaro-san's original fix for that was overcomplicated. It's sufficient to teach accumulate_append_subpath() to flatten nested MergeAppendPaths. In short, the attached patch seems to fix it, for a lot less added complication than anything else that's been discussed on this thread. I've only lightly tested it (it could use a new regression test case), but unless someone can find a flaw I think we should use this approach. regards, tom lane diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c index 03be7b1..5777cb2 100644 *** a/src/backend/optimizer/path/allpaths.c --- b/src/backend/optimizer/path/allpaths.c *************** get_cheapest_parameterized_child_path(Pl *** 1021,1030 **** * accumulate_append_subpath * Add a subpath to the list being built for an Append or MergeAppend * ! * It's possible that the child is itself an Append path, in which case ! * we can "cut out the middleman" and just add its child paths to our ! * own list. (We don't try to do this earlier because we need to ! * apply both levels of transformation to the quals.) */ static List * accumulate_append_subpath(List *subpaths, Path *path) --- 1021,1035 ---- * accumulate_append_subpath * Add a subpath to the list being built for an Append or MergeAppend * ! * It's possible that the child is itself an Append or MergeAppend path, in ! * which case we can "cut out the middleman" and just add its child paths to ! * our own list. (We don't try to do this earlier because we need to apply ! * both levels of transformation to the quals.) ! * ! * Note that if we omit a child MergeAppend in this way, we are effectively ! * omitting a sort step, which seems fine: if the parent is to be an Append, ! * its result would be unsorted anyway, while if the parent is to be a ! * MergeAppend, there's no point in a separate sort on a child. */ static List * accumulate_append_subpath(List *subpaths, Path *path) *************** accumulate_append_subpath(List *subpaths *** 1036,1041 **** --- 1041,1053 ---- /* list_copy is important here to avoid sharing list substructure */ return list_concat(subpaths, list_copy(apath->subpaths)); } + else if (IsA(path, MergeAppendPath)) + { + MergeAppendPath *mpath = (MergeAppendPath *) path; + + /* list_copy is important here to avoid sharing list substructure */ + return list_concat(subpaths, list_copy(mpath->subpaths)); + } else return lappend(subpaths, path); } diff --git a/src/backend/optimizer/path/equivclass.c b/src/backend/optimizer/path/equivclass.c index 35d2a83..ac12f84 100644 *** a/src/backend/optimizer/path/equivclass.c --- b/src/backend/optimizer/path/equivclass.c *************** add_child_rel_equivalences(PlannerInfo * *** 1937,1952 **** if (cur_ec->ec_has_volatile) continue; ! /* No point in searching if parent rel not mentioned in eclass */ ! if (!bms_is_subset(parent_rel->relids, cur_ec->ec_relids)) continue; foreach(lc2, cur_ec->ec_members) { EquivalenceMember *cur_em = (EquivalenceMember *) lfirst(lc2); ! if (cur_em->em_is_const || cur_em->em_is_child) ! continue; /* ignore consts and children here */ /* Does it reference parent_rel? */ if (bms_overlap(cur_em->em_relids, parent_rel->relids)) --- 1937,1956 ---- if (cur_ec->ec_has_volatile) continue; ! /* ! * No point in searching if parent rel not mentioned in eclass; but ! * we can't tell that for sure if parent rel is itself a child. ! */ ! if (parent_rel->reloptkind == RELOPT_BASEREL && ! !bms_is_subset(parent_rel->relids, cur_ec->ec_relids)) continue; foreach(lc2, cur_ec->ec_members) { EquivalenceMember *cur_em = (EquivalenceMember *) lfirst(lc2); ! if (cur_em->em_is_const) ! continue; /* ignore consts here */ /* Does it reference parent_rel? */ if (bms_overlap(cur_em->em_relids, parent_rel->relids)) diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c index 184d37a..784805f 100644 *** a/src/backend/optimizer/plan/createplan.c --- b/src/backend/optimizer/plan/createplan.c *************** create_merge_append_plan(PlannerInfo *ro *** 751,757 **** /* Compute sort column info, and adjust MergeAppend's tlist as needed */ (void) prepare_sort_from_pathkeys(root, plan, pathkeys, ! NULL, NULL, true, &node->numCols, --- 751,757 ---- /* Compute sort column info, and adjust MergeAppend's tlist as needed */ (void) prepare_sort_from_pathkeys(root, plan, pathkeys, ! best_path->path.parent->relids, NULL, true, &node->numCols,
Hello, > > Unfortunately, RTE_SUBQUERY-es with their inh flag cleard might > > cause something inconvenient in preprocess_minmax_aggregates() > > and/or add_rtes_to_flat_rtable().. > > preprocess_minmax_aggregates() only cares about RTEs reachable from the join > tree, and the appendrel parents made obsolete by flattening would not be > reachable from the join tree. add_rtes_to_flat_rtable() might be unhappy. > > > # I haven't found that related to sepgsql, though :-( > > > > I understood that your concern is to deal parent RTEs defferently > > according to their relkinds. But I think the inh flags could not > > be modified at all for the reason mentioned above. > > That's fine, then. It was a minor point. Ya, anyway, inh's alterability during inheritance tree transformation is not a minor issue, if we choose flattening rather than equivalance adjustment (what typ1 patch did). > If you are convinced that a separate flattening pass is best, that suffices > for me at this stage. Please submit the patch you have in mind, incorporating > any improvements from the v7 patch that are relevant to both approaches. All right. > > At least as of now, inheritance tables define the bottom bound of > > a appendrel tree and on the other hand complex(?) union_alls > > define the upper bound, and both multilevel (simple)union_alls > > and inheritances are flattened individually so all possible > > inheritance tree to be collapsed by this patch is only, I think, > > > > Subquery(inh=1)[Relation-inhparent [Relation-child, child, child]] > > > > > On the other hand, a flattening pass is less code overall and > > > brings an attractive uniformity-by-default to the area. > > > > Focusing only on the above structure, what we should do to > > collapse this tree is only connect the childs to the Subquery > > directly, then remove all appendrelinfos connecting to the > > Relation-inhparent. inh flag need not to be modified. > > > > # Umm. I strongly suspect that I overlooked something.. > > > > Then even widening to general case, the case doesn't seem to > > change. All we need to do is, link a child to its grandparent and > > isolate the parent removing apprelinfos. > > I barely follow what you're saying here. Do you speak of > union_inh_idx_typ2_v2_20131113.patch, unionall_inh_idx_typ3_v7.patch, or some > future design? Sorry, it's about some future design, but the point is inevitable if we select the 'flattening pass' way. > If we use a separate flattening pass, there's no small limit > on how many layers of appendrel we may need to flatten. pull_up_subqueries() > can create many nested RTE_SUBQUERY appendrel layers; there may be more than > just child, parent and grandparent. There's never more than one layer of > RTE_RELATION appendrel, though. Yes, that is near to my 'general case' I had in my mind something similar in the structure but different in the source to inheritance tables, but talking about such possibilities seems senseless. Now the 'general case' is exactly the same to what you mentioned above. On the other hand, Tom gave us a proposal to deal this in add_child_rel_equivalences(), it is similar at the entry to my old failed typ1 patch but more profound. I'll consider on this for now. regards, -- Kyotaro Horiguchi NTT Open Source Software Center
Hello, I examined the your patch and it seemed reasonable, but I have one question about this patch. You made ec_relids differ to the union of all ec members' em_relids. Is it right? ==== At Mon, 03 Mar 2014 14:05:10 -0500, Tom Lane wrote > Noah Misch <noah@leadboat.com> writes: > > If you are convinced that a separate flattening pass is best, that suffices > > for me at this stage. Please submit the patch you have in mind, incorporating > > any improvements from the v7 patch that are relevant to both approaches. > > I went back and re-read the original message, and this time it struck me > that really the issue here is that add_child_rel_equivalences() doesn't > think it has to deal with the case of a "parent" rel being itself a child. Yes, that is what I tried to solve in one of the first patch but the brute way led to failure:( > That's not inherent though; it's just that it didn't occur to me at the > time that such a situation could arise. It takes only very small changes > to allow that to happen. I saw what you did. I marked not-full-fledged eq members as 'not child'(its real meaning there was full-fledged in contradiction) to reach the child rels. In contrast, you loosened the ec_relids shortcut filter and it seems to work but.. Is it right to make ec_relids different to union of all members' em_relids? This seems to affect joins afterwards. > If you do that, as in the attached changes to equivclass.c, you get > "could not find pathkey item to sort" errors from createplan.c; but > that's just because create_merge_append_plan() is likewise not expecting > the mergeappend's parent rel to be itself a child. Fix for that is > a one-liner, ie, pass down relids. This seems to just correcting the over simplification by the assumption that the best_path there cannot have a parent. > That gets you to a point where the code generates a valid plan, but it's > got nested MergeAppends, which are unnecessary expense. Kyotaro-san's > original fix for that was overcomplicated. It's sufficient to teach > accumulate_append_subpath() to flatten nested MergeAppendPaths. Ah, it was in typ1-collapse patch. As you might noticed seeing there, I couldn't put the assumption that directly nested MergeAppendPaths share the same pathkeys (really?). It's no problem if the assumption goes on. > In short, the attached patch seems to fix it, for a lot less added > complication than anything else that's been discussed on this thread. > I've only lightly tested it (it could use a new regression test case), > but unless someone can find a flaw I think we should use this approach. Mmm. That's motifying but you seems to be right :) Equipping this with some regression tests become my work from now. regards, -- Kyotaro Horiguchi NTT Open Source Software Center
Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> writes: > Hello, I examined the your patch and it seemed reasonable, but I > have one question about this patch. > You made ec_relids differ to the union of all ec members' > em_relids. Is it right? ec_relids has never included child relids. regards, tom lane
Hello, I haven't look closer on their relationship. > > Hello, I examined the your patch and it seemed reasonable, but I > > have one question about this patch. > > > You made ec_relids differ to the union of all ec members' > > em_relids. Is it right? > > ec_relids has never included child relids. relation.h says that, | Relids ec_relids; /* all relids appearing in ec_members */ ... | Relids em_relids; /* all relids appearing in em_expr */ But add_eq_member already did this, so the point was whether it's ok to omit intermediate relations... | else if (!is_child) /* child members don't add to ec_relids */ | { | ec->ec_relids = bms_add_members(ec->ec_relids, relids); | } | ec->ec_members = lappend(ec->ec_members, em); I understood that it is ok because an inheritance tree must have one parent and it can be the representative for whole its descendents so it's no problem. Thank you, I'll go on. regards, -- Kyotaro Horiguchi NTT Open Source Software Center
Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> writes: >> ec_relids has never included child relids. > relation.h says that, > | Relids ec_relids; /* all relids appearing in ec_members */ > ... > | Relids em_relids; /* all relids appearing in em_expr */ Ah. Those comments could use improvement, I guess. regards, tom lane
Hello. Attached is the 2nd version of 'pushdown in UNION ALL on partitioned tables' patch type 1 - fix in equiv-member version. As far as I can see, I have found no problem on the original Tom's patch. I have no annoyance of modifying inh flag and so with this. At Tue, 04 Mar 2014 18:57:56 +0900, Kyotaro HORIGUCHI wrote me> Mmm. That's motifying but you seems to be right :) Equipping this me> with some regression tests become my work from now. And I took an advantage of using Noah's regression test after some modifications. After all, this patch consists of work of you all. Thanks for all you did to me. I simplified the query for regression tests so as to clarify the objective and getting rid of confisions of readers. Using only the first column seems to be enough to also make sure of pushing down and ordering. Any comments? At Wed, 05 Mar 2014 13:59:45 -0500, Tom Lane wrote tgl> >> ec_relids has never included child relids. > > | Relids ec_relids; /* all relids appearing in ec_members */ > > ... > > | Relids em_relids; /* all relids appearing in em_expr */ > > Ah. Those comments could use improvement, I guess. The revised comment makes it clear. Thank you. regards, -- Kyotaro Horiguchi NTT Open Source Software Center diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c index 03be7b1..5777cb2 100644 --- a/src/backend/optimizer/path/allpaths.c +++ b/src/backend/optimizer/path/allpaths.c @@ -1021,10 +1021,15 @@ get_cheapest_parameterized_child_path(PlannerInfo *root, RelOptInfo *rel, * accumulate_append_subpath* Add a subpath to the list being built for an Append or MergeAppend * - * It's possible that the child is itself an Append path, in which case - * we can "cut out the middleman" and just add its child paths to our - * own list. (We don't try to do this earlier because we need to - * apply both levels of transformation to the quals.) + * It's possible that the child is itself an Append or MergeAppend path, in + * which case we can "cut out the middleman" and just add its child paths to + * our own list. (We don't try to do this earlier because we need to apply + * both levels of transformation to the quals.) + * + * Note that if we omit a child MergeAppend in this way, we are effectively + * omitting a sort step, which seems fine: if the parent is to be an Append, + * its result would be unsorted anyway, while if the parent is to be a + * MergeAppend, there's no point in a separate sort on a child. */static List *accumulate_append_subpath(List *subpaths,Path *path) @@ -1036,6 +1041,13 @@ accumulate_append_subpath(List *subpaths, Path *path) /* list_copy is important here to avoidsharing list substructure */ return list_concat(subpaths, list_copy(apath->subpaths)); } + else if (IsA(path, MergeAppendPath)) + { + MergeAppendPath *mpath = (MergeAppendPath *) path; + + /* list_copy is important here to avoid sharing list substructure */ + return list_concat(subpaths, list_copy(mpath->subpaths)); + } else return lappend(subpaths, path);} diff --git a/src/backend/optimizer/path/equivclass.c b/src/backend/optimizer/path/equivclass.c index 35d2a83..ac12f84 100644 --- a/src/backend/optimizer/path/equivclass.c +++ b/src/backend/optimizer/path/equivclass.c @@ -1937,16 +1937,20 @@ add_child_rel_equivalences(PlannerInfo *root, if (cur_ec->ec_has_volatile) continue; - /* No point in searching if parent rel not mentioned in eclass */ - if (!bms_is_subset(parent_rel->relids, cur_ec->ec_relids)) + /* + * No point in searching if parent rel not mentioned in eclass; but + * we can't tell that for sure if parent rel is itself a child. + */ + if (parent_rel->reloptkind == RELOPT_BASEREL && + !bms_is_subset(parent_rel->relids, cur_ec->ec_relids)) continue; foreach(lc2, cur_ec->ec_members) { EquivalenceMember *cur_em = (EquivalenceMember *) lfirst(lc2); - if (cur_em->em_is_const || cur_em->em_is_child) - continue; /* ignore consts and children here */ + if (cur_em->em_is_const) + continue; /* ignore consts here */ /* Does it reference parent_rel? */ if (bms_overlap(cur_em->em_relids,parent_rel->relids)) diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c index 184d37a..784805f 100644 --- a/src/backend/optimizer/plan/createplan.c +++ b/src/backend/optimizer/plan/createplan.c @@ -751,7 +751,7 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path) /* Compute sort column info,and adjust MergeAppend's tlist as needed */ (void) prepare_sort_from_pathkeys(root, plan, pathkeys, - NULL, + best_path->path.parent->relids, NULL, true, &node->numCols, diff --git a/src/test/regress/expected/union.out b/src/test/regress/expected/union.out index 6f9ee5e..68643d8 100644 --- a/src/test/regress/expected/union.out +++ b/src/test/regress/expected/union.out @@ -502,9 +502,56 @@ explain (costs off) Index Cond: (ab = 'ab'::text)(7 rows) +-- +-- Test that ORDER BY for UNION ALL can be pushed down on inheritance +-- tables. +-- +CREATE TEMP TABLE t1c (b text, a text); +ALTER TABLE t1c INHERIT t1; +CREATE TEMP TABLE t2c (primary key (ab)) INHERITS (t2); +INSERT INTO t1c VALUES ('v', 'w'), ('c', 'd'), ('m', 'n'), ('e', 'f'); +INSERT INTO t2c VALUES ('vw'), ('cd'), ('mn'), ('ef'); +CREATE INDEX t1c_ab_idx on t1c ((a || b)); +set enable_seqscan = on; +set enable_indexonlyscan = off; +explain (costs off) + SELECT * FROM + (SELECT a || b AS ab FROM t1 + UNION ALL + SELECT ab FROM t2) t + ORDER BY 1 LIMIT 8; + QUERY PLAN +------------------------------------------------ + Limit + -> Merge Append + Sort Key: ((t1.a || t1.b)) + -> Index Scan using t1_ab_idx on t1 + -> Index Scan using t1c_ab_idx on t1c + -> Index Scan using t2_pkey on t2 + -> Index Scan using t2c_pkey on t2c +(7 rows) + + SELECT * FROM + (SELECT a || b AS ab FROM t1 + UNION ALL + SELECT ab FROM t2) t + ORDER BY 1 LIMIT 8; + ab +---- + ab + ab + cd + dc + ef + fe + mn + nm +(8 rows) +reset enable_seqscan;reset enable_indexscan;reset enable_bitmapscan; +reset enable_indexonlyscan;-- Test constraint exclusion of UNION ALL subqueriesexplain (costs off) SELECT * FROM diff --git a/src/test/regress/sql/union.sql b/src/test/regress/sql/union.sql index d567cf1..aa3ef7e 100644 --- a/src/test/regress/sql/union.sql +++ b/src/test/regress/sql/union.sql @@ -198,9 +198,37 @@ explain (costs off) SELECT * FROM t2) t WHERE ab = 'ab'; +-- +-- Test that ORDER BY for UNION ALL can be pushed down on inheritance +-- tables. +-- + +CREATE TEMP TABLE t1c (b text, a text); +ALTER TABLE t1c INHERIT t1; +CREATE TEMP TABLE t2c (primary key (ab)) INHERITS (t2); +INSERT INTO t1c VALUES ('v', 'w'), ('c', 'd'), ('m', 'n'), ('e', 'f'); +INSERT INTO t2c VALUES ('vw'), ('cd'), ('mn'), ('ef'); +CREATE INDEX t1c_ab_idx on t1c ((a || b)); +set enable_seqscan = on; +set enable_indexonlyscan = off; + +explain (costs off) + SELECT * FROM + (SELECT a || b AS ab FROM t1 + UNION ALL + SELECT ab FROM t2) t + ORDER BY 1 LIMIT 8; + + SELECT * FROM + (SELECT a || b AS ab FROM t1 + UNION ALL + SELECT ab FROM t2) t + ORDER BY 1 LIMIT 8; +reset enable_seqscan;reset enable_indexscan;reset enable_bitmapscan; +reset enable_indexonlyscan;-- Test constraint exclusion of UNION ALL subqueriesexplain (costs off)
Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> writes: > Hello. Attached is the 2nd version of 'pushdown in UNION ALL on > partitioned tables' patch type 1 - fix in equiv-member version. Committed, thanks. regards, tom lane
Thank you for committing. At Fri, 28 Mar 2014 11:50:56 -0400, Tom Lane <tgl@sss.pgh.pa.us> wrote in <21426.1396021856@sss.pgh.pa.us> tgl> Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> writes: tgl> > Hello. Attached is the 2nd version of 'pushdown in UNION ALL on tgl> > partitioned tables' patch type 1 - fix in equiv-member version. tgl> tgl> Committed, thanks. tgl> tgl> regards, tom lane regards, -- Kyotaro Horiguchi NTT Open Source Software Center