Re: BUG #15795: ERROR: could not find pathkey item to sort - Mailing list pgsql-bugs
From | Tom Lane |
---|---|
Subject | Re: BUG #15795: ERROR: could not find pathkey item to sort |
Date | |
Msg-id | 26224.1557429860@sss.pgh.pa.us Whole thread Raw |
In response to | Re: BUG #15795: ERROR: could not find pathkey item to sort (Tom Lane <tgl@sss.pgh.pa.us>) |
List | pgsql-bugs |
I wrote: > Now, how is that happening given that convert_subquery_pathkeys > is specifically checking that the column is emitted by the subquery? > The reason is that *it's checking the wrong thing*. It's looking > at what the native output of the subquery is, not at what the > SubqueryScan that we're going to stack atop it will produce. > ... > I haven't decided what to do about this. The minimally invasive > fix would be to teach convert_subquery_pathkeys that it can't emit > anything not listed in rel->reltarget. That seems like it might > lose useful information, though perhaps the consequences are minimal > given how hard it is to hit this bug. I concluded that that should be a reasonable fix. The outer query doesn't really have any use for sort keys that are not relevant to either mergejoins or sorting/grouping, and in either of those cases the sort keys would have been seen to be needed above the scan level so they'd be included in the reltarget list. Hence, attached is a draft patch for that part against v10. (I started there since we don't have a test case to show this bug in HEAD.) I realized that the existing tests for !resjunk are just a half-baked version of "is the column available in the outer query", so I folded those into the improved checks. Notice that with this in place, we don't get the funny extra sort key in the MergeAppend, since convert_subquery_pathkeys doesn't try to create that pathkey. That's good for plan quality I guess but it means that we have no live test case for the bug in create_merge_append_plan --- which is still a bug nonetheless. Anyway, I plan to go ahead with applying the combination of these fixes. If we find better test cases we can add them later, but right now I'm not that hopeful about that. regards, tom lane diff --git a/src/backend/optimizer/path/pathkeys.c b/src/backend/optimizer/path/pathkeys.c index 66c13a7..68431f1 100644 --- a/src/backend/optimizer/path/pathkeys.c +++ b/src/backend/optimizer/path/pathkeys.c @@ -29,6 +29,7 @@ static bool pathkey_is_redundant(PathKey *new_pathkey, List *pathkeys); +static Var *find_var_for_subquery_tle(RelOptInfo *rel, TargetEntry *tle); static bool right_merge_direction(PlannerInfo *root, PathKey *pathkey); @@ -599,9 +600,11 @@ build_expression_pathkey(PlannerInfo *root, * 'subquery_pathkeys': the subquery's output pathkeys, in its terms. * 'subquery_tlist': the subquery's output targetlist, in its terms. * - * It is not necessary for caller to do truncate_useless_pathkeys(), - * because we select keys in a way that takes usefulness of the keys into - * account. + * We intentionally don't do truncate_useless_pathkeys() here, because there + * are situations where seeing the raw ordering of the subquery is helpful. + * For example, if it returns ORDER BY x DESC, that may prompt us to + * construct a mergejoin using DESC order rather than ASC order; but the + * right_merge_direction heuristic would have us throw the knowledge away. */ List * convert_subquery_pathkeys(PlannerInfo *root, RelOptInfo *rel, @@ -627,22 +630,22 @@ convert_subquery_pathkeys(PlannerInfo *root, RelOptInfo *rel, * that same targetlist entry. */ TargetEntry *tle; + Var *outer_var; if (sub_eclass->ec_sortref == 0) /* can't happen */ elog(ERROR, "volatile EquivalenceClass has no sortref"); tle = get_sortgroupref_tle(sub_eclass->ec_sortref, subquery_tlist); Assert(tle); - /* resjunk items aren't visible to outer query */ - if (!tle->resjunk) + /* Is TLE actually available to the outer query? */ + outer_var = find_var_for_subquery_tle(rel, tle); + if (outer_var) { /* We can represent this sub_pathkey */ EquivalenceMember *sub_member; - Expr *outer_expr; EquivalenceClass *outer_ec; Assert(list_length(sub_eclass->ec_members) == 1); sub_member = (EquivalenceMember *) linitial(sub_eclass->ec_members); - outer_expr = (Expr *) makeVarFromTargetEntry(rel->relid, tle); /* * Note: it might look funny to be setting sortref = 0 for a @@ -656,7 +659,7 @@ convert_subquery_pathkeys(PlannerInfo *root, RelOptInfo *rel, */ outer_ec = get_eclass_for_sort_expr(root, - outer_expr, + (Expr *) outer_var, NULL, sub_eclass->ec_opfamilies, sub_member->em_datatype, @@ -713,14 +716,15 @@ convert_subquery_pathkeys(PlannerInfo *root, RelOptInfo *rel, foreach(k, subquery_tlist) { TargetEntry *tle = (TargetEntry *) lfirst(k); + Var *outer_var; Expr *tle_expr; - Expr *outer_expr; EquivalenceClass *outer_ec; PathKey *outer_pk; int score; - /* resjunk items aren't visible to outer query */ - if (tle->resjunk) + /* Is TLE actually available to the outer query? */ + outer_var = find_var_for_subquery_tle(rel, tle); + if (!outer_var) continue; /* @@ -735,16 +739,9 @@ convert_subquery_pathkeys(PlannerInfo *root, RelOptInfo *rel, if (!equal(tle_expr, sub_expr)) continue; - /* - * Build a representation of this targetlist entry as an - * outer Var. - */ - outer_expr = (Expr *) makeVarFromTargetEntry(rel->relid, - tle); - - /* See if we have a matching EC for that */ + /* See if we have a matching EC for the TLE */ outer_ec = get_eclass_for_sort_expr(root, - outer_expr, + (Expr *) outer_var, NULL, sub_eclass->ec_opfamilies, sub_expr_type, @@ -802,6 +799,41 @@ convert_subquery_pathkeys(PlannerInfo *root, RelOptInfo *rel, } /* + * find_var_for_subquery_tle + * + * If the given subquery tlist entry is due to be emitted by the subquery's + * scan node, return a Var for it, else return NULL. + * + * We need this to ensure that we don't return pathkeys describing values + * that are unavailable above the level of the subquery scan. + */ +static Var * +find_var_for_subquery_tle(RelOptInfo *rel, TargetEntry *tle) +{ + ListCell *lc; + + /* If the TLE is resjunk, it's certainly not visible to the outer query */ + if (tle->resjunk) + return NULL; + + /* Search the rel's targetlist to see what it will return */ + foreach(lc, rel->reltarget->exprs) + { + Var *var = (Var *) lfirst(lc); + + /* Ignore placeholders */ + if (!IsA(var, Var)) + continue; + Assert(var->varno == rel->relid); + + /* If we find a Var referencing this TLE, we're good */ + if (var->varattno == tle->resno) + return copyObject(var); /* Make a copy for safety */ + } + return NULL; +} + +/* * build_join_pathkeys * Build the path keys for a join relation constructed by mergejoin or * nestloop join. This is normally the same as the outer path's keys. diff --git a/src/test/regress/expected/union.out b/src/test/regress/expected/union.out index 92d427a..48c27f8 100644 --- a/src/test/regress/expected/union.out +++ b/src/test/regress/expected/union.out @@ -916,6 +916,79 @@ ORDER BY x; 2 | 4 (1 row) +-- Test cases where the native ordering of a sub-select has more pathkeys +-- than the outer query cares about +explain (costs off) +select distinct q1 from + (select distinct * from int8_tbl i81 + union all + select distinct * from int8_tbl i82) ss +where q2 = q2; + QUERY PLAN +-------------------------------------------------------- + Unique + -> Merge Append + Sort Key: "*SELECT* 1".q1 + -> Subquery Scan on "*SELECT* 1" + -> Unique + -> Sort + Sort Key: i81.q1, i81.q2 + -> Seq Scan on int8_tbl i81 + Filter: (q2 = q2) + -> Subquery Scan on "*SELECT* 2" + -> Unique + -> Sort + Sort Key: i82.q1, i82.q2 + -> Seq Scan on int8_tbl i82 + Filter: (q2 = q2) +(15 rows) + +select distinct q1 from + (select distinct * from int8_tbl i81 + union all + select distinct * from int8_tbl i82) ss +where q2 = q2; + q1 +------------------ + 123 + 4567890123456789 +(2 rows) + +explain (costs off) +select distinct q1 from + (select distinct * from int8_tbl i81 + union all + select distinct * from int8_tbl i82) ss +where -q1 = q2; + QUERY PLAN +-------------------------------------------------------- + Unique + -> Merge Append + Sort Key: "*SELECT* 1".q1 + -> Subquery Scan on "*SELECT* 1" + -> Unique + -> Sort + Sort Key: i81.q1, i81.q2 + -> Seq Scan on int8_tbl i81 + Filter: ((- q1) = q2) + -> Subquery Scan on "*SELECT* 2" + -> Unique + -> Sort + Sort Key: i82.q1, i82.q2 + -> Seq Scan on int8_tbl i82 + Filter: ((- q1) = q2) +(15 rows) + +select distinct q1 from + (select distinct * from int8_tbl i81 + union all + select distinct * from int8_tbl i82) ss +where -q1 = q2; + q1 +------------------ + 4567890123456789 +(1 row) + -- Test proper handling of parameterized appendrel paths when the -- potential join qual is expensive create function expensivefunc(int) returns int diff --git a/src/test/regress/sql/union.sql b/src/test/regress/sql/union.sql index eed7c8d..5f4881d 100644 --- a/src/test/regress/sql/union.sql +++ b/src/test/regress/sql/union.sql @@ -379,6 +379,34 @@ SELECT * FROM WHERE x > 3 ORDER BY x; +-- Test cases where the native ordering of a sub-select has more pathkeys +-- than the outer query cares about +explain (costs off) +select distinct q1 from + (select distinct * from int8_tbl i81 + union all + select distinct * from int8_tbl i82) ss +where q2 = q2; + +select distinct q1 from + (select distinct * from int8_tbl i81 + union all + select distinct * from int8_tbl i82) ss +where q2 = q2; + +explain (costs off) +select distinct q1 from + (select distinct * from int8_tbl i81 + union all + select distinct * from int8_tbl i82) ss +where -q1 = q2; + +select distinct q1 from + (select distinct * from int8_tbl i81 + union all + select distinct * from int8_tbl i82) ss +where -q1 = q2; + -- Test proper handling of parameterized appendrel paths when the -- potential join qual is expensive create function expensivefunc(int) returns int
pgsql-bugs by date: