Re: BUG #17978: Unexpected error: "wrong varnullingrels (b) (expected (b 5)) for Var 6/2" triggered by JOIN - Mailing list pgsql-bugs
From | Tom Lane |
---|---|
Subject | Re: BUG #17978: Unexpected error: "wrong varnullingrels (b) (expected (b 5)) for Var 6/2" triggered by JOIN |
Date | |
Msg-id | 2979923.1687190824@sss.pgh.pa.us Whole thread Raw |
In response to | Re: BUG #17978: Unexpected error: "wrong varnullingrels (b) (expected (b 5)) for Var 6/2" triggered by JOIN (Richard Guo <guofenglinux@gmail.com>) |
Responses |
Re: BUG #17978: Unexpected error: "wrong varnullingrels (b) (expected (b 5)) for Var 6/2" triggered by JOIN
|
List | pgsql-bugs |
Richard Guo <guofenglinux@gmail.com> writes: > FWIW, I found that we have an existing test query in sql/join.sql that > almost exposes this issue. > explain (costs off) > select * from int4_tbl t1 > left join ((select t2.f1 from int4_tbl t2 > left join int4_tbl t3 on t2.f1 > 0 > where t3.f1 is null) s > left join tenk1 t4 on s.f1 > 1) > on s.f1 = t1.f1; > If we change the WHERE clause to 't2.f1 != coalesce(t3.f1, 1)', we will > see this issue. Interesting that you should bring up that query, because after quick-hacking a fix in remove_useless_results(), all the existing test cases pass except that that one changes plan: it now does the joins in syntactic order, whereas HEAD finds a plan that joins t4 last. Joining t4 last has a noticeably better cost estimate than the syntactic-order plan, so it's kind of sad that we can't find it anymore. On the other hand, v15 and before don't find it either, so this wouldn't be a regression. According to inspection of the SpecialJoinInfos, HEAD believes that the t2/t3 outer join of that query could commute according to identity 3 with the outermost left join between t1 and t2. It doesn't act on that observation, and I'm not sure it's even correct to believe that. This patch doesn't believe that anymore, but it still believes that the t1/t2 join could commute with the t2/t4 join. Why it no longer acts on that is unclear. There is probably some buglet or missed opportunity somewhere. Even odder is that given your modified query in which the WHERE clause mentions both t2 and t3, it *does* find the plan that joins t4 last, as shown in the new test case. This makes me wonder if it's something triggered by the avoid-clauseless-joins heuristic, though I can't quite see how that'd apply. Anyway, what I'm inclined to do is flesh out the attached by updating the comments for remove_useless_results() and then push it. Later on we can look for why it's not finding the better join order; that's a separable issue, and if it is about avoid-clauseless-joins then we might choose to live with it rather than incur a lot of planner cost to fix it. regards, tom lane diff --git a/src/backend/optimizer/prep/prepjointree.c b/src/backend/optimizer/prep/prepjointree.c index 2f589b1b99..0091eaea23 100644 --- a/src/backend/optimizer/prep/prepjointree.c +++ b/src/backend/optimizer/prep/prepjointree.c @@ -3322,7 +3322,9 @@ remove_useless_results_recurse(PlannerInfo *root, Node *jtnode, */ j->larg = remove_useless_results_recurse(root, j->larg, (j->jointype == JOIN_INNER) ? - &j->quals : NULL, + &j->quals : + (j->jointype == JOIN_LEFT) ? + parent_quals : NULL, dropped_outer_joins); j->rarg = remove_useless_results_recurse(root, j->rarg, (j->jointype == JOIN_INNER || diff --git a/src/test/regress/expected/join.out b/src/test/regress/expected/join.out index cc4c122fdd..f7a34b76f8 100644 --- a/src/test/regress/expected/join.out +++ b/src/test/regress/expected/join.out @@ -2531,19 +2531,19 @@ select * from int4_tbl t1 on s.f1 = t1.f1; QUERY PLAN ------------------------------------------------- - Nested Loop Left Join - Join Filter: (t2.f1 > 1) - -> Hash Right Join - Hash Cond: (t2.f1 = t1.f1) + Hash Right Join + Hash Cond: (t2.f1 = t1.f1) + -> Nested Loop Left Join + Join Filter: (t2.f1 > 1) -> Nested Loop Left Join Join Filter: (t2.f1 > 0) Filter: (t3.f1 IS NULL) -> Seq Scan on int4_tbl t2 -> Materialize -> Seq Scan on int4_tbl t3 - -> Hash - -> Seq Scan on int4_tbl t1 - -> Seq Scan on tenk1 t4 + -> Seq Scan on tenk1 t4 + -> Hash + -> Seq Scan on int4_tbl t1 (13 rows) explain (costs off) @@ -2628,6 +2628,31 @@ select * from onek t1 Filter: (two = t2.two) (11 rows) +explain (costs off) +select * from int4_tbl t1 + left join ((select t2.f1 from int4_tbl t2 + left join int4_tbl t3 on t2.f1 > 0 + where t2.f1 <> coalesce(t3.f1, -1)) s + left join tenk1 t4 on s.f1 > 1) + on s.f1 = t1.f1; + QUERY PLAN +----------------------------------------------------------------- + Nested Loop Left Join + Join Filter: (t2.f1 > 1) + -> Hash Right Join + Hash Cond: (t2.f1 = t1.f1) + -> Nested Loop Left Join + Join Filter: (t2.f1 > 0) + Filter: (t2.f1 <> COALESCE(t3.f1, '-1'::integer)) + -> Seq Scan on int4_tbl t2 + -> Materialize + -> Seq Scan on int4_tbl t3 + -> Hash + -> Seq Scan on int4_tbl t1 + -> Materialize + -> Seq Scan on tenk1 t4 +(14 rows) + -- -- check a case where we formerly got confused by conflicting sort orders -- in redundant merge join path keys diff --git a/src/test/regress/sql/join.sql b/src/test/regress/sql/join.sql index e77e469570..3683a562c2 100644 --- a/src/test/regress/sql/join.sql +++ b/src/test/regress/sql/join.sql @@ -528,6 +528,14 @@ select * from onek t1 (select * from onek t3 where t3.two = t2.two offset 0) s on t2.unique1 = 1; +explain (costs off) +select * from int4_tbl t1 + left join ((select t2.f1 from int4_tbl t2 + left join int4_tbl t3 on t2.f1 > 0 + where t2.f1 <> coalesce(t3.f1, -1)) s + left join tenk1 t4 on s.f1 > 1) + on s.f1 = t1.f1; + -- -- check a case where we formerly got confused by conflicting sort orders -- in redundant merge join path keys
pgsql-bugs by date: