Re: BUG #17978: Unexpected error: "wrong varnullingrels (b) (expected (b 5)) for Var 6/2" triggered by JOIN - Mailing list pgsql-bugs

From Tom Lane
Subject Re: BUG #17978: Unexpected error: "wrong varnullingrels (b) (expected (b 5)) for Var 6/2" triggered by JOIN
Date
Msg-id 2979923.1687190824@sss.pgh.pa.us
Whole thread Raw
In response to Re: BUG #17978: Unexpected error: "wrong varnullingrels (b) (expected (b 5)) for Var 6/2" triggered by JOIN  (Richard Guo <guofenglinux@gmail.com>)
Responses Re: BUG #17978: Unexpected error: "wrong varnullingrels (b) (expected (b 5)) for Var 6/2" triggered by JOIN
List pgsql-bugs
Richard Guo <guofenglinux@gmail.com> writes:
> FWIW, I found that we have an existing test query in sql/join.sql that
> almost exposes this issue.

> explain (costs off)
> select * from int4_tbl t1
>   left join ((select t2.f1 from int4_tbl t2
>                 left join int4_tbl t3 on t2.f1 > 0
>                 where t3.f1 is null) s
>              left join tenk1 t4 on s.f1 > 1)
>     on s.f1 = t1.f1;

> If we change the WHERE clause to 't2.f1 != coalesce(t3.f1, 1)', we will
> see this issue.

Interesting that you should bring up that query, because after
quick-hacking a fix in remove_useless_results(), all the existing
test cases pass except that that one changes plan: it now does the
joins in syntactic order, whereas HEAD finds a plan that joins t4
last.  Joining t4 last has a noticeably better cost estimate than
the syntactic-order plan, so it's kind of sad that we can't find
it anymore.  On the other hand, v15 and before don't find it either,
so this wouldn't be a regression.

According to inspection of the SpecialJoinInfos, HEAD believes
that the t2/t3 outer join of that query could commute according
to identity 3 with the outermost left join between t1 and t2.
It doesn't act on that observation, and I'm not sure it's even
correct to believe that.  This patch doesn't believe that anymore,
but it still believes that the t1/t2 join could commute with the
t2/t4 join.  Why it no longer acts on that is unclear.  There is
probably some buglet or missed opportunity somewhere.

Even odder is that given your modified query in which the WHERE
clause mentions both t2 and t3, it *does* find the plan that
joins t4 last, as shown in the new test case.  This makes me
wonder if it's something triggered by the avoid-clauseless-joins
heuristic, though I can't quite see how that'd apply.

Anyway, what I'm inclined to do is flesh out the attached by updating
the comments for remove_useless_results() and then push it.  Later
on we can look for why it's not finding the better join order; that's
a separable issue, and if it is about avoid-clauseless-joins then we
might choose to live with it rather than incur a lot of planner cost
to fix it.

            regards, tom lane

diff --git a/src/backend/optimizer/prep/prepjointree.c b/src/backend/optimizer/prep/prepjointree.c
index 2f589b1b99..0091eaea23 100644
--- a/src/backend/optimizer/prep/prepjointree.c
+++ b/src/backend/optimizer/prep/prepjointree.c
@@ -3322,7 +3322,9 @@ remove_useless_results_recurse(PlannerInfo *root, Node *jtnode,
          */
         j->larg = remove_useless_results_recurse(root, j->larg,
                                                  (j->jointype == JOIN_INNER) ?
-                                                 &j->quals : NULL,
+                                                 &j->quals :
+                                                 (j->jointype == JOIN_LEFT) ?
+                                                 parent_quals : NULL,
                                                  dropped_outer_joins);
         j->rarg = remove_useless_results_recurse(root, j->rarg,
                                                  (j->jointype == JOIN_INNER ||
diff --git a/src/test/regress/expected/join.out b/src/test/regress/expected/join.out
index cc4c122fdd..f7a34b76f8 100644
--- a/src/test/regress/expected/join.out
+++ b/src/test/regress/expected/join.out
@@ -2531,19 +2531,19 @@ select * from int4_tbl t1
     on s.f1 = t1.f1;
                    QUERY PLAN
 -------------------------------------------------
- Nested Loop Left Join
-   Join Filter: (t2.f1 > 1)
-   ->  Hash Right Join
-         Hash Cond: (t2.f1 = t1.f1)
+ Hash Right Join
+   Hash Cond: (t2.f1 = t1.f1)
+   ->  Nested Loop Left Join
+         Join Filter: (t2.f1 > 1)
          ->  Nested Loop Left Join
                Join Filter: (t2.f1 > 0)
                Filter: (t3.f1 IS NULL)
                ->  Seq Scan on int4_tbl t2
                ->  Materialize
                      ->  Seq Scan on int4_tbl t3
-         ->  Hash
-               ->  Seq Scan on int4_tbl t1
-   ->  Seq Scan on tenk1 t4
+         ->  Seq Scan on tenk1 t4
+   ->  Hash
+         ->  Seq Scan on int4_tbl t1
 (13 rows)

 explain (costs off)
@@ -2628,6 +2628,31 @@ select * from onek t1
                            Filter: (two = t2.two)
 (11 rows)

+explain (costs off)
+select * from int4_tbl t1
+  left join ((select t2.f1 from int4_tbl t2
+                left join int4_tbl t3 on t2.f1 > 0
+                where t2.f1 <> coalesce(t3.f1, -1)) s
+             left join tenk1 t4 on s.f1 > 1)
+    on s.f1 = t1.f1;
+                           QUERY PLAN
+-----------------------------------------------------------------
+ Nested Loop Left Join
+   Join Filter: (t2.f1 > 1)
+   ->  Hash Right Join
+         Hash Cond: (t2.f1 = t1.f1)
+         ->  Nested Loop Left Join
+               Join Filter: (t2.f1 > 0)
+               Filter: (t2.f1 <> COALESCE(t3.f1, '-1'::integer))
+               ->  Seq Scan on int4_tbl t2
+               ->  Materialize
+                     ->  Seq Scan on int4_tbl t3
+         ->  Hash
+               ->  Seq Scan on int4_tbl t1
+   ->  Materialize
+         ->  Seq Scan on tenk1 t4
+(14 rows)
+
 --
 -- check a case where we formerly got confused by conflicting sort orders
 -- in redundant merge join path keys
diff --git a/src/test/regress/sql/join.sql b/src/test/regress/sql/join.sql
index e77e469570..3683a562c2 100644
--- a/src/test/regress/sql/join.sql
+++ b/src/test/regress/sql/join.sql
@@ -528,6 +528,14 @@ select * from onek t1
       (select * from onek t3 where t3.two = t2.two offset 0) s
       on t2.unique1 = 1;

+explain (costs off)
+select * from int4_tbl t1
+  left join ((select t2.f1 from int4_tbl t2
+                left join int4_tbl t3 on t2.f1 > 0
+                where t2.f1 <> coalesce(t3.f1, -1)) s
+             left join tenk1 t4 on s.f1 > 1)
+    on s.f1 = t1.f1;
+
 --
 -- check a case where we formerly got confused by conflicting sort orders
 -- in redundant merge join path keys

pgsql-bugs by date:

Previous
From: Laurenz Albe
Date:
Subject: Re: BUG #17981: HY000 server closed the connection unexpectedly
Next
From: Tom Lane
Date:
Subject: Re: BUG #17982: Inconsistent results of SELECT with CTE caused by subquery comparison