Re: Making Vars outer-join aware - Mailing list pgsql-hackers

From Richard Guo
Subject Re: Making Vars outer-join aware
Date
Msg-id CAMbWs480igxj6hTbvD0WEnzT4XJ8hJXg71EVJO_BcQqN90MQuA@mail.gmail.com
Whole thread Raw
In response to Re: Making Vars outer-join aware  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Making Vars outer-join aware
List pgsql-hackers

On Tue, Aug 2, 2022 at 3:51 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Here's a rebase up to HEAD, mostly to placate the cfbot.
I accounted for d8e34fa7a (s/all_baserels/all_query_rels/
in those places) and made one tiny bug-fix change.
Nothing substantive as yet.

When we add required PlaceHolderVars to a join rel's targetlist, if the
PHV can be computed in the nullable-side of the join, we would add the
join's RT index to phnullingrels. This is correct as we know the PHV's
value can be nulled at this join. But I'm wondering if it is necessary
since we have already propagated any varnullingrels into the PHV when we
apply pullup variable replacement in perform_pullup_replace_vars().

On the other hand, the phnullingrels may contain RT indexes of outer
joins above this join level. It seems not good to add such a PHV to the
joinrel's targetlist. Below is an example:

# explain (verbose, costs off) select a.i, ss.jj from a left join (b left join (select c.i, coalesce(c.j, 1) as jj from c) ss on b.i = ss.i) on true;
                       QUERY PLAN
---------------------------------------------------------
 Nested Loop Left Join
   Output: a.i, (COALESCE(c.j, 1))
   ->  Seq Scan on public.a
         Output: a.i, a.j
   ->  Materialize
         Output: (COALESCE(c.j, 1))
         ->  Hash Left Join
               Output: (COALESCE(c.j, 1))
               Hash Cond: (b.i = c.i)
               ->  Seq Scan on public.b
                     Output: b.i, b.j
               ->  Hash
                     Output: c.i, (COALESCE(c.j, 1))
                     ->  Seq Scan on public.c
                           Output: c.i, COALESCE(c.j, 1)
(15 rows)

In this query, for the joinrel {B, C}, the PHV in its targetlist has a
phnullingrels that contains the join of {A} and {BC}, which is confusing
because we have not reached that join level.

I tried the changes below to illustrate the two issues. The assertion
holds true during regression tests and the error pops up for the query
above.

--- a/src/backend/optimizer/util/placeholder.c
+++ b/src/backend/optimizer/util/placeholder.c
@@ -464,18 +464,20 @@ add_placeholders_to_joinrel(PlannerInfo *root, RelOptInfo *joinrel,
                       {
                               if (sjinfo->jointype == JOIN_FULL && sjinfo->ojrelid != 0)
                               {
-                                      /* PHV's value can be nulled at this join */
-                                      phv->phnullingrels = bms_add_member(phv->phnullingrels,
-                                                                                                              sjinfo->ojrelid);
+                                      Assert(bms_is_member(sjinfo->ojrelid, phv->phnullingrels));
+
+                                      if (!bms_is_subset(phv->phnullingrels, joinrel->relids))
+                                              elog(ERROR, "phnullingrels is not subset of joinrel's relids");
                               }
                       }
                       else if (bms_is_subset(phinfo->ph_eval_at, inner_rel->relids))
                       {
                               if (sjinfo->jointype != JOIN_INNER && sjinfo->ojrelid != 0)
                               {
-                                      /* PHV's value can be nulled at this join */
-                                      phv->phnullingrels = bms_add_member(phv->phnullingrels,
-                                                                                                              sjinfo->ojrelid);
+                                      Assert(bms_is_member(sjinfo->ojrelid, phv->phnullingrels));
+
+                                      if (!bms_is_subset(phv->phnullingrels, joinrel->relids))
+                                              elog(ERROR, "phnullingrels is not subset of joinrel's relids");
                               }
                       }


If the two issues are indeed something we need to fix, maybe we can
change add_placeholders_to_joinrel() to search the PHVs from
outer_rel/inner_rel's targetlist, and add the ojrelid to phnullingrels
if needed, just like what we do in build_joinrel_tlist(). The PHVs there
should have the correct phnullingrels (at least the PHVs in base rels'
targetlists have correctly set phnullingrels to NULL).

Thanks
Richard

pgsql-hackers by date:

Previous
From: Peter Eisentraut
Date:
Subject: Remove remaining mentions of UNSAFE_STAT_OK
Next
From: Thomas Munro
Date:
Subject: Re: Cleaning up historical portability baggage