Re: Wrong rows estimations with joins of CTEs slows queries by more than factor 500 - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Wrong rows estimations with joins of CTEs slows queries by more than factor 500
Date
Msg-id 891440.1699483495@sss.pgh.pa.us
Whole thread Raw
In response to Re: Wrong rows estimations with joins of CTEs slows queries by more than factor 500  (Jian Guo <gjian@vmware.com>)
Responses Re: Wrong rows estimations with joins of CTEs slows queries by more than factor 500
List pgsql-hackers
Jian Guo <gjian@vmware.com> writes:
> I found a new approach to fix this issue, which seems better, so I would like to post another version of the patch
here.The origin patch made the assumption of the values of Vars from CTE must be unique, which could be very wrong.
Thispatch examines variables for Vars inside CTE, which avoided the bad assumption, so the results could be much more
accurate.

You have the right general idea, but there is nothing about this patch
that's right in detail.  The outer Var doesn't refer to any particular
RTE within the subquery; it refers to a targetlist entry.  You have to
drill down to that, see if it's a Var, and if so you can recurse into
the subroot with that Var.  As this stands, it might accidentally get
the right answer for "SELECT * FROM foo" subqueries, but it will get
the wrong answer or even crash for anything that's not that.

The existing RTE_SUBQUERY stanza has most of what we need for this,
so I experimented with extending that to also handle RTE_CTE.  It
seems to work, though I soon found out that it needed tweaking for
the case where the CTE is INSERT/UPDATE/DELETE RETURNING.

Interestingly, this does not change any existing regression test
results.  I'd supposed there might be at least one place with a
visible plan change, but nope.  Running a coverage test does show
that the new code paths are exercised, but I wonder if we ought
to try to devise a regression test that proves it more directly.

            regards, tom lane

PS: please, please, please do not quote the entire damn thread
when replying.  Trim it to just a minimum amount of relevant
text.  You think people want to read all that again?

diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index c4fcd0076e..196f50b241 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -5493,13 +5493,17 @@ examine_simple_variable(PlannerInfo *root, Var *var,
             vardata->acl_ok = true;
         }
     }
-    else if (rte->rtekind == RTE_SUBQUERY && !rte->inh)
+    else if ((rte->rtekind == RTE_SUBQUERY && !rte->inh) ||
+             (rte->rtekind == RTE_CTE && !rte->self_reference))
     {
         /*
-         * Plain subquery (not one that was converted to an appendrel).
+         * Plain subquery (not one that was converted to an appendrel) or
+         * non-recursive CTE.  In either case, we can try to find out what the
+         * Var refers to within the subquery.
          */
-        Query       *subquery = rte->subquery;
-        RelOptInfo *rel;
+        Query       *subquery;
+        PlannerInfo *subroot;
+        List       *subtlist;
         TargetEntry *ste;

         /*
@@ -5508,6 +5512,80 @@ examine_simple_variable(PlannerInfo *root, Var *var,
         if (var->varattno == InvalidAttrNumber)
             return;

+        /*
+         * Otherwise, find the subquery's query tree and planner subroot.
+         */
+        if (rte->rtekind == RTE_SUBQUERY)
+        {
+            RelOptInfo *rel;
+
+            /*
+             * Fetch RelOptInfo for subquery.  Note that we don't change the
+             * rel returned in vardata, since caller expects it to be a rel of
+             * the caller's query level.  Because we might already be
+             * recursing, we can't use that rel pointer either, but have to
+             * look up the Var's rel afresh.
+             */
+            rel = find_base_rel(root, var->varno);
+
+            subquery = rte->subquery;
+            subroot = rel->subroot;
+        }
+        else
+        {
+            /* CTE case is more difficult */
+            PlannerInfo *cteroot;
+            Index        levelsup;
+            int            ndx;
+            int            plan_id;
+            ListCell   *lc;
+
+            /*
+             * Find the referenced CTE, and locate the subroot previously made
+             * for it.
+             */
+            levelsup = rte->ctelevelsup;
+            cteroot = root;
+            while (levelsup-- > 0)
+            {
+                cteroot = cteroot->parent_root;
+                if (!cteroot)    /* shouldn't happen */
+                    elog(ERROR, "bad levelsup for CTE \"%s\"", rte->ctename);
+            }
+
+            /*
+             * Note: cte_plan_ids can be shorter than cteList, if we are still
+             * working on planning the CTEs (ie, this is a side-reference from
+             * another CTE).  So we mustn't use forboth here.
+             */
+            ndx = 0;
+            subquery = NULL;
+            foreach(lc, cteroot->parse->cteList)
+            {
+                CommonTableExpr *cte = (CommonTableExpr *) lfirst(lc);
+
+                if (strcmp(cte->ctename, rte->ctename) == 0)
+                {
+                    subquery = castNode(Query, cte->ctequery);
+                    break;
+                }
+                ndx++;
+            }
+            if (subquery == NULL)    /* shouldn't happen */
+                elog(ERROR, "could not find CTE \"%s\"", rte->ctename);
+            if (ndx >= list_length(cteroot->cte_plan_ids))
+                elog(ERROR, "could not find plan for CTE \"%s\"", rte->ctename);
+            plan_id = list_nth_int(cteroot->cte_plan_ids, ndx);
+            if (plan_id <= 0)
+                elog(ERROR, "no plan was made for CTE \"%s\"", rte->ctename);
+            subroot = list_nth(root->glob->subroots, plan_id - 1);
+        }
+
+        /* If the subquery hasn't been planned yet, we have to punt */
+        if (subroot == NULL)
+            return;
+        Assert(IsA(subroot, PlannerInfo));
+
         /*
          * Punt if subquery uses set operations or GROUP BY, as these will
          * mash underlying columns' stats beyond recognition.  (Set ops are
@@ -5521,20 +5599,6 @@ examine_simple_variable(PlannerInfo *root, Var *var,
             subquery->groupingSets)
             return;

-        /*
-         * OK, fetch RelOptInfo for subquery.  Note that we don't change the
-         * rel returned in vardata, since caller expects it to be a rel of the
-         * caller's query level.  Because we might already be recursing, we
-         * can't use that rel pointer either, but have to look up the Var's
-         * rel afresh.
-         */
-        rel = find_base_rel(root, var->varno);
-
-        /* If the subquery hasn't been planned yet, we have to punt */
-        if (rel->subroot == NULL)
-            return;
-        Assert(IsA(rel->subroot, PlannerInfo));
-
         /*
          * Switch our attention to the subquery as mangled by the planner. It
          * was okay to look at the pre-planning version for the tests above,
@@ -5543,11 +5607,15 @@ examine_simple_variable(PlannerInfo *root, Var *var,
          * planning, Vars in the targetlist might have gotten replaced, and we
          * need to see the replacement expressions.
          */
-        subquery = rel->subroot->parse;
+        subquery = subroot->parse;
         Assert(IsA(subquery, Query));

         /* Get the subquery output expression referenced by the upper Var */
-        ste = get_tle_by_resno(subquery->targetList, var->varattno);
+        if (subquery->returningList)
+            subtlist = subquery->returningList;
+        else
+            subtlist = subquery->targetList;
+        ste = get_tle_by_resno(subtlist, var->varattno);
         if (ste == NULL || ste->resjunk)
             elog(ERROR, "subquery %s does not have attribute %d",
                  rte->eref->aliasname, var->varattno);
@@ -5595,16 +5663,16 @@ examine_simple_variable(PlannerInfo *root, Var *var,
              * if the underlying column is unique, the subquery may have
              * joined to other tables in a way that creates duplicates.
              */
-            examine_simple_variable(rel->subroot, var, vardata);
+            examine_simple_variable(subroot, var, vardata);
         }
     }
     else
     {
         /*
-         * Otherwise, the Var comes from a FUNCTION, VALUES, or CTE RTE.  (We
-         * won't see RTE_JOIN here because join alias Vars have already been
+         * Otherwise, the Var comes from a FUNCTION or VALUES RTE.  (We won't
+         * see RTE_JOIN here because join alias Vars have already been
          * flattened.)    There's not much we can do with function outputs, but
-         * maybe someday try to be smarter about VALUES and/or CTEs.
+         * maybe someday try to be smarter about VALUES.
          */
     }
 }

pgsql-hackers by date:

Previous
From: Bruce Momjian
Date:
Subject: Re: Wrong sentence in the README?
Next
From: David Christensen
Date:
Subject: Re: Moving forward with TDE [PATCH v3]