Re: BUG #17479: "plan should not reference subplan's variable" when calling `grouping` on result of subquery - Mailing list pgsql-bugs

From Richard Guo
Subject Re: BUG #17479: "plan should not reference subplan's variable" when calling `grouping` on result of subquery
Date
Msg-id CAMbWs48m-p46J7-mWn3xZwstbo9HCq9oEom1usbKYb7EeGnTog@mail.gmail.com
Whole thread Raw
In response to Re: BUG #17479: "plan should not reference subplan's variable" when calling `grouping` on result of subquery  (Richard Guo <guofenglinux@gmail.com>)
Responses Re: BUG #17479: "plan should not reference subplan's variable" when calling `grouping` on result of subquery
List pgsql-bugs

On Tue, May 10, 2022 at 6:07 PM Richard Guo <guofenglinux@gmail.com> wrote:

On Tue, May 10, 2022 at 2:12 PM PG Bug reporting form <noreply@postgresql.org> wrote:
The following bug has been logged on the website:

Bug reference:      17479
Logged by:          Michael J. Sullivan
Email address:      sully@msully.net
PostgreSQL version: 14.2
Operating system:   Linux
Description:       

The following query produces "plan should not reference subplan's
variable"

create table Card (id uuid);

SELECT
    -- This line causes "variable not found in subplan target list"
    -- grouping(res.cnt)
    -- This line causes "plan should not reference subplan's variable"
   (SELECT grouping(res.cnt))
FROM Card
CROSS JOIN LATERAL
(SELECT
    (SELECT Card.id) AS cnt
) AS res
GROUP BY
    res.cnt

As the comment says, a slight change instead errors with "variable not found
in subplan target list".

Reproduced this issue on HEAD:

# explain (verbose, costs off)
SELECT grouping(res.cnt) FROM Card CROSS JOIN LATERAL (SELECT (SELECT Card.id) AS cnt) AS res GROUP BY res.cnt;
ERROR:  variable not found in subplan target list

For this query, initially it has two TargetEntrys and both referencing
the RangeTblEntry of the subquery.

   {TARGETENTRY
   :expr
      {GROUPINGFUNC
      :args (
         {VAR
         :varno 2
         :varattno 1


AND

   {TARGETENTRY
   :expr
      {VAR
      :varno 2
      :varattno 1

More specifically, they are both referencing the first TargetEntry from
the subquery. And the first TargetEntry of the subquery is of type
EXPR SubLink. So after we pull up this subquery, the two TargetEntrys
become:

   {TARGETENTRY
   :expr
      {GROUPINGFUNC
      :args (
         {SUBLINK
         :subLinkType 4
         :subLinkId 0

AND

   {TARGETENTRY
   :expr
      {SUBLINK
      :subLinkType 4
      :subLinkId 0


Actually the two SubLink expressions are totally the same. But we did
not check that and proceeded to expand them to two SubPlans.

  {TARGETENTRY
   :expr
      {GROUPINGFUNC
      :args (
         {SUBPLAN
         :subLinkType 4
         :testexpr <>
         :paramIds <>
         :plan_id 1
         :plan_name SubPlan\ 1

AND

   {TARGETENTRY
   :expr
      {SUBPLAN
      :subLinkType 4
      :testexpr <>
      :paramIds <>
      :plan_id 2
      :plan_name SubPlan\ 2

The two SubPlans are assigned with different plan_ids/plan_names.
That's why when we fix up the GROUPINGFUNC target entry we failed to
match the whole SubPlan expression, i.e. we failed to match 'SubPlan 1'
against 'SubPlan 2'.

When we generate PathTarget for initial input to grouping nodes in
make_group_input_target(), for non-grouping columns we would pull out
all the Vars they mention with the help of pull_var_clause(), and add
them to the input target. But this ignores the Vars included inside
GroupingFunc node, even with flag PVC_RECURSE_AGGREGATES.

If we change that and include the Vars inside GroupingFunc node, we
would be able to fix up the GROUPINGFUNC target entry correctly by
walking into the GroupingFunc->args and matching the Vars there.

--- a/src/backend/optimizer/util/var.c
+++ b/src/backend/optimizer/util/var.c
@@ -661,13 +661,7 @@ pull_var_clause_walker(Node *node, pull_var_clause_context *context)
                }
                else if (context->flags & PVC_RECURSE_AGGREGATES)
                {
-                       /*
-                        * We do NOT descend into the contained expression, even if the
-                        * caller asked for it, because we never actually evaluate it -
-                        * the result is driven entirely off the associated GROUP BY
-                        * clause, so we never need to extract the actual Vars here.
-                        */
-                       return false;
+                       /* fall through to recurse into the GroupingFunc's arguments */
                }
                else
                        elog(ERROR, "GROUPING found where not expected");

Can we change that to fix this issue?

Thanks
Richard

pgsql-bugs by date:

Previous
From: Alvaro Herrera
Date:
Subject: Re: BUG #17480: Assertion failure in parse_relation.c
Next
From: Stefan Pastrilov
Date:
Subject: Re: BUG #17478: Missing documents in the index after CREATE INDEX CONCURRENTLY (but existing in the table)