Thread: BUG #17709: Regression in PG15 with window functions - "WindowFunc not found in subplan target lists"

The following bug has been logged on the website:

Bug reference:      17709
Logged by:          Alexey Makhmutov
Email address:      makhmutov@gmail.com
PostgreSQL version: 15.1
Operating system:   Ubuntu 20.04
Description:

The following query works fine on PG14, but produces the error "WindowFunc not found
in subplan target lists" on PG15:

select 1
from
 (
  select count(case t1.a when 1 then 1 else null end) over (partition by t2.b) c
  from  (select 1 a) t1, (select 1 b) t2
 ) t
where t.c = 1

On PG14 this query produces the expected result (1), but on PG15.1 it produces
the following error: "ERROR:  WindowFunc not found in subplan target lists".
As of 8 December 2022, this problem can be reproduced on the latest
REL_15_STABLE and HEAD (16dev) builds.

This seems to be a result of the functionality introduced in PG15 by this commit:
https://github.com/postgres/postgres/commit/9d9c02ccd1aea8e9131d8f4edb21bf1687e40782



On Fri, Dec 9, 2022 at 5:42 PM PG Bug reporting form <noreply@postgresql.org> wrote:
> The following query works fine on PG14, but produces the error "WindowFunc not found
> in subplan target lists" on PG15:
>
> select 1
> from
>  (
>   select count(case t1.a when 1 then 1 else null end) over (partition by t2.b) c
>   from  (select 1 a) t1, (select 1 b) t2
>  ) t
> where t.c = 1

Thanks for the report!  I can reproduce this issue.

The WindowFunc within runCondition comes from the query's targetList,
before we pull up subquery 't1'.  Then when it comes to pulling up
subquery 't1', we perform pullup variable replacement for the query's
targetList but not for runCondition in the query's windowClause.  I
believe that's how this error is triggered.
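
To make the mechanism concrete, here is a toy sketch in plain C. It is
not planner code: the Expr type and the expr_equal(), toy_replace_vars(),
toy_pullup_matches() and toy_pullup_demo() helpers are hypothetical
stand-ins invented for illustration. It only models the idea that the
WindowFunc in the run condition is a separate copy of the one in the
target list, so rewriting just one of them leaves the two unequal. A
simplified window function, count(t1.a), is assumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Toy expression node; a stand-in for PostgreSQL's Node trees, not the real thing. */
typedef enum { EXPR_VAR, EXPR_CONST, EXPR_WINFUNC } ExprKind;

typedef struct Expr
{
    ExprKind     kind;
    int          value;  /* EXPR_VAR: column number; EXPR_CONST: constant value */
    struct Expr *arg;    /* EXPR_WINFUNC: its single argument, otherwise NULL */
} Expr;

/* Structural comparison, playing the role that equal() plays in the planner. */
bool
expr_equal(const Expr *a, const Expr *b)
{
    if (a == NULL || b == NULL)
        return a == b;
    return a->kind == b->kind && a->value == b->value &&
        expr_equal(a->arg, b->arg);
}

/* Replace every Var in the chain with the constant the pulled-up subquery emits. */
void
toy_replace_vars(Expr *e, int subquery_output)
{
    for (; e != NULL; e = e->arg)
    {
        if (e->kind == EXPR_VAR)
        {
            e->kind = EXPR_CONST;
            e->value = subquery_output;
        }
    }
}

/*
 * Returns true if the run condition's WindowFunc still matches the one in the
 * target list after pull-up.  fix_run_condition selects whether the run
 * condition's copy is rewritten as well.
 */
bool
toy_pullup_matches(bool fix_run_condition)
{
    /* count(t1.a) exists twice: once in the target list, once in runCondition */
    Expr tlist_arg = {EXPR_VAR, 1, NULL};
    Expr tlist_wfunc = {EXPR_WINFUNC, 0, &tlist_arg};
    Expr cond_arg = {EXPR_VAR, 1, NULL};
    Expr cond_wfunc = {EXPR_WINFUNC, 0, &cond_arg};

    toy_replace_vars(&tlist_wfunc, 1);      /* always done for the target list */
    if (fix_run_condition)
        toy_replace_vars(&cond_wfunc, 1);   /* the step that is currently missing */

    return expr_equal(&tlist_wfunc, &cond_wfunc);
}

/* Example driver: call this from main() to check both outcomes. */
void
toy_pullup_demo(void)
{
    assert(!toy_pullup_matches(false));  /* run condition left stale: no match */
    assert(toy_pullup_matches(true));    /* both copies rewritten: match */
    puts("run condition matches the target list only when both are rewritten");
}
```

Calling toy_pullup_demo() from a small main() exercises both cases; the
mismatch in the first case corresponds to the failed lookup behind the
reported error.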

Below is how we can fix this issue.

--- a/src/backend/optimizer/prep/prepjointree.c
+++ b/src/backend/optimizer/prep/prepjointree.c
@@ -2134,6 +2134,16 @@ perform_pullup_replace_vars(PlannerInfo *root,
         * can't contain any references to a subquery.
         */
    }
+   if (parse->windowClause)
+   {
+       foreach(lc, parse->windowClause)
+       {
+           WindowClause *wclause = (WindowClause *) lfirst(lc);
+
+           wclause->runCondition = (List *)
+               pullup_replace_vars((Node *) wclause->runCondition, rvcontext);
+       }
+   }
    if (parse->mergeActionList)
    {
        foreach(lc, parse->mergeActionList)

Thanks
Richard

On Sat, 10 Dec 2022 at 00:00, Richard Guo <guofenglinux@gmail.com> wrote:
> Thanks for the report!  I can reproduce this issue.
>
> The WindowFunc within runCondition comes from the query's targetList,
> before we pull up subquery 't1'.  Then when it comes to pulling up
> subquery 't1', we perform pullup variable replacement for the query's
> targetList but not for runCondition in the query's windowClause.  I
> believe that's how this error is triggered.
>
> Below is how we can fix this issue.
>
> --- a/src/backend/optimizer/prep/prepjointree.c
> +++ b/src/backend/optimizer/prep/prepjointree.c
> @@ -2134,6 +2134,16 @@ perform_pullup_replace_vars(PlannerInfo *root,
>          * can't contain any references to a subquery.
>          */
>     }
> +   if (parse->windowClause)
> +   {
> +       foreach(lc, parse->windowClause)
> +       {
> +           WindowClause *wclause = (WindowClause *) lfirst(lc);
> +
> +           wclause->runCondition = (List *)
> +               pullup_replace_vars((Node *) wclause->runCondition, rvcontext);
> +       }
> +   }

Thanks for having a look at this.

I think what you have above fixes the bulk of the issue, but there's
still a bit more to do to properly fix the reported case.

The additional thing that seems to cause the reported error is that
once the subquery is pulled up, the run condition also needs a round
of constant folding done. See subquery_planner() around line 827.  The
problem is that the target list's WindowFunc ends up with count(1)
over .., but the run condition's one is left as count(case 1 when 1
then 1 else null end), which preprocess_expression() will fold into
the same as what's in the target list.
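
As a rough illustration of why the folding matters, here is a toy sketch
in plain C. It is not planner code: FoldExpr, fold_expr_equal(),
toy_fold(), toy_fold_matches() and toy_fold_demo() are hypothetical names
made up for this example, and the CASE node is reduced to a single
"test = when_value" arm. It only shows that two copies of
count(case 1 when 1 then 1 else null end) stop comparing equal once one
of them has been folded to count(1) and the other has not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Toy expression node; a simplified stand-in, not PostgreSQL's Node. */
typedef enum { FOLD_CONST, FOLD_NULL, FOLD_CASE, FOLD_WINFUNC } FoldKind;

typedef struct FoldExpr
{
    FoldKind         kind;
    int              value;       /* FOLD_CONST: the constant */
    struct FoldExpr *arg;         /* FOLD_WINFUNC: argument; FOLD_CASE: test expr */
    int              when_value;  /* FOLD_CASE: value the test is compared with */
    int              then_value;  /* FOLD_CASE: result on a match (else NULL) */
} FoldExpr;

/* Structural comparison over every field of the toy node. */
bool
fold_expr_equal(const FoldExpr *a, const FoldExpr *b)
{
    if (a == NULL || b == NULL)
        return a == b;
    return a->kind == b->kind && a->value == b->value &&
        a->when_value == b->when_value && a->then_value == b->then_value &&
        fold_expr_equal(a->arg, b->arg);
}

/* Fold a CASE whose test is a constant into its result, children first. */
void
toy_fold(FoldExpr *e)
{
    if (e == NULL)
        return;
    toy_fold(e->arg);
    if (e->kind == FOLD_CASE && e->arg != NULL && e->arg->kind == FOLD_CONST)
    {
        bool matched = (e->arg->value == e->when_value);

        e->kind = matched ? FOLD_CONST : FOLD_NULL;
        e->value = matched ? e->then_value : 0;
        e->arg = NULL;
        e->when_value = 0;
        e->then_value = 0;
    }
}

/*
 * Builds two copies of count(case 1 when 1 then 1 else null end), folds the
 * target list copy, optionally folds the run condition copy, and reports
 * whether the two still compare equal.
 */
bool
toy_fold_matches(bool fold_run_condition)
{
    FoldExpr t_test = {FOLD_CONST, 1, NULL, 0, 0};
    FoldExpr t_case = {FOLD_CASE, 0, &t_test, 1, 1};
    FoldExpr t_wfunc = {FOLD_WINFUNC, 0, &t_case, 0, 0};
    FoldExpr c_test = {FOLD_CONST, 1, NULL, 0, 0};
    FoldExpr c_case = {FOLD_CASE, 0, &c_test, 1, 1};
    FoldExpr c_wfunc = {FOLD_WINFUNC, 0, &c_case, 0, 0};

    toy_fold(&t_wfunc);             /* target list becomes count(1) */
    if (fold_run_condition)
        toy_fold(&c_wfunc);         /* run condition gets the same treatment */

    return fold_expr_equal(&t_wfunc, &c_wfunc);
}

/* Example driver: call this from main() to check both outcomes. */
void
toy_fold_demo(void)
{
    assert(!toy_fold_matches(false));  /* only the target list folded: no match */
    assert(toy_fold_matches(true));    /* both folded: match */
    puts("the two WindowFuncs compare equal only when both are folded");
}
```

Calling toy_fold_demo() from a small main() exercises both cases.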

I'm now wondering if WindowClause.runCondition should be of type Node *
instead of List *. I'd have imagined I should be passing EXPRKIND_QUAL
as the kind argument to preprocess_expression(), but canonicalize_qual()
does not like Lists.

David

On Fri, Dec 9, 2022 at 7:53 PM David Rowley <dgrowleyml@gmail.com> wrote:
> The additional thing that seems to cause the reported error is that
> once the subquery is pulled up, the run condition also needs a round
> of constant folding done. See subquery_planner() around line 827.  The
> problem is that the target list's WindowFunc ends up with count(1)
> over .., but the run condition's one is left as count(case 1 when 1
> then 1 else null end), which preprocess_expression() will fold into
> the same as what's in the target list.

Yes exactly. That's what we also have to do.  I was debugging with a
simplified version of the query with the WindowFunc as count(t1.a) over
(...) and did not realize constant folding is also needed for the
runCondition.
 
> I'm now wondering if WindowClause.runCondition should be of type Node *
> instead of List *. I'd have imagined I should be passing EXPRKIND_QUAL
> as the kind argument to preprocess_expression(), but canonicalize_qual()
> does not like Lists.

I'm not sure about this.  From how the runCondition is constructed in
find_window_run_conditions, it seems there is no need to canonicalize
it.

Thanks
Richard

On Sat, 10 Dec 2022 at 03:01, Richard Guo <guofenglinux@gmail.com> wrote:
>
>
> On Fri, Dec 9, 2022 at 7:53 PM David Rowley <dgrowleyml@gmail.com> wrote:
>>
>> The additional thing that seems to cause the reported error is that
>> once the subquery is pulled up, the run condition also needs a round
>> of constant folding done. See subquery_planner() around line 827.  The
>> problem is that the target list's WindowFunc ends up with count(1)
>> over .., but the run condition's one is left as count(case 1 when 1
>> then 1 else null end), which preprocess_expression() will fold into
>> the same as what's in the target list.
>
>
> Yes exactly. That's what we also have to do.  I was debugging with a
> simplified version of the query with the WindowFunc as count(t1.a) over
> (...) and did not realize constant folding is also needed for the
> runCondition.

I made a few minor adjustments and pushed the patch.

Thanks to Alexey for the report and to Richard for looking into this.

David