Thread: BUG #18305: Unexpected error: "WindowFunc not found in subplan target lists" triggered by subqueries
BUG #18305: Unexpected error: "WindowFunc not found in subplan target lists" triggered by subqueries
From
PG Bug reporting form
Date:
The following bug has been logged on the website: Bug reference: 18305 Logged by: Zuming Jiang Email address: zuming.jiang@inf.ethz.ch PostgreSQL version: 16.1 Operating system: Ubuntu 20.04 Description: My fuzzer finds a bug in Postgres 17devel, which triggers an unexpected error "ERROR: WindowFunc not found in subplan target lists". --- Set up database --- create table exeet_t3 (pkey int4); create view exeet_t8 as select ntile(exeet_subq_0.c_0) over () as c_0 from (select (select pkey from exeet_t3 order by pkey limit 1 offset 6) as c_0) as exeet_subq_0; The fuzzer generates a test case: --- Test case --- select 1 as c_1 from exeet_t8 as exeet_ref_17 where exeet_ref_17.c_0 < 0; --- Expected behavior --- The test case should not trigger any error. --- Actual behavior --- The test case trigger an error: ERROR: WindowFunc not found in subplan target lists --- Postgres version --- Github commit: b0f0a9432d0b6f53634a96715f2666f6d4ea25a1 Version: PostgreSQL 17devel on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0, 64-bit --- Platform information --- Platform: Ubuntu 20.04 Kernel: Linux 5.4.0-147-generic
Re: BUG #18305: Unexpected error: "WindowFunc not found in subplan target lists" triggered by subqueries
From
Tom Lane
Date:
PG Bug reporting form <noreply@postgresql.org> writes: > My fuzzer finds a bug in Postgres 17devel, which triggers an unexpected > error "ERROR: WindowFunc not found in subplan target lists". > create table exeet_t3 (pkey int4); > create view exeet_t8 as > select > ntile(exeet_subq_0.c_0) over () as c_0 > from > (select (select pkey from exeet_t3 order by pkey limit 1 offset 6) as > c_0) as exeet_subq_0; > select > 1 as c_1 > from > exeet_t8 as exeet_ref_17 > where exeet_ref_17.c_0 < 0; > ERROR: WindowFunc not found in subplan target lists Thanks for the report! Bisecting shows that this broke at 456fa635a909ee36f73ca84d340521bd730f265f is the first bad commit commit 456fa635a909ee36f73ca84d340521bd730f265f Author: David Rowley <drowley@postgresql.org> Date: Fri Jan 27 16:08:41 2023 +1300 Teach planner about more monotonic window functions 9d9c02ccd introduced runConditions for window functions to allow monotonic window function evaluation to be made more efficient when the window function value went beyond some value that it would never go back from due to its monotonic nature. That commit added prosupport functions to inform the planner that row_number(), rank(), dense_rank() and some forms of count(*) were monotonic. Here we add support for ntile(), cume_dist() and percent_rank(). So I'm betting there's something wrong with the ntile support function, but I've not dug deeper. regards, tom lane
Re: BUG #18305: Unexpected error: "WindowFunc not found in subplan target lists" triggered by subqueries
From
Richard Guo
Date:
On Mon, Jan 22, 2024 at 11:44 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
PG Bug reporting form <noreply@postgresql.org> writes:
> My fuzzer finds a bug in Postgres 17devel, which triggers an unexpected
> error "ERROR: WindowFunc not found in subplan target lists".
> create table exeet_t3 (pkey int4);
> create view exeet_t8 as
> select
> ntile(exeet_subq_0.c_0) over () as c_0
> from
> (select (select pkey from exeet_t3 order by pkey limit 1 offset 6) as
> c_0) as exeet_subq_0;
> select
> 1 as c_1
> from
> exeet_t8 as exeet_ref_17
> where exeet_ref_17.c_0 < 0;
> ERROR: WindowFunc not found in subplan target lists
Thanks for the report! Bisecting shows that this broke at
456fa635a909ee36f73ca84d340521bd730f265f is the first bad commit
commit 456fa635a909ee36f73ca84d340521bd730f265f
Author: David Rowley <drowley@postgresql.org>
Date: Fri Jan 27 16:08:41 2023 +1300
Teach planner about more monotonic window functions
9d9c02ccd introduced runConditions for window functions to allow
monotonic window function evaluation to be made more efficient when the
window function value went beyond some value that it would never go back
from due to its monotonic nature. That commit added prosupport functions
to inform the planner that row_number(), rank(), dense_rank() and some
forms of count(*) were monotonic. Here we add support for ntile(),
cume_dist() and percent_rank().
So I'm betting there's something wrong with the ntile support
function, but I've not dug deeper.
Here is a simplified repro query:
select 1 from
(select ntile(s1.x) over () as c
from (select (select 1) as x) as s1) s
where s.c = 1;
ERROR: WindowFunc not found in subplan target lists
I think the problem is that in find_window_run_conditions() we fail to
check thoroughly whether the WindowFunc contains any potential subplan
nodes. We do have the check:
/* can't use it if there are subplans in the WindowFunc */
if (contain_subplans((Node *) wfunc))
return false;
... and the bug being reported here indicates that the current check is
insufficient, because a Var node inside the WindowFunc could potentially
belong to a subquery and reference a SubLink expression.
In the query above, the WindowFunc's argument, 's1.x', is actually an
EXPR_SUBLINK sublink. After we've pulled up subquery 's1', the
arguments of the two WindowFuncs (one in the target list of subquery
's', the other in the WindowClause's runCondition of subquery 's') would
be replaced with SubLink nodes. And then after we've converted SubLink
nodes to SubPlans, these two SubLink nodes would be replaced with
Params, but with different paramids.
Hmm, is there a way to detect in find_window_run_conditions() whether
the WindowFunc's argument references a SubLink node of a subquery that
hasn't been pulled up?
Thanks
Richard
select 1 from
(select ntile(s1.x) over () as c
from (select (select 1) as x) as s1) s
where s.c = 1;
ERROR: WindowFunc not found in subplan target lists
I think the problem is that in find_window_run_conditions() we fail to
check thoroughly whether the WindowFunc contains any potential subplan
nodes. We do have the check:
/* can't use it if there are subplans in the WindowFunc */
if (contain_subplans((Node *) wfunc))
return false;
... and the bug being reported here indicates that the current check is
insufficient, because a Var node inside the WindowFunc could potentially
belong to a subquery and reference a SubLink expression.
In the query above, the WindowFunc's argument, 's1.x', is actually an
EXPR_SUBLINK sublink. After we've pulled up subquery 's1', the
arguments of the two WindowFuncs (one in the target list of subquery
's', the other in the WindowClause's runCondition of subquery 's') would
be replaced with SubLink nodes. And then after we've converted SubLink
nodes to SubPlans, these two SubLink nodes would be replaced with
Params, but with different paramids.
Hmm, is there a way to detect in find_window_run_conditions() whether
the WindowFunc's argument references a SubLink node of a subquery that
hasn't been pulled up?
Thanks
Richard
Re: BUG #18305: Unexpected error: "WindowFunc not found in subplan target lists" triggered by subqueries
From
Tom Lane
Date:
Richard Guo <guofenglinux@gmail.com> writes: > Hmm, is there a way to detect in find_window_run_conditions() whether > the WindowFunc's argument references a SubLink node of a subquery that > hasn't been pulled up? contain_subplans will recognize both SubLinks and SubPlans, so there must be some other factor here. (I didn't look yet.) regards, tom lane
Re: BUG #18305: Unexpected error: "WindowFunc not found in subplan target lists" triggered by subqueries
From
Tom Lane
Date:
Richard Guo <guofenglinux@gmail.com> writes: > I think the problem is that in find_window_run_conditions() we fail to > check thoroughly whether the WindowFunc contains any potential subplan > nodes. We do have the check: > /* can't use it if there are subplans in the WindowFunc */ > if (contain_subplans((Node *) wfunc)) > return false; > ... and the bug being reported here indicates that the current check is > insufficient, because a Var node inside the WindowFunc could potentially > belong to a subquery and reference a SubLink expression. I dug through this, and my conclusion is that the entire "run condition" optimization is seriously broken, because the tree manipulations here are many bricks shy of a load. There are two big problems: 1. The subquery hasn't yet been through any preprocessing. Thus, we've not yet done subquery pullup within it, which is why the contain_subplans call fails to detect a problem: the WindowFunc's arg is just a Var. Later, pullup of the lower subquery will replace that Var with a SubLink, which eventually gets replaced with a Param referencing an initplan's output. However, run condition optimization caused us to put a second copy of the WindowFunc into the runCondition, and that copy gets its own copy of the SubLink, which eventually results in a different Param, leading to the observed failure. (BTW, it scares the heck out of me that we just link the WindowFunc expr into wclause->runCondition without making a copy of it. That could be safe if the planner never scribbles on its input, but that ain't so.) 2. We are pushing the "otherexpr" from the upper-level query (which *has* been through preprocessing) into the unprocessed child query. This makes me acutely uncomfortable: A. Almost certainly, this fails if otherexpr contains a subplan. B. Since nothing is done about levelsup adjustment, it will fail if anything in that tree includes a levelsup field. Fortunately is_pseudo_constant_clause will reject Vars, but those aren't the only things with levelsup. I wonder if it's possible to get an uplevel Aggref into a subquery's restrictlist, in particular. C. Again, failure to make a copy of the expression tree seems pretty dangerous. D. We're relying on repeat preprocessing of the otherexpr to do no harm. Maybe that's OK, but it's not great. The first three things are not hard to fix, but they're not being taken care of today. We could work around point 1 by refusing to pull up subqueries that contain sublinks in their targetlists, but that would be a pretty big change (and, probably, a pessimization of some queries). I do not consider run-condition optimization to justify that. Moreover I'm not sure that sublinks are the only thing that could get mutated to a different state in the runCondition than in the main tree. I think the only real way to prevent problems from point 1 is to stop making a copy of the WindowFunc expr. We need some other way to refer to the WindowFunc's value in the runCondition tree. Maybe a generated Param would serve? regards, tom lane
Re: BUG #18305: Unexpected error: "WindowFunc not found in subplan target lists" triggered by subqueries
From
David Rowley
Date:
On Wed, 24 Jan 2024 at 08:34, Tom Lane <tgl@sss.pgh.pa.us> wrote: > We could work around point 1 by refusing to pull up subqueries that > contain sublinks in their targetlists, but that would be a pretty big > change (and, probably, a pessimization of some queries). I do not > consider run-condition optimization to justify that. Moreover > I'm not sure that sublinks are the only thing that could get > mutated to a different state in the runCondition than in the main > tree. > > I think the only real way to prevent problems from point 1 is to stop > making a copy of the WindowFunc expr. We need some other way to refer > to the WindowFunc's value in the runCondition tree. Maybe a generated > Param would serve? I'm wondering if it was wrong to put the runCondition field in WindowClause. Maybe it should be in WindowFunc instead... Really the OpExpr that's created in find_window_run_conditions() with the WindowFunc as an arg is there to show the Run Condition in EXPLAIN. This is what's later stored in WindowAgg.runConditionOrig. What we pass to ExecQual in nodeWindowAgg.c is a version of the OpExpr with the WindowFunc replaced with a Var so that ExecQual properly fetches the just-calculated WindowFunc return value from the slot. We don't want to leave the WindowFunc in the runCondition as ExecQual would go and evaluate it. If WindowFunc allowed a list of a new struct called WindowRunCondition with fields "otherarg", "opno", "collation", "wfunc_left" then we could construct the OpExpr later either in createplan.c or setrefs.c. The EXPLAIN version of that OpExpr could have the WindowFunc and the non-EXPLAIN version would have the Var. Sounds a bit invasive for back branches, but wondering if we couldn't just modify window_ntile_support() to reject any ntile args other than Consts. count(*), row_number(), rank(), dense_rank(), percent_rank() and percent_rank() all can't suffer from this issue as they don't have an argument. count(expr) would need to have something done to stop the same issue from occurring. Maybe int8inc_support() could just set req->monotonic = MONOTONICFUNC_NONE if the req->window_func has an arg, effectively disabling the optimisation for count(expr). David
Re: BUG #18305: Unexpected error: "WindowFunc not found in subplan target lists" triggered by subqueries
From
David Rowley
Date:
On Thu, 25 Jan 2024 at 18:14, David Rowley <dgrowleyml@gmail.com> wrote: > > On Wed, 24 Jan 2024 at 08:34, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > We could work around point 1 by refusing to pull up subqueries that > > contain sublinks in their targetlists, but that would be a pretty big > > change (and, probably, a pessimization of some queries). I do not > > consider run-condition optimization to justify that. Moreover > > I'm not sure that sublinks are the only thing that could get > > mutated to a different state in the runCondition than in the main > > tree. > > > > I think the only real way to prevent problems from point 1 is to stop > > making a copy of the WindowFunc expr. We need some other way to refer > > to the WindowFunc's value in the runCondition tree. Maybe a generated > > Param would serve? > > I'm wondering if it was wrong to put the runCondition field in > WindowClause. Maybe it should be in WindowFunc instead... > > Really the OpExpr that's created in find_window_run_conditions() with > the WindowFunc as an arg is there to show the Run Condition in > EXPLAIN. This is what's later stored in WindowAgg.runConditionOrig. > What we pass to ExecQual in nodeWindowAgg.c is a version of the OpExpr > with the WindowFunc replaced with a Var so that ExecQual properly > fetches the just-calculated WindowFunc return value from the slot. We > don't want to leave the WindowFunc in the runCondition as ExecQual > would go and evaluate it. > > If WindowFunc allowed a list of a new struct called WindowRunCondition > with fields "otherarg", "opno", "collation", "wfunc_left" then we > could construct the OpExpr later either in createplan.c or setrefs.c. > The EXPLAIN version of that OpExpr could have the WindowFunc and the > non-EXPLAIN version would have the Var. Just to assist the discussion here I've drafted a patch along the lines of the above. See attached If you think this idea has merit I can try and turn it into something committable for master. > Sounds a bit invasive for back branches, but wondering if we couldn't > just modify window_ntile_support() to reject any ntile args other than > Consts. count(*), row_number(), rank(), dense_rank(), percent_rank() > and percent_rank() all can't suffer from this issue as they don't have > an argument. count(expr) would need to have something done to stop > the same issue from occurring. Maybe int8inc_support() could just set > req->monotonic = MONOTONICFUNC_NONE if the req->window_func has an > arg, effectively disabling the optimisation for count(expr). I'm still unsure about the fix for back branches but I'm open to other ideas. David
Attachment
Re: BUG #18305: Unexpected error: "WindowFunc not found in subplan target lists" triggered by subqueries
From
Richard Guo
Date:
On Fri, Jan 26, 2024 at 8:02 PM David Rowley <dgrowleyml@gmail.com> wrote:
On Thu, 25 Jan 2024 at 18:14, David Rowley <dgrowleyml@gmail.com> wrote:
> If WindowFunc allowed a list of a new struct called WindowRunCondition
> with fields "otherarg", "opno", "collation", "wfunc_left" then we
> could construct the OpExpr later either in createplan.c or setrefs.c.
> The EXPLAIN version of that OpExpr could have the WindowFunc and the
> non-EXPLAIN version would have the Var.
Just to assist the discussion here I've drafted a patch along the
lines of the above. See attached
If you think this idea has merit I can try and turn it into something
committable for master.
This idea seems reasonable to me. Now the runCondition is constructed
in create_one_window_path(), where the subquery has been through
preprocessing and therefore the WindowFunc's arg has been replaced with
a Param due to the pullup of the lower subquery and the expansion of
SubLinks to SubPlans. This fixes the problem reported here.
Thanks
Richard
in create_one_window_path(), where the subquery has been through
preprocessing and therefore the WindowFunc's arg has been replaced with
a Param due to the pullup of the lower subquery and the expansion of
SubLinks to SubPlans. This fixes the problem reported here.
Thanks
Richard
Re: BUG #18305: Unexpected error: "WindowFunc not found in subplan target lists" triggered by subqueries
From
Richard Guo
Date:
On Thu, Jan 25, 2024 at 1:14 PM David Rowley <dgrowleyml@gmail.com> wrote:
Sounds a bit invasive for back branches, but wondering if we couldn't
just modify window_ntile_support() to reject any ntile args other than
Consts. count(*), row_number(), rank(), dense_rank(), percent_rank()
and percent_rank() all can't suffer from this issue as they don't have
an argument. count(expr) would need to have something done to stop
the same issue from occurring. Maybe int8inc_support() could just set
req->monotonic = MONOTONICFUNC_NONE if the req->window_func has an
arg, effectively disabling the optimisation for count(expr).
You're right that count(expr) also suffers from this issue.
select 1 from
(select count(s1.x) over () as c
from (select (select 1) as x) as s1) s
where s.c = 1;
ERROR: WindowFunc not found in subplan target lists
For back branches, the idea of modifying window_ntile_support() and
int8inc_support() to reject any non-pseudoconstant args also seems
reasonable to me. One thing I noticed is that sometimes it's not easy
to tell whether the arg is pseudoconstant or not in the support
functions, because a pseudoconstant is not necessarily being type of
Const. For instance, count(1::text) is a CoerceViaIO, and
ntile(1.0::int) is a FuncExpr. But these are very corner cases and I
think we can just ignore them.
Thanks
Richard
select 1 from
(select count(s1.x) over () as c
from (select (select 1) as x) as s1) s
where s.c = 1;
ERROR: WindowFunc not found in subplan target lists
For back branches, the idea of modifying window_ntile_support() and
int8inc_support() to reject any non-pseudoconstant args also seems
reasonable to me. One thing I noticed is that sometimes it's not easy
to tell whether the arg is pseudoconstant or not in the support
functions, because a pseudoconstant is not necessarily being type of
Const. For instance, count(1::text) is a CoerceViaIO, and
ntile(1.0::int) is a FuncExpr. But these are very corner cases and I
think we can just ignore them.
Thanks
Richard
Re: Re: BUG #18305: Unexpected error: "WindowFunc not found in subplan target lists" triggered by subqueries
From
David Rowley
Date:
On Mon, 1 Apr 2024 at 21:34, akuluasan <akuluasan@163.com> wrote: > I am using my tool to simplify the SQL query. Can you please confirm if the simplification process helps you diagnoseand locate bugs? The root cause of this has already been established, so in this case, the answer is "no". However, since you wrote "bugs" rather than "this bug", for cases where we've not found the root cause, the more simple the reproducer, the better. David
Re: BUG #18305: Unexpected error: "WindowFunc not found in subplan target lists" triggered by subqueries
From
David Rowley
Date:
On Fri, 22 Mar 2024 at 23:47, Richard Guo <guofenglinux@gmail.com> wrote: > For back branches, the idea of modifying window_ntile_support() and > int8inc_support() to reject any non-pseudoconstant args also seems > reasonable to me. One thing I noticed is that sometimes it's not easy > to tell whether the arg is pseudoconstant or not in the support > functions, because a pseudoconstant is not necessarily being type of > Const. For instance, count(1::text) is a CoerceViaIO, and > ntile(1.0::int) is a FuncExpr. But these are very corner cases and I > think we can just ignore them. I don't think that needs anything special aside from constant folding. I've attached a more complete version of the patch (0002) and another patch which is what I'd proposed as a fix for the backbranches (0001). Note quite a few tests needed to be adjusted because of disabling this optimisation. The 0002 patch will require a cat version bump as it adds a field to WindowFunc. Ideally, I'd be applying this fix to master, but I imagine some people might feel we should delay applying a fix like this until after we branch for v18. Happy to hear people's views on that. David
Attachment
Re: BUG #18305: Unexpected error: "WindowFunc not found in subplan target lists" triggered by subqueries
From
Tom Lane
Date:
David Rowley <dgrowleyml@gmail.com> writes: > I've attached a more complete version of the patch (0002) and another > patch which is what I'd proposed as a fix for the backbranches (0001). > Note quite a few tests needed to be adjusted because of disabling this > optimisation. IIUC, 0002 is meant to be applied on top of 0001, but it reverses quite a lot of 0001? Please don't commit it like that, it'll just add a lot of useless thrashing to "git blame" results. It'd be easier to review this if you presented it as two independent patches, one for HEAD and one for the back branches. The fact that you had to use a cheesy "eval_const_expressions(NULL, ..." call in 0001 demonstrates that it was a mistake to not include PlannerInfo in SupportRequestWFuncMonotonic, as every other planner- invoked support request has. I realize that we can't change that in back branches, and that it's no longer immediately necessary in HEAD either after 0002. But let's learn from experience and add it to the struct while we're here. > The 0002 patch will require a cat version bump as it adds a field to > WindowFunc. Ideally, I'd be applying this fix to master, but I > imagine some people might feel we should delay applying a fix like > this until after we branch for v18. Happy to hear people's views on > that. catversion bumps are not a problem at this stage --- we will surely do at least one more for assigned-OID renumbering. I'd rather avoid the git history thrashing implicit in treating 0002 as a follow-on patch. regards, tom lane
Re: BUG #18305: Unexpected error: "WindowFunc not found in subplan target lists" triggered by subqueries
From
David Rowley
Date:
On Fri, 26 Apr 2024 at 03:39, Tom Lane <tgl@sss.pgh.pa.us> wrote: > It'd be easier to review this if you presented it as two independent > patches, one for HEAD and one for the back branches. The attached v2 is the same patch as earlier and is intended for <= v16. > The fact that you had to use a cheesy "eval_const_expressions(NULL, > ..." call in 0001 demonstrates that it was a mistake to not include > PlannerInfo in SupportRequestWFuncMonotonic, as every other planner- > invoked support request has. I realize that we can't change that > in back branches, and that it's no longer immediately necessary > in HEAD either after 0002. But let's learn from experience and > add it to the struct while we're here. The correct PlannerInfo to set here would be the one that the WindowFunc belongs to. The problem is that this code is called from set_subquery_pathlist before the rel->subroot = subquery_planner. i.e we've no PlannerInfo to set. Maybe it's worth doing this for SupportRequestOptimizeWindowClause as a separate patch. > catversion bumps are not a problem at this stage The attached v3 is a separate patch for v17 only. David
Attachment
Re: BUG #18305: Unexpected error: "WindowFunc not found in subplan target lists" triggered by subqueries
From
David Rowley
Date:
On Tue, 30 Apr 2024 at 09:39, David Rowley <dgrowleyml@gmail.com> wrote: > The attached v2 is the same patch as earlier and is intended for <= v16. I want to commit this one so it makes it into the next minor version releases. It's not very complex, so plan to push it later today. I'll hold off with the v17 version for a while. David
Re: BUG #18305: Unexpected error: "WindowFunc not found in subplan target lists" triggered by subqueries
From
David Rowley
Date:
On Wed, 1 May 2024 at 11:30, David Rowley <dgrowleyml@gmail.com> wrote: > I'll hold off with the v17 version for a while. I've now pushed the fix for v17. David