Thread: BUG #17502: View based on window functions returns wrong results when queried

BUG #17502: View based on window functions returns wrong results when queried

From
PG Bug reporting form
Date:
The following bug has been logged on the website:

Bug reference:      17502
Logged by:          Daniel Farkaš
Email address:      daniel.farkas@datoris.com
PostgreSQL version: 10.10
Operating system:   Linux
Description:

Hey,

Please be gentle, I've never been in contact with Postgres developers.
In short, I've created a view, which has rather sketchy window functions,
but it gives me results I need.
When I do select * on it, it gives me what I expect. One of the columns has
five distinct values.
But when I do group by on that column, it gives me only one of the values.
When I drop the view and create materialized view, all is good, I get all
five values.

My guess is that some parts of the inner select are affecting outer, view's
select, which is not something I would expect.
My current Postgres is PostgreSQL 10.10 on x86_64-pc-linux-musl, compiled by
gcc (Alpine 8.3.0) 8.3.0, 64-bit.
If you think this is worth investigating further, I will try composing a
simpler example, and test it in a more recent Postgres version.
Maybe it's a known limitation I'm not aware of.

Let me know what you think.

Daniel



On Sun, May 29, 2022 at 4:20 PM PG Bug reporting form <noreply@postgresql.org> wrote:
The following bug has been logged on the website:

Bug reference:      17502
Logged by:          Daniel Farkaš
Email address:      daniel.farkas@datoris.com
PostgreSQL version: 10.10
Operating system:   Linux
Description:       

Hey,

Please be gentle, I've never been in contact with Postgres developers.
In short, I've created a view, which has rather sketchy window functions,
but it gives me results I need.
When I do select * on it, it gives me what I expect. One of the columns has
five distinct values.
But when I do group by on that column, it gives me only one of the values.
When I drop the view and create materialized view, all is good, I get all
five values.

My guess is that some parts of the inner select are affecting outer, view's
select, which is not something I would expect.
My current Postgres is PostgreSQL 10.10 on x86_64-pc-linux-musl, compiled by
gcc (Alpine 8.3.0) 8.3.0, 64-bit.
If you think this is worth investigating further, I will try composing a
simpler example, and test it in a more recent Postgres version.
Maybe it's a known limitation I'm not aware of.

Let me know what you think.


Please see if you can reproduce this on a current version of PostgreSQL 10, which is 10.21. Version 10.10 is lacking more than two and a half years worth of bugfixes.

If you can then yes, try to put together a simpler example, because it certainly does not sound like correct behavior.

//Magnus

Hey,

Thanks for the response.
I was able to replicate in 10.21.

Here is how:

SELECT version();
PostgreSQL 10.21 on x86_64-pc-linux-musl, compiled by gcc (Alpine 11.2.1_git20220219) 11.2.1 20220219, 64-bit

CREATE TABLE analytics_table (dimension_1 VARCHAR, dimension_2 VARCHAR, metric_1 VARCHAR, metric_2 VARCHAR);

INSERT INTO analytics_table VALUES ('a1', 'b1', 'c1', 'd1'), ('a1', 'b2', 'c2', 'd2');

SELECT * FROM analytics_table;
dimension_1|dimension_2|metric_1|metric_2|
-----------+-----------+--------+--------+
a1         |b1         |c1      |d1      |
a1         |b2         |c2      |d2      |

SELECT
dimension_1,
dimension_2,
row_number() OVER(partition by generate_series(1,2) order by dimension_1, dimension_2) AS rownum,
row_number() OVER(partition by dimension_1, dimension_2) AS metricnum
FROM analytics_table ORDER BY dimension_1, dimension_2;
dimension_1|dimension_2|rownum|metricnum|
-----------+-----------+------+------+
a1         |b1         |     1|     1|
a1         |b1         |     1|     2|
a1         |b2         |     2|     1|
a1         |b2         |     2|     2|

CREATE VIEW analytics_view AS
SELECT
dimension_1,
dimension_2,
row_number() OVER(partition by generate_series(1,2) order by dimension_1, dimension_2) AS rownum,
CASE WHEN row_number() OVER(partition by dimension_1, dimension_2) = 1 
THEN 'metric_1' ELSE 'metric_2' END AS metric_name,
CASE WHEN row_number() OVER(partition by dimension_1, dimension_2) = 1 THEN metric_1 ELSE metric_2 END AS metric_value
FROM analytics_table ORDER BY dimension_1, dimension_2;

SELECT * FROM analytics_view;
dimension_1|dimension_2|rownum|metric_name|metric_value|
-----------+-----------+------+-----------+------------+
a1         |b1         |     1|metric_1   |c1          |
a1         |b1         |     1|metric_2   |d1          |
a1         |b2         |     2|metric_1   |c2          |
a1         |b2         |     2|metric_2   |d2          |

SELECT metric_name FROM analytics_view;
metric_name|
-----------+
metric_1   |
metric_1   |
metric_1   |
metric_1   |


CREATE MATERIALIZED VIEW analytics_materialized_view AS
SELECT
dimension_1,
dimension_2,
row_number() OVER(partition by generate_series(1,2) order by dimension_1, dimension_2) AS rownum,
CASE WHEN row_number() OVER(partition by dimension_1, dimension_2) = 1 THEN 'metric_1' ELSE 'metric_2' END AS metric_name,
CASE WHEN row_number() OVER(partition by dimension_1, dimension_2) = 1 THEN metric_1 ELSE metric_2 END AS metric_value
FROM analytics_table ORDER BY dimension_1, dimension_2;

SELECT * FROM analytics_materialized_view;
dimension_1|dimension_2|rownum|metric_name|metric_value|
-----------+-----------+------+-----------+------------+
a1         |b1         |     1|metric_1   |c1          |
a1         |b1         |     1|metric_2   |d1          |
a1         |b2         |     2|metric_1   |c2          |
a1         |b2         |     2|metric_2   |d2          |

SELECT metric_name FROM analytics_materialized_view;
metric_name|
-----------+
metric_1   |
metric_2   |
metric_1   |
metric_2   |


For some analytics purposes I needed to transform one wide table with multiple metrics into a metric_name/metric_value pairs stored as separate rows. That's why I did all this. I guess the reason and the method are not important, the fact that the view gives different results does look like a bug.

Cheers,
Daniel Farkas
Datoris

On Sun, May 29, 2022 at 4:22 PM Magnus Hagander <magnus@hagander.net> wrote:

On Sun, May 29, 2022 at 4:20 PM PG Bug reporting form <noreply@postgresql.org> wrote:
The following bug has been logged on the website:

Bug reference:      17502
Logged by:          Daniel Farkaš
Email address:      daniel.farkas@datoris.com
PostgreSQL version: 10.10
Operating system:   Linux
Description:       

Hey,

Please be gentle, I've never been in contact with Postgres developers.
In short, I've created a view, which has rather sketchy window functions,
but it gives me results I need.
When I do select * on it, it gives me what I expect. One of the columns has
five distinct values.
But when I do group by on that column, it gives me only one of the values.
When I drop the view and create materialized view, all is good, I get all
five values.

My guess is that some parts of the inner select are affecting outer, view's
select, which is not something I would expect.
My current Postgres is PostgreSQL 10.10 on x86_64-pc-linux-musl, compiled by
gcc (Alpine 8.3.0) 8.3.0, 64-bit.
If you think this is worth investigating further, I will try composing a
simpler example, and test it in a more recent Postgres version.
Maybe it's a known limitation I'm not aware of.

Let me know what you think.


Please see if you can reproduce this on a current version of PostgreSQL 10, which is 10.21. Version 10.10 is lacking more than two and a half years worth of bugfixes.

If you can then yes, try to put together a simpler example, because it certainly does not sound like correct behavior.

//Magnus

On Mon, 30 May 2022 at 06:58, Daniel Farkaš <daniel.farkas@datoris.com> wrote:
> SELECT metric_name FROM analytics_view;
> metric_name|
> -----------+
> metric_1   |
> metric_1   |
> metric_1   |
> metric_1   |

This is certainly a bug. Thanks for reporting it.

The problem seems to be down to the fact that
remove_unused_subquery_outputs() does not check if the to-be-removed
target entry references WindowClauses which contain set-returning
functions.

We only seem to check if the target entry itself is an SRF, per:

/*
* If it contains a set-returning function, we can't remove it since
* that could change the number of rows returned by the subquery.
*/
if (subquery->hasTargetSRFs &&
expression_returns_set(texpr))
continue;

This ensures queries such as the following don't have SRF columns removed:

postgres=# explain verbose select a from (select
a,generate_series(1,2) as b from t) t;
                                 QUERY PLAN
----------------------------------------------------------------------------
 Subquery Scan on t  (cost=0.00..131.13 rows=5100 width=4)
   Output: t.a
   ->  ProjectSet  (cost=0.00..80.13 rows=5100 width=8)
         Output: t_1.a, generate_series(1, 2)
         ->  Seq Scan on public.t t_1  (cost=0.00..35.50 rows=2550 width=4)
               Output: t_1.a
(6 rows)

I'm a little bit uncertain if the correct fix is to have
expression_returns_set() look deeper into WindowFuncs to check if the
WindowClause that the function belongs to has any SRFs in the
PARTITION BY / ORDER BY clause.  Unfortunately, doing that means
having to pass the PlannerInfo to expression_returns_set(). I don't
quite see how that could be made to work in the back branches.

The other fix would be to make remove_unused_subquery_outputs() pull
out all WindowFuncs from the texpr and check if any of the
WindowClauses have SRFs in the PARTITION BY / ORDER BY clause.

I'll need to look a bit deeper into the usages of
expression_returns_set() to know which of the fixes is correct.  There
might be some other bugs lurking due to expression_returns_set() not
checking WindowClauses for SRFs.

David



On Mon, 30 May 2022 at 10:30, David Rowley <dgrowleyml@gmail.com> wrote:
> I'm a little bit uncertain if the correct fix is to have
> expression_returns_set() look deeper into WindowFuncs to check if the
> WindowClause that the function belongs to has any SRFs in the
> PARTITION BY / ORDER BY clause.  Unfortunately, doing that means
> having to pass the PlannerInfo to expression_returns_set(). I don't
> quite see how that could be made to work in the back branches.

There does seem to be other weird anomalies as a result of
expression_returns_set() not looking for SRFs in the WindowFunc
PARTITION BY / ORDER BY clause.

Looking at the usage of expression_returns_set() in
make_sort_input_target(), I'd have expected the following 2 queries to
act in the same way:

setup:
create table ab (a int, b int);
insert into ab values(1,1),(1,2),(2,1),(2,2);

postgres=# select distinct on (a) a,b, row_number() over (order by
generate_Series(1,2)) from ab order by a,b;
 a | b | row_number
---+---+------------
 1 | 1 |          1
 2 | 1 |          7
(2 rows)

postgres=# select distinct on (a) a,b, generate_Series(1,2) from ab
order by a,b;
 a | b | generate_series
---+---+-----------------
 1 | 1 |               1
 1 | 1 |               2
 2 | 1 |               1
 2 | 1 |               2
(4 rows)

The fact that those are different is starting to make me think that
expression_returns_set() should be looking into WindowFuncs to see if
the WindowClause has SRFs rather than adding special code to
remove_unused_subquery_outputs() in order to fix the bug being
reported on this thread.

However, I'm not sure the latter of the above two queries result makes
any sense given that our documentation [1] claims:

"PostgreSQL's behavior for a set-returning function in a query's
select list is almost exactly the same as if the set-returning
function had been written in a LATERAL FROM-clause item instead. For
example,"

the "almost" is later defined as:

"It would be exactly the same, except that in this specific example,
the planner could choose to put g on the outside of the nested-loop
join, since g has no actual lateral dependency on tab."

which only indicates that the order of the results may differ, not the
actual results themselves.

You can see the LATERAL version only returns 2 rows:

postgres=# select distinct on (a) a, b, g.s from ab, lateral
generate_Series(1,2) g(s) order by a,b;
 a | b | s
---+---+---
 1 | 1 | 1
 2 | 1 | 1
(2 rows)

The following is also pretty strange. Why should adding the SRF column
to the ORDER BY change the number of output rows?

postgres=# select distinct on (a) a,b, generate_Series(1,2) from ab
order by a,b,3 desc;
 a | b | generate_series
---+---+-----------------
 1 | 1 |               2
 2 | 1 |               2
(2 rows)


postgres=# select distinct on (a) a,b, generate_Series(1,2) from ab
order by a,b;
 a | b | generate_series
---+---+-----------------
 1 | 1 |               1
 1 | 1 |               2
 2 | 1 |               1
 2 | 1 |               2
(4 rows)

The LATERAL version of this will return 2 rows regardless of what the
ORDER BY says:

postgres=# select distinct on (a) a, b, g.s from ab, lateral
generate_Series(1,2) g(s) order by a,b,g.s DESC;
 a | b | s
---+---+---
 1 | 1 | 2
 2 | 1 | 2
(2 rows)


postgres=# select distinct on (a) a, b, g.s from ab, lateral
generate_Series(1,2) g(s) order by a,b;
 a | b | s
---+---+---
 1 | 1 | 1
 2 | 1 | 1
(2 rows)

In addition to what the documentation claims about SRFs in the target
list being the same as LATERAL, maybe the following should give us
some guidance on if the non-lateral SRF queries above should return 2
or 4 rows:

postgres=# select distinct generate_Series(0,1)/2;
 ?column?
----------
        0
(1 row)

In the above, the ProjectSet occurs before the distinctification. That
would translate into the 2-row version of the DISTINCT ON query above
being correct.

I think that roughly translates into changes being required in both
make_sort_input_target() and expression_returns_set().

Does anyone have any other opinions?

David

[1] https://www.postgresql.org/docs/current/xfunc-sql.html#XFUNC-SQL-FUNCTIONS-RETURNING-SET



David Rowley <dgrowleyml@gmail.com> writes:
> The problem seems to be down to the fact that
> remove_unused_subquery_outputs() does not check if the to-be-removed
> target entry references WindowClauses which contain set-returning
> functions.

I was sort of wondering why we allow SRFs in this context in the
first place.  The results don't seem terribly well-defined to me.
In particular, a WindowFunc invocation is not supposed to change the
number of rows in the query result, and yet this one is doing so.

            regards, tom lane



On Mon, 30 May 2022 at 14:19, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> David Rowley <dgrowleyml@gmail.com> writes:
> > The problem seems to be down to the fact that
> > remove_unused_subquery_outputs() does not check if the to-be-removed
> > target entry references WindowClauses which contain set-returning
> > functions.
>
> I was sort of wondering why we allow SRFs in this context in the
> first place.  The results don't seem terribly well-defined to me.
> In particular, a WindowFunc invocation is not supposed to change the
> number of rows in the query result, and yet this one is doing so.

That would certainly be an easier fix for the reported problem.

Do you think it would fly to add such a restriction in the backbranches?

David



David Rowley <dgrowleyml@gmail.com> writes:
> On Mon, 30 May 2022 at 14:19, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I was sort of wondering why we allow SRFs in this context in the
>> first place.  The results don't seem terribly well-defined to me.

> Do you think it would fly to add such a restriction in the backbranches?

We could leave it alone in the back branches on the grounds that if
you like the results you get, we shouldn't break it in a minor
release.

            regards, tom lane



On Mon, 30 May 2022 at 14:45, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> David Rowley <dgrowleyml@gmail.com> writes:
> > On Mon, 30 May 2022 at 14:19, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> I was sort of wondering why we allow SRFs in this context in the
> >> first place.  The results don't seem terribly well-defined to me.
>
> > Do you think it would fly to add such a restriction in the backbranches?
>
> We could leave it alone in the back branches on the grounds that if
> you like the results you get, we shouldn't break it in a minor
> release.

I struggle to see how anyone would like their result correctness to
depend on whether remove_unused_subquery_outputs() is able or unable
to remove a column from the subquery.

David



David Rowley <dgrowleyml@gmail.com> writes:
> On Mon, 30 May 2022 at 14:45, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> We could leave it alone in the back branches on the grounds that if
>> you like the results you get, we shouldn't break it in a minor
>> release.

> I struggle to see how anyone would like their result correctness to
> depend on whether remove_unused_subquery_outputs() is able or unable
> to remove a column from the subquery.

Indeed, it's unlikely that anybody would like these particular
results.  But perhaps somebody is running an application that does
this and happens to not trip over any obviously-wrong case; for
example, if one always selects all columns from this view, the
issue is not apparent.

I'm not necessarily against adding the prohibition in the back
branches.  However, if this has been wrong since 10.x (if not
further back) then it seems like few people are tripping over
the inconsistency.

            regards, tom lane



On Mon, 30 May 2022 at 15:02, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I'm not necessarily against adding the prohibition in the back
> branches.  However, if this has been wrong since 10.x (if not
> further back) then it seems like few people are tripping over
> the inconsistency.

Yeah, perhaps. But we have received a report from 1 person and I think
it seems strange to adopt a policy that we require multiple bug
reports for the same issue before we consider fixing.

FWIW, just to demonstrate what a fix could look like, I've attached a
patch. I'm not planning to commit it if you really think we shouldn't
be fixing it.

The patch also does nothing for the weirdness that I described in the
DISTINCT ON case earlier in this thread.

David

Attachment

On Mon, May 30, 2022 at 9:12 AM David Rowley <dgrowleyml@gmail.com> wrote:
The following is also pretty strange. Why should adding the SRF column
to the ORDER BY change the number of output rows?

postgres=# select distinct on (a) a,b, generate_Series(1,2) from ab
order by a,b,3 desc;
 a | b | generate_series
---+---+-----------------
 1 | 1 |               2
 2 | 1 |               2
(2 rows)


postgres=# select distinct on (a) a,b, generate_Series(1,2) from ab
order by a,b;
 a | b | generate_series
---+---+-----------------
 1 | 1 |               1
 1 | 1 |               2
 2 | 1 |               1
 2 | 1 |               2
(4 rows)

This is indeed weird. This seems depends on at which plan level we add
ProjectSetPath. For the first query we insert ProjectSetPath at the
scan/join level, because the SRF is included in the ORDER BY, which
makes the SRF appear in the scan/join target. For the second query the
SRF is postponed to after the Sort.

Is this a bug we should fix?

Thanks
Richard
On Sun, May 29, 2022 at 10:18:59PM -0400, Tom Lane wrote:
> David Rowley <dgrowleyml@gmail.com> writes:
> > The problem seems to be down to the fact that
> > remove_unused_subquery_outputs() does not check if the to-be-removed
> > target entry references WindowClauses which contain set-returning
> > functions.
> 
> I was sort of wondering why we allow SRFs in this context in the
> first place.  The results don't seem terribly well-defined to me.
> In particular, a WindowFunc invocation is not supposed to change the
> number of rows in the query result, and yet this one is doing so.
 
 I  would like to note, that some SRFs may return only one row in
specific cases, like dblink() with the specific query.
 It may used much more often, than real multirow SRFs in PARTITION BY.

 While converting it to a LATERAL JOIN is not  a lot  of work  --
breaking it in previous versions may affect a lot of people.




Richard Guo <guofenglinux@gmail.com> writes:
> On Mon, May 30, 2022 at 9:12 AM David Rowley <dgrowleyml@gmail.com> wrote:
>> The following is also pretty strange. Why should adding the SRF column
>> to the ORDER BY change the number of output rows?

> Is this a bug we should fix?

This bug seems to have slipped off the radar screen, but it's still
a bug.  I continue to believe that the best fix is to disallow SRFs
in window definitions, and present the trivial patch to do so.

As discussed, I'm comfortable with leaving this alone in the back
branches.

            regards, tom lane

diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index ca14f06308..a13f28615b 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2570,9 +2570,6 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
             break;
         case EXPR_KIND_WINDOW_PARTITION:
         case EXPR_KIND_WINDOW_ORDER:
-            /* okay, these are effectively GROUP BY/ORDER BY */
-            pstate->p_hasTargetSRFs = true;
-            break;
         case EXPR_KIND_WINDOW_FRAME_RANGE:
         case EXPR_KIND_WINDOW_FRAME_ROWS:
         case EXPR_KIND_WINDOW_FRAME_GROUPS:
diff --git a/src/test/regress/expected/tsrf.out b/src/test/regress/expected/tsrf.out
index d47b5f6ec5..600652581e 100644
--- a/src/test/regress/expected/tsrf.out
+++ b/src/test/regress/expected/tsrf.out
@@ -265,7 +265,21 @@ ERROR:  window function calls cannot contain set-returning function calls
 LINE 1: SELECT min(generate_series(1, 3)) OVER() FROM few;
                    ^
 HINT:  You might be able to move the set-returning function into a LATERAL FROM item.
--- SRFs are normally computed after window functions
+--- ... nor in window definitions
+SELECT sum(id) OVER (PARTITION BY generate_series(1, 3)) FROM few;
+ERROR:  set-returning functions are not allowed in window definitions
+LINE 1: SELECT sum(id) OVER (PARTITION BY generate_series(1, 3)) FRO...
+                                          ^
+SELECT sum(id) OVER (ORDER BY generate_series(1, 3)) FROM few;
+ERROR:  set-returning functions are not allowed in window definitions
+LINE 1: SELECT sum(id) OVER (ORDER BY generate_series(1, 3)) FROM fe...
+                                      ^
+SELECT sum(id) OVER (ROWS BETWEEN UNBOUNDED PRECEDING
+                     AND generate_series(1, 3) FOLLOWING) FROM few;
+ERROR:  set-returning functions are not allowed in window definitions
+LINE 2:                      AND generate_series(1, 3) FOLLOWING) FR...
+                                 ^
+-- SRFs are computed after window functions
 SELECT id,lag(id) OVER(), count(*) OVER(), generate_series(1,3) FROM few;
  id | lag | count | generate_series
 ----+-----+-------+-----------------
@@ -280,15 +294,6 @@ SELECT id,lag(id) OVER(), count(*) OVER(), generate_series(1,3) FROM few;
   3 |   2 |     3 |               3
 (9 rows)

--- unless referencing SRFs
-SELECT SUM(count(*)) OVER(PARTITION BY generate_series(1,3) ORDER BY generate_series(1,3)), generate_series(1,3) g
FROMfew GROUP BY g; 
- sum | g
------+---
-   3 | 1
-   3 | 2
-   3 | 3
-(3 rows)
-
 -- sorting + grouping
 SELECT few.dataa, count(*), min(id), max(id), generate_series(1,3) FROM few GROUP BY few.dataa ORDER BY 5, 1;
  dataa | count | min | max | generate_series
diff --git a/src/test/regress/sql/tsrf.sql b/src/test/regress/sql/tsrf.sql
index 7c22529a0d..8442ba9e74 100644
--- a/src/test/regress/sql/tsrf.sql
+++ b/src/test/regress/sql/tsrf.sql
@@ -82,10 +82,14 @@ SELECT sum((3 = ANY(SELECT lag(x) over(order by x)
 -- SRFs are not allowed in window function arguments, either
 SELECT min(generate_series(1, 3)) OVER() FROM few;

--- SRFs are normally computed after window functions
+--- ... nor in window definitions
+SELECT sum(id) OVER (PARTITION BY generate_series(1, 3)) FROM few;
+SELECT sum(id) OVER (ORDER BY generate_series(1, 3)) FROM few;
+SELECT sum(id) OVER (ROWS BETWEEN UNBOUNDED PRECEDING
+                     AND generate_series(1, 3) FOLLOWING) FROM few;
+
+-- SRFs are computed after window functions
 SELECT id,lag(id) OVER(), count(*) OVER(), generate_series(1,3) FROM few;
--- unless referencing SRFs
-SELECT SUM(count(*)) OVER(PARTITION BY generate_series(1,3) ORDER BY generate_series(1,3)), generate_series(1,3) g
FROMfew GROUP BY g; 

 -- sorting + grouping
 SELECT few.dataa, count(*), min(id), max(id), generate_series(1,3) FROM few GROUP BY few.dataa ORDER BY 5, 1;


On Tue, Jan 31, 2023 at 6:53 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
This bug seems to have slipped off the radar screen, but it's still
a bug.  I continue to believe that the best fix is to disallow SRFs
in window definitions, and present the trivial patch to do so.
 
This patch fixes the original issue reported by Daniel.  Another issue
discussed in this thread is the weirdness when there are SRFs in the
targetlist of a query with DISTINCT ON clause, as described by David.

create table ab (a int, b int);
insert into ab values(1,1),(1,2),(2,1),(2,2);

# select distinct on (a) a, b, generate_series(1,2) g from ab order by a, b;
 a | b | g
---+---+---
 1 | 1 | 1
 1 | 1 | 2
 2 | 1 | 1
 2 | 1 | 2
(4 rows)

# select distinct on (a) a, b, g from ab, lateral generate_series(1,2) as g order by a, b;
 a | b | g
---+---+---
 1 | 1 | 1
 2 | 1 | 1
(2 rows)

According to the doc these two queries are supposed to be 'almost
exactly the same'.  So it's weird to see they give different number of
output rows.

Should we also fix this issue?  If so it seems we need some changes
about postponing SRFs in make_sort_input_target().

Thanks
Richard
On Tue, 31 Jan 2023 at 11:53, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> This bug seems to have slipped off the radar screen, but it's still
> a bug.  I continue to believe that the best fix is to disallow SRFs
> in window definitions, and present the trivial patch to do so.

I don't really object to doing this, but I'm not convinced that
someone else won't eventually come along complaining about us changing
this. As far as I'm aware, the behaviour is not wrong today provided
you always use the column so that remove_unused_subquery_outputs()
does not get rid of it.

If we disallow this, then maybe it's worth also giving ourselves a bit
more flexibility to make it work correctly in the future should
someone come along and convince us we shouldn't have disallowed this.
Adjusting  expression_returns_set() to pass it the relevant
PlannerInfo for the Node might be a good idea. If we change our minds
in the future then we could easily then modify
expression_returns_set_walker() to have it check if the given
WindowFunc's WindowClause has SRFs so that we don't remove the unused
WindowFunc subquery columns in remove_unused_subquery_outputs(). I'd
have probably gone with this as a master-only fix, but is quite likely
you're thinking of something that I'm not.  Certainly, the consistency
of when SRFs are evaluated is pretty terrible so I understand the
desire to disallow the WindowClause SRFs so we have one less thing to
get wrong.

David