Thread: BUG #8396: Window function results differ when selecting from table and view, with where clause
BUG #8396: Window function results differ when selecting from table and view, with where clause
From
paul@weotta.com
Date:
The following bug has been logged on the website: Bug reference: 8396 Logged by: Paul M. Email address: paul@weotta.com PostgreSQL version: 9.3rc1 Operating system: Ubuntu Linux Description: When I select from a view, the where clause in my select statement does not restrict the rows processed by window functions referenced inside the view definition. Thus, if window functions are involved, using a where clause when selecting from a view and using a where clause when selecting directly from the underlying table produces different results. Without wanting to speculate on the cause of the differing results, I will say that this seems to be a case of an issue noted a year ago on Stack Overflow: "Will Postgres push down a WHERE clause into a VIEW with a Window Function (Aggregate)?" http://stackoverflow.com/questions/7533877/ At that time, responder Evan Carroll noted, "I can't think of anyway an un-referenced Window function can change the result if the WHERE was pushed down." This seems to be just such a case. Thanks in advance for looking into this. I've written a test case, which I hope will be helpful... create table plant ( plant_id character varying( 6 ) , city character varying( 25 ) , constraint p_pk_pid primary key ( plant_id ) ); insert into plant ( plant_id, city ) values ( '14 ST', 'San Francisco' ) , ( 'FOLSOM', 'San Francisco' ) , ( 'CHAVEZ', 'San Francisco' ) , ( 'HEINZ', 'Berkeley' ) ; create view plant_extend as select plant_id as plant_id , row_number() over ( partition by city order by plant_id ) as plant_number , count(*) over ( partition by city ) as of_plants_in_city , city as city from plant; -- Despite the where clause, the window functions see all 3 San Francisco plants: select * from plant_extend where plant_id = 'FOLSOM' ; -- But when the query is expressed this way, the window functions see only the Folsom Street plant: select plant_id as plant_id , row_number() over ( partition by city order by plant_id ) as plant_number , count(*) over ( partition by city ) as of_plants_in_city , city as city from plant where plant_id = 'FOLSOM' ;
Re: BUG #8396: Window function results differ when selecting from table and view, with where clause
From
Tom Lane
Date:
paul@weotta.com writes: > When I select from a view, the where clause in my select statement does not > restrict the rows processed by window functions referenced inside the view > definition. Thus, if window functions are involved, using a where clause > when selecting from a view and using a where clause when selecting directly > from the underlying table produces different results. I don't see anything even a little bit surprising about this. A WHERE clause applied to a view should hide some rows that would appear in the view output without it, but it should certainly never change the contents of the rows it does let through. Now, if you put the WHERE clause before the window functions (and WHERE is semantically prior to the evaluation of a SELECT list at the same syntactic level), then the WHERE clause does filter the rows before the window functions see 'em. But a WHERE clause supplied at an outer syntactic level acts after the window functions are processed. Another way to put it is that your query with the view is equivalent to select * from ( select plant_id as plant_id , row_number() over ( partition by city order by plant_id ) as plant_number , count(*) over ( partition by city ) as of_plants_in_city , city as city from plant ) as plant_extend where plant_id = 'FOLSOM' ; which is not at all the same thing as select plant_id as plant_id , row_number() over ( partition by city order by plant_id ) as plant_number , count(*) over ( partition by city ) as of_plants_in_city , city as city from plant where plant_id = 'FOLSOM' ; precisely because the former specifies applying the WHERE filter after the window functions run, while the latter specifies applying it before. The fact that the results do change when you put the WHERE restriction before the window functions is exactly why the optimization discussed in that stackoverflow thread you mention isn't made. In some cases it would theoretically be possible to prove that moving the filtering wouldn't change the results, but Postgres isn't smart enough to do that --- and even if it were, it would not push down the WHERE clause in either this example or the stackoverflow one, because it could/would change the results. Or in short, this isn't a bug, but a counterexample to the stackoverflow discussion. regards, tom lane