Re: BUG #8396: Window function results differ when selecting from table and view, with where clause - Mailing list pgsql-bugs

From Tom Lane
Subject Re: BUG #8396: Window function results differ when selecting from table and view, with where clause
Date
Msg-id 20121.1377447767@sss.pgh.pa.us
Whole thread Raw
In response to BUG #8396: Window function results differ when selecting from table and view, with where clause  (paul@weotta.com)
List pgsql-bugs
paul@weotta.com writes:
> When I select from a view, the where clause in my select statement does not
> restrict the rows processed by window functions referenced inside the view
> definition. Thus, if window functions are involved, using a where clause
> when selecting from a view and using a where clause when selecting directly
> from the underlying table produces different results.

I don't see anything even a little bit surprising about this.  A WHERE
clause applied to a view should hide some rows that would appear in the
view output without it, but it should certainly never change the contents
of the rows it does let through.  Now, if you put the WHERE clause before
the window functions (and WHERE is semantically prior to the evaluation of
a SELECT list at the same syntactic level), then the WHERE clause does
filter the rows before the window functions see 'em.  But a WHERE clause
supplied at an outer syntactic level acts after the window functions are
processed.

Another way to put it is that your query with the view is equivalent to

select
  *
from ( select
  plant_id as plant_id
, row_number() over ( partition by city order by plant_id ) as plant_number
, count(*) over ( partition by city ) as of_plants_in_city
, city as city
from plant ) as plant_extend
where
  plant_id = 'FOLSOM'
;

which is not at all the same thing as

select
  plant_id as plant_id
, row_number() over ( partition by city order by plant_id ) as plant_number
, count(*) over ( partition by city ) as of_plants_in_city
, city as city
from plant
where
  plant_id = 'FOLSOM'
;

precisely because the former specifies applying the WHERE filter after
the window functions run, while the latter specifies applying it before.

The fact that the results do change when you put the WHERE restriction
before the window functions is exactly why the optimization discussed
in that stackoverflow thread you mention isn't made.  In some cases it
would theoretically be possible to prove that moving the filtering
wouldn't change the results, but Postgres isn't smart enough to do that
--- and even if it were, it would not push down the WHERE clause in either
this example or the stackoverflow one, because it could/would change the
results.

Or in short, this isn't a bug, but a counterexample to the stackoverflow
discussion.

            regards, tom lane

pgsql-bugs by date:

Previous
From: TAKATSUKA Haruka
Date:
Subject: Re: BUG #8397: pg_basebackup -x from new standby server sometimes causes Segmentation fault
Next
From: Magnus Hagander
Date:
Subject: Re: BUG #8397: pg_basebackup -x from new standby server sometimes causes Segmentation fault