Re: Rules and Views - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Rules and Views
Date
Msg-id 7905.1028217750@sss.pgh.pa.us
Whole thread Raw
In response to Re: Rules and Views  (Stephan Szabo <sszabo@megazone23.bigpanda.com>)
Responses Re: Rules and Views  (Stephan Szabo <sszabo@megazone23.bigpanda.com>)
Re: Rules and Views  (Hannu Krosing <hannu@tm.ee>)
List pgsql-hackers
Stephan Szabo <sszabo@megazone23.bigpanda.com> writes:
> For union, queries that want to do something like use a temporary
> sequence to act sort of like rownum and do row limiting.  Admittedly
> that's already pretty much unspecified behavior, but it does change
> the behavior in the place of duplicate removal.  In addition, I think
> using bits of the spec we don't completely support you can have the
> same issue with the undefined behavior of which duplicate is returned
> for values that aren't the same but are equal, for example where the
> duplicate removal is in one collation but the outer comparison has
> a different explicitly given one.

Hmm.  I think this consideration boils down to whether the WHERE clause
can give different results for rows that appear equal under the rules of
UNION/EXCEPT/INTERSECT.  If it gives the same result for any two such
rows, then it's safe to push down; otherwise not.

It's not too difficult to come up with examples.  I invite you to play
with

select z,length(z) from
(select 'abc    '::char(7) as z intersect
select 'abc    '::char(8) as z) ss;

and contemplate the effects of pushing down a qual involving length(z).

Whether this particular case is very important in the real world is hard
to say.  But there might be more-important cases out there.

And yet, I think we can do it anyway.  The score card looks like this to
me:

UNION ALL: always safe to push down, since the rows will be passed
independently to the outer WHERE anyway.

UNION: it's unspecified which of a set of "equal" rows will be returned,
and therefore the behavior would be unspecified anyway if the outer
WHERE can distinguish the rows - you might get 1 row of the set out or
none.  If we push down, then we create a situation where the returned
row will always be one that passes the outer WHERE, but that is a legal
behavior.

INTERSECT: again it's unspecified which of a set of "equal" rows will be
returned, and so you might get 1 row out or none.  If we push down then
it's still unspecified whether you get a row out (example: if the outer
WHERE will pass only for rows of the left table and not the right, then
push down will result in no rows of the "equal" set being emitted, but
that's a legal behavior).

INTERSECT ALL: if a set of "equal" rows contains M rows from the left
table and N from the right table, you're supposed to get min(M,N) rows
of the set out of the INTERSECT ALL.  Again you can't say which of the
set you will get, so the outer WHERE might let anywhere between 0 and
min(M,N) rows out.  With push down, M and N will be reduced by the WHERE
before we do the intersection, so you still have 0 to min(M,N) rows out.
The behavior will change, but it's still legal per spec AFAICT.

EXCEPT, EXCEPT ALL: the same sort of analysis seems to hold.

In short, it looks to me like the spec was carefully designed to allow
push down.  Pushing down a condition of this sort *does* change the
behavior, but the new behavior is still within spec.

The above analysis assumes that the WHERE condition is "stable", ie its
results for a row don't depend on the order in which the rows are tested
or anything as weird as that.  But we're assuming that already when we
push down a qual in a non-set-operation case, I think.

Comments?  Are there any other considerations to worry about?
        regards, tom lane


pgsql-hackers by date:

Previous
From: Andrew Sullivan
Date:
Subject: Re: WAL file location
Next
From: "Jeff MacDonald"
Date:
Subject: Re: Trimming the Fat, Part Deux ...