Re: Rules and Views - Mailing list pgsql-hackers

From Stephan Szabo
Subject Re: Rules and Views
Date
Msg-id 20020801091507.Y29244-100000@megazone23.bigpanda.com
Whole thread Raw
In response to Re: Rules and Views  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Rules and Views  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Thu, 1 Aug 2002, Tom Lane wrote:

> Stephan Szabo <sszabo@megazone23.bigpanda.com> writes:
> > For union, queries that want to do something like use a temporary
> > sequence to act sort of like rownum and do row limiting.  Admittedly
> > that's already pretty much unspecified behavior, but it does change
> > the behavior in the place of duplicate removal.  In addition, I think
> > using bits of the spec we don't completely support you can have the
> > same issue with the undefined behavior of which duplicate is returned
> > for values that aren't the same but are equal, for example where the
> > duplicate removal is in one collation but the outer comparison has
> > a different explicitly given one.
>
> Hmm.  I think this consideration boils down to whether the WHERE clause
> can give different results for rows that appear equal under the rules of
> UNION/EXCEPT/INTERSECT.  If it gives the same result for any two such
> rows, then it's safe to push down; otherwise not.
>
> It's not too difficult to come up with examples.  I invite you to play
> with
>
> select z,length(z) from
> (select 'abc    '::char(7) as z intersect
> select 'abc    '::char(8) as z) ss;
>
> and contemplate the effects of pushing down a qual involving length(z).
>
> Whether this particular case is very important in the real world is hard
> to say.  But there might be more-important cases out there.
>
> And yet, I think we can do it anyway.  The score card looks like this to
> me:
>
> UNION ALL: always safe to push down, since the rows will be passed
> independently to the outer WHERE anyway.
>
> UNION: it's unspecified which of a set of "equal" rows will be returned,
> and therefore the behavior would be unspecified anyway if the outer
> WHERE can distinguish the rows - you might get 1 row of the set out or
> none.  If we push down, then we create a situation where the returned
> row will always be one that passes the outer WHERE, but that is a legal
> behavior.
>
> INTERSECT: again it's unspecified which of a set of "equal" rows will be
> returned, and so you might get 1 row out or none.  If we push down then
> it's still unspecified whether you get a row out (example: if the outer
> WHERE will pass only for rows of the left table and not the right, then
> push down will result in no rows of the "equal" set being emitted, but
> that's a legal behavior).
>
> INTERSECT ALL: if a set of "equal" rows contains M rows from the left
> table and N from the right table, you're supposed to get min(M,N) rows
> of the set out of the INTERSECT ALL.  Again you can't say which of the
> set you will get, so the outer WHERE might let anywhere between 0 and
> min(M,N) rows out.  With push down, M and N will be reduced by the WHERE
> before we do the intersection, so you still have 0 to min(M,N) rows out.
> The behavior will change, but it's still legal per spec AFAICT.
>

> EXCEPT, EXCEPT ALL: the same sort of analysis seems to hold.

Actually I think in except you may only push down to the left, since in
this case you know that any duplicate from the right will not be
returned (since there must be none).  So, you can't potentially drop
a row from the right side that may have been a duplicate of a left
side row that does match the condition.

If we assume two collations one case sensitive one not with the
except in the non-sensitive and the where in the sensitive and
a left with 'A' and right with 'a', it'd be incorrect to push a
case sensitive where foo='A' down to the right since that'd change the
output from zero rows to one.

Something similar for except all since lowering the number of rows
on the right can increase the number of returned rows above
m-n (if say all m dups match the condition and none of n do)


> The above analysis assumes that the WHERE condition is "stable", ie its
> results for a row don't depend on the order in which the rows are tested
> or anything as weird as that.  But we're assuming that already when we
> push down a qual in a non-set-operation case, I think.

In which case we don't have to worry about the nextval() case.



pgsql-hackers by date:

Previous
From: Lamar Owen
Date:
Subject: Re: Trimming the Fat, Part Deux ...
Next
From: Bruce Momjian
Date:
Subject: Re: Open 7.3 items