Re: Rules and Views - Mailing list pgsql-hackers
From | Zeugswetter Andreas SB SD |
---|---|
Subject | Re: Rules and Views |
Date | |
Msg-id | 46C15C39FEB2C44BA555E356FBCD6FA4961E38@m0114.s-mxs.net Whole thread Raw |
In response to | Rules and Views (Curt Sampson <cjs@cynic.net>) |
List | pgsql-hackers |
> Hmm. I think this consideration boils down to whether the WHERE clause > can give different results for rows that appear equal under the rules of > UNION/EXCEPT/INTERSECT. If it gives the same result for any two such > rows, then it's safe to push down; otherwise not. > > It's not too difficult to come up with examples. I invite you to play > with > > select z,length(z) from > (select 'abc '::char(7) as z intersect > select 'abc '::char(8) as z) ss; > > and contemplate the effects of pushing down a qual involving > length(z). I guess that is why e.g. Informix returns 3 for both of them. Imho that makes a lot of sense. The trailing spaces in char's are supposed to be irrellevant. (But iirc this has already been discussed and rejected) > Whether this particular case is very important in the real world is hard > to say. But there might be more-important cases out there. > > And yet, I think we can do it anyway. The score card looks > like this to > me: > > UNION ALL: always safe to push down, since the rows will be passed > independently to the outer WHERE anyway. Yes, that would imho also be the most important optimization. > UNION: it's unspecified which of a set of "equal" rows will be returned, > and therefore the behavior would be unspecified anyway if the outer > WHERE can distinguish the rows - you might get 1 row of the set out or > none. If we push down, then we create a situation where the returned > row will always be one that passes the outer WHERE, but that > is a legal behavior. > > INTERSECT: again it's unspecified which of a set of "equal" rows will be > returned, and so you might get 1 row out or none. If we push down then > it's still unspecified whether you get a row out (example: if the outer > WHERE will pass only for rows of the left table and not the right, then > push down will result in no rows of the "equal" set being emitted, but > that's a legal behavior). > > INTERSECT ALL: if a set of "equal" rows contains M rows from the left > table and N from the right table, you're supposed to get min(M,N) rows > of the set out of the INTERSECT ALL. Again you can't say which of the > set you will get, so the outer WHERE might let anywhere between 0 and > min(M,N) rows out. With push down, M and N will be reduced by the WHERE > before we do the intersection, so you still have 0 to > min(M,N) rows out. > The behavior will change, but it's still legal per spec AFAICT. > > EXCEPT, EXCEPT ALL: the same sort of analysis seems to hold. The imho difficult question is, which select locks down the datatype to use for this column. In a strict sense char(6) and char(7) are not the same type. Since I would certainly not want to be that strict, it imho has to be decided what type the union/intersect... is supposed to use. Informix converts them both to the longer char. I do not think it is valid to return variable length char's. e.g.: create table atab1 (a char(6)); create table atab2 (a char(8)); insert into atab1 values ('abc'); insert into atab2 values ('abc'); create view aview as select * from atab1 union all select * from atab2; select '<'||a||'>' from aview; Informix: (expression) <abc > <abc > PostgreSQL: ?column? ------------<abc ><abc > I am not sure eighter answer is strictly correct. I would probably have expected <abc > <abc > (char(6)) since the first select is supposed to lock down the type, no ? > In short, it looks to me like the spec was carefully designed to allow > push down. Pushing down a condition of this sort *does* change the > behavior, but the new behavior is still within spec. I think this would be a great performance boost for views and thus worth a change in results that are within spec. Would you want to push down always ? There could be outer where clauses, that are so expensive that you would not want to do them twice. If it is all or nothing, I do think pushing down always is better than not. Andreas
pgsql-hackers by date: