Thread: order of clauses
create table vals ( x float, y float ); insert into vals values (2,4); insert into vals values (2,2); insert into vals values (2,1); insert into vals values (2,0); select x/y from vals where y>0 and x/y>1; will give a divide by zero error as A=(y>0) and B=(x/y>1) can be evaluated in any order (A and B = B and A). I obviously would like (y>0) to happen first, but I don't see how this can be achieved.. Any ideas? Cheers, Patrick
Well, it doesn't solve the ordering question, but you could use a where something like this I guess: where y>0 and (x/(case when y=0 then 1 else y end))>1 On Wed, 14 Feb 2001, Patrick Welche wrote: > create table vals ( > x float, > y float > ); > insert into vals values (2,4); > insert into vals values (2,2); > insert into vals values (2,1); > insert into vals values (2,0); > select x/y from vals where y>0 and x/y>1; > > will give a divide by zero error as A=(y>0) and B=(x/y>1) can be evaluated in > any order (A and B = B and A). I obviously would like (y>0) to happen first, > but I don't see how this can be achieved.. Any ideas?
You didn't mention what version of Postgres, but in 7.1beta, you could do the following (pretty sure on the syntax): SELECT a.x/b.y FROM vals a, (SELECT y FROM vals WHERE y > 0) b WHERE (a.x / b.y) > 1; In anything else, you could try a view: CREATE VIEW valid_vals AS SELECT y FROM vals WHERE y > 0; SELECT a.x/b.y FROM vals a, valid_vals b WHERE (a.x / b.y) > 1 Michael Fork - CCNA - MCP - A+ Network Support - Toledo Internet Access - Toledo Ohio On Wed, 14 Feb 2001, Patrick Welche wrote: > create table vals ( > x float, > y float > ); > insert into vals values (2,4); > insert into vals values (2,2); > insert into vals values (2,1); > insert into vals values (2,0); > select x/y from vals where y>0 and x/y>1; > > will give a divide by zero error as A=(y>0) and B=(x/y>1) can be evaluated in > any order (A and B = B and A). I obviously would like (y>0) to happen first, > but I don't see how this can be achieved.. Any ideas? > > Cheers, > > Patrick >
Patrick Welche <prlw1@newn.cam.ac.uk> writes: > select x/y from vals where y>0 and x/y>1; > will give a divide by zero error as A=(y>0) and B=(x/y>1) can be evaluated in > any order (A and B = B and A). I obviously would like (y>0) to happen first, > but I don't see how this can be achieved.. Any ideas? Of course you can rewrite this particular case to avoid the division, but I suppose you are looking for a more general answer. Consider something like CASE WHEN y > 0 THEN x/y > 1 ELSE false END I think that right now, the planner gratuitously reverses the order of the WHERE clauses that it's unable to convert to index/join quals, thus your failure. So you could hack around the problem just by switching the two conditions. I've been meaning to try to figure out where the reversal is happening and undo it, however, so this behavior should not be considered to be documented/supported/guaranteed. regards, tom lane
: SELECT a.x/b.y FROM vals a, (SELECT y FROM vals WHERE y > 0) b WHERE (a.x : / b.y) > 1; How much of a performance hit is there when using a select in the FROM clause? Is it even noticeable? How much better is it to create a static view? -Dan
> > will give a divide by zero error as A=(y>0) and B=(x/y>1) can be evaluated in > > any order (A and B = B and A). I obviously would like (y>0) to happen first, > > but I don't see how this can be achieved.. Any ideas? I have one idea that would be nifty to implement. In some compilers, you can turn off complete boolean checking. As soon as any part of an expression will invalidate the expression, it stops evaluating all of it. That can help you avoid division by zero, and keeps you from evaluating parts of the expression that don't matter. It sounds like a good idea, at least to an ignoramus like me. : ) steve
Are you referring to short circuit? That's a language feature, isn't it? I didn't think it had anything to do with the compiler (I know C and a few other languages do it). Anyway, I could be wrong.. Seems that could break a lot of code if the programmer relies on short circuit in some conditional statements. if ( whatever() OR something() ) { blah(); } -- if "whatever" evaluates to true, then "something" isn't executed (the whole statement is true if one is true)... This really only comes into play when you're comparing the values returned by something (a method, function, etc), if you're just looking at boolean variable I guess it doesn't matter. -Mitch ----- Original Message ----- From: "Steve Wolfe" <steve@iboats.com> To: <pgsql-general@postgresql.org> Sent: Friday, February 16, 2001 3:10 PM Subject: Re: order of clauses > > > will give a divide by zero error as A=(y>0) and B=(x/y>1) can be > evaluated in > > > any order (A and B = B and A). I obviously would like (y>0) to happen > first, > > > but I don't see how this can be achieved.. Any ideas? > > I have one idea that would be nifty to implement. In some compilers, you > can turn off complete boolean checking. As soon as any part of an > expression will invalidate the expression, it stops evaluating all of it. > That can help you avoid division by zero, and keeps you from evaluating > parts of the expression that don't matter. It sounds like a good idea, at > least to an ignoramus like me. : ) > > steve > > >
Dan Wilson writes: > : SELECT a.x/b.y FROM vals a, (SELECT y FROM vals WHERE y > 0) b WHERE (a.x > : / b.y) > 1; > > How much of a performance hit is there when using a select in the FROM > clause? Is it even noticeable? How much better is it to create a static > view? Subselects in FROM are currently not the most performance-encouraging way to write a query (explicit JOINs might use more efficient plans), but setting up a view is going to buy you zero because a query from a view is just going to end up being processed like a subselect in FROM. -- Peter Eisentraut peter_e@gmx.net http://yi.org/peter-e/
Peter Eisentraut <peter_e@gmx.net> writes: > Subselects in FROM are currently not the most performance-encouraging way > to write a query (explicit JOINs might use more efficient plans), but > setting up a view is going to buy you zero because a query from a view is > just going to end up being processed like a subselect in FROM. Actually there's more to it than that. A view will indeed be expanded into something that looks exactly like a subselect-in-FROM (think of the view as a macro that gets expanded). But downstream of that, the planner will try to "pull up" the subselect into the main query if it's simple enough. If the pullup is successful, then there's no performance penalty to having written a view rather than an explicit join. What's more, because it's done that way, the same applies to explicitly written subselect-in-FROM. For example, consider this literal join: regression=# explain select * from int8_tbl a, int8_tbl b where a.q1=b.q2; NOTICE: QUERY PLAN: Merge Join (cost=2.22..2.34 rows=5 width=32) -> Sort (cost=1.11..1.11 rows=5 width=16) -> Seq Scan on int8_tbl a (cost=0.00..1.05 rows=5 width=16) -> Sort (cost=1.11..1.11 rows=5 width=16) -> Seq Scan on int8_tbl b (cost=0.00..1.05 rows=5 width=16) EXPLAIN If you recast this with a subselect, it still gets the same plan because the subselect is absorbed into the upper query: regression=# explain select * from int8_tbl a, regression-# (select * from int8_tbl) b where a.q1=b.q2; NOTICE: QUERY PLAN: Merge Join (cost=2.22..2.34 rows=5 width=32) -> Sort (cost=1.11..1.11 rows=5 width=16) -> Seq Scan on int8_tbl a (cost=0.00..1.05 rows=5 width=16) -> Sort (cost=1.11..1.11 rows=5 width=16) -> Seq Scan on int8_tbl (cost=0.00..1.05 rows=5 width=16) EXPLAIN However the planner is not currently bright enough to pull up a subquery with, say, an ORDER BY: regression=# explain select * from int8_tbl a, regression-# (select * from int8_tbl order by q1) b where a.q1=b.q2; NOTICE: QUERY PLAN: Merge Join (cost=2.27..2.40 rows=5 width=32) -> Sort (cost=1.11..1.11 rows=5 width=16) -> Seq Scan on int8_tbl a (cost=0.00..1.05 rows=5 width=16) -> Sort (cost=1.17..1.17 rows=5 width=16) -> Subquery Scan b (cost=1.11..1.11 rows=5 width=16) -> Sort (cost=1.11..1.11 rows=5 width=16) -> Seq Scan on int8_tbl (cost=0.00..1.05 rows=5 width=16) EXPLAIN regression=# In this case the outer join plan doesn't change, but it might if we were considering something where a nestloop with inner indexscan would have been the best plan. An indexscan has to be on a base relation, not on a SubqueryScan node, so the planner would fail to discover the best plan without the pullup. The results would be the same if I'd done CREATE VIEWS rather than writing explicit subselect-in-FROM. regards, tom lane