Thread: Does psql evaluate OR conditions in order?
Postgres 9.1 Ubuntu something. We have a stored procedure that performs the following evaluations: IF ST_Intersects(line1, line2) OR ST_Intersects(line3, line4) THEN do stuff ELSIF ST_Intersects(line5, line6) OR ST_Intersects(line7, line8) THEN do stuffB END IF; 98% of the time the 2nd half of these IF statements will never be true. ST_Intersects is an expensive operations to perform. Given we're executing this across a lot of data... avoiding expensive calls is a good idea. So, can we count on the evaluation of the 1st half of the IF's being done first before trying the 2nd half, or should we re-write the IF/ELSE to IF ST_Intersects(line1, line2) THEN do stuff ELSIF ST_Intersects(line5, line6) THEN do stuffB ELSIF ST_Intersects(line3, line4) THEN do stuff ELSIF ST_Intersects(line7, line8) THEN do stuffB END IF; Roxanne
HideMe <r36@tara-lu.com> writes: > Postgres 9.1 > Ubuntu something. > We have a stored procedure that performs the following evaluations: > IF ST_Intersects(line1, line2) OR ST_Intersects(line3, line4) THEN > do stuff > ELSIF ST_Intersects(line5, line6) OR ST_Intersects(line7, line8) THEN > do stuffB > END IF; > 98% of the time the 2nd half of these IF statements will never be true. > ST_Intersects is an expensive operations to perform. > Given we're executing this across a lot of data... avoiding expensive > calls is a good idea. > So, can we count on the evaluation of the 1st half of the IF's being > done first before trying the 2nd half, In principle the order of ORs is not guaranteed, but that mainly has to do with execution of query WHERE clauses, wherein the optimizer will feel free to move sub-clauses to different relations, join levels, etc. I think you're reasonably safe to assume that OR is left-to-right in this simple context. > or should we re-write the IF/ELSE to > IF ST_Intersects(line1, line2) THEN > do stuff > ELSIF ST_Intersects(line5, line6) THEN > do stuffB > ELSIF ST_Intersects(line3, line4) THEN > do stuff > ELSIF ST_Intersects(line7, line8) THEN > do stuffB > END IF; That doesn't look like it produces quite the same results ... regards, tom lane
On 1/16/2013 4:40 PM, Tom Lane wrote: > HideMe <r36@tara-lu.com> writes: >> Postgres 9.1 >> Ubuntu something. >> We have a stored procedure that performs the following evaluations: >> IF ST_Intersects(line1, line2) OR ST_Intersects(line3, line4) THEN >> do stuff >> ELSIF ST_Intersects(line5, line6) OR ST_Intersects(line7, line8) THEN >> do stuffB >> END IF; > In principle the order of ORs is not guaranteed, but that mainly has > to do with execution of query WHERE clauses, wherein the optimizer > will feel free to move sub-clauses to different relations, join > levels, etc. I think you're reasonably safe to assume that OR is > left-to-right in this simple context. Thanks Tom... I knew the WHERE clause scenario was not guaranteed. I just didn't know if the psql stored procedure conditional tests behaved like C or ... not. > IF ST_Intersects(line1, line2) THEN > do stuff > ELSIF ST_Intersects(line5, line6) THEN > do stuffB > ELSIF ST_Intersects(line3, line4) THEN > do stuff > ELSIF ST_Intersects(line7, line8) THEN > do stuffB > END IF; > That doesn't look like it produces quite the same results ... > That would be true if you absolutely need the order initially given - but in this case each ST_Intersects is a test for invalid data. This is a collegue's sample, I asked hm ahead of time - order doesn't matter. if any of the 4 conditions results, then invalid data exists in the scenario and we have to do something about it... but only 1 condition should ever be true. again, thanks Tom. Roxanne
On Wednesday, January 16, 2013 at 08:40, Tom Lane wrote: >HideMe (Robert James) writes: >>Postgres 9.1 >>Ubuntu something. > >>We have a stored procedure that performs the following evaluations: > >>IF ST_Intersects(line1, line2) OR ST_Intersects(line3, line4) THEN >>do stuff >>ELSIF ST_Intersects(line5, line6) OR ST_Intersects(line7, line8) THEN >>do stuffB >>END IF; > >>98% of the time the 2nd half of these IF statements will never be true. >>ST_Intersects is an expensive operations to perform. > >>Given we're executing this across a lot of data... avoiding >>expensive calls is a good idea. >>So, can we count on the evaluation of the 1st half of the IF's >>being done first before trying the 2nd half, > >In principle the order of ORs is not guaranteed, but that mainly has to >do with execution of query WHERE clauses, wherein the optimizer will >feel free to move sub-clauses to different relations, join levels, etc. >I think you're reasonably safe to assume that OR is left-to-right in >this simple context. > >>or should we re-write the IF/ELSE to > >>IF ST_Intersects(line1, line2) THEN >>do stuff >>ELSIF ST_Intersects(line5, line6) THEN >>do stuffB >>ELSIF ST_Intersects(line3, line4) THEN >>do stuff >>ELSIF ST_Intersects(line7, line8) THEN >>do stuffB >>END IF; > >That doesn't look like it produces quite the same results ... > Agree. Unless the OP has prior (undeclared) knowledge that only one of the four intersect conditions can ever be true and has just put his majority cases first. AFAICT, if this is the case, the proposed sequence would achieve the optimisations wanted. More generally... In situations like this I keep having to pinch myself and repeat "SQL is not C" <http://www.postgresql.org/docs/9.1/static/sql-expressions.html#SYNTAX-EXPRESS-EVAL> 4.2.14. Expression Evaluation Rules "The order of evaluation of subexpressions is not defined. In particular, the inputs of an operator or function are not necessarily evaluated left-to-right or in any other fixed order. "Furthermore, if the result of an expression can be determined by evaluating only some parts of it, then other subexpressions might not be evaluated at all. ..." So evaluation order is not deterministic, and might (i.e., may not) show "lazy evaluation". So, as Tom says, it's "reasonably safe to assume", left to right evaluation, but this is only for performance, since getting the correct answer should not rely on this. (Think: really hard to trace/detect intermittent errors.) And, the optimiser might still plan to evaluate the second expression despite having a result in hand. So far fairly obvious (even if it does lack commitment :-) May I extend/generalize the question to confirm the 9.1+ optimiser is taking the function cost parameter into account when planning. Specifically would the optimiser be much more likely (near certain?) to plan for "lazy" evaluation if the function was labelled as very expensive? And, does it matter if "very expensive" is something like 100K just to push the point? Regards Gavan Schneider