Thread: wierd AND condition evaluation for plpgsql
Hi, I just noticed plpgsql evaluates all AND'ed conditions even if the first one fails. Example: elsif TG_OP = ''UPDATE'' and old.type_reponse = ''abandon'' This will break stuff if the trigger is used on INSERT as "old.type_reponse" will be substituted and return an error. Shouldn't plpgsql shortcut AND conditions when a previous one fails, as perl does? -- OENONE: Quoi ? PHEDRE: Je te l'ai prédit, mais tu n'as pas voulu. (Phèdre,J-B Racine, acte 3, scène 3)
Actually, at least in some cases, PG does short-circuit logic: create function seeme() returns bool as ' begin raise notice ''seeme''; return true; end' language plpgsql; joel@joel=# select false and seeme();?column? ----------f (1 row) joel@joel=# select true and seeme(); NOTICE: seeme?column? ----------t (1 row) In your case, the problem is short-circuiting a test, it's that the full statement must be parsed and prepared, and it's probably in this stage that the illegal use of old. in an insert jumps up. HTH. Joel BURTON | joel@joelburton.com | joelburton.com | aim: wjoelburton Knowledge Management & Technology Consultant > -----Original Message----- > From: pgsql-hackers-owner@postgresql.org > [mailto:pgsql-hackers-owner@postgresql.org]On Behalf Of Louis-David > Mitterrand > Sent: Tuesday, May 28, 2002 3:21 AM > To: pgsql-hackers@postgresql.org > Subject: [HACKERS] wierd AND condition evaluation for plpgsql > > > > Hi, > > I just noticed plpgsql evaluates all AND'ed conditions even if the first > one fails. Example: > > elsif TG_OP = ''UPDATE'' and old.type_reponse = ''abandon'' > > This will break stuff if the trigger is used on INSERT as > "old.type_reponse" will be substituted and return an error. > > Shouldn't plpgsql shortcut AND conditions when a previous one fails, as > perl does? > > -- > OENONE: Quoi ? > PHEDRE: Je te l'ai prédit, mais tu n'as pas voulu. > (Phèdre, J-B Racine, > acte 3, scène 3) > > ---------------------------(end of broadcast)--------------------------- > TIP 3: if posting/reading through Usenet, please send an appropriate > subscribe-nomail command to majordomo@postgresql.org so that your > message can get through to the mailing list cleanly >
Louis-David Mitterrand <vindex@apartia.org> writes: > I just noticed plpgsql evaluates all AND'ed conditions even if the first > one fails. Example: > elsif TG_OP = ''UPDATE'' and old.type_reponse = ''abandon'' > This will break stuff if the trigger is used on INSERT as > "old.type_reponse" will be substituted and return an error. I think you are confusing "evaluation" with "syntax checking". Try putting the reference to OLD inside a nested IF command. regards, tom lane
Louis-David Mitterrand writes: > Shouldn't plpgsql shortcut AND conditions when a previous one fails, as > perl does? Shouldn't perl evaluate all operands unconditionally, like plpgsql does? Seriously, if you want to change this you have to complain to the SQL standards committee. -- Peter Eisentraut peter_e@gmx.net
On Tue, 2002-05-28 at 21:52, Peter Eisentraut wrote: > Louis-David Mitterrand writes: > > > Shouldn't plpgsql shortcut AND conditions when a previous one fails, as > > perl does? > > Shouldn't perl evaluate all operands unconditionally, like plpgsql does? > > Seriously, if you want to change this you have to complain to the SQL > standards committee. Is plpgsl a SQL standards committee standard ? and is the following non-standard ? (itest is a 16k row test table with i in 1-16k) hannu=# create sequence itest_seq; CREATE hannu=# select nextval('itest_seq');nextval --------- 1 (1 row) hannu=# select count(*) from itest where false and true;count ------- 0 (1 row) hannu=# select count(*) from itest where false and i = nextval('itest_seq');count ------- 0 (1 row) hannu=# select nextval('itest_seq');nextval --------- 2 (1 row) hannu=# select count(*) from itest where i = nextval('itest_seq');count ------- 0 (1 row) hannu=# select nextval('itest_seq');nextval --------- 16387 (1 row) --------------------- Hannu
> -----Original Message----- > From: pgsql-hackers-owner@postgresql.org > [mailto:pgsql-hackers-owner@postgresql.org]On Behalf Of Peter Eisentraut > Sent: Tuesday, May 28, 2002 12:53 PM > To: Louis-David Mitterrand > Cc: pgsql-hackers@postgresql.org > Subject: Re: [HACKERS] wierd AND condition evaluation for plpgsql > > > Louis-David Mitterrand writes: > > > Shouldn't plpgsql shortcut AND conditions when a previous one fails, as > > perl does? > > Shouldn't perl evaluate all operands unconditionally, like plpgsql does? > > Seriously, if you want to change this you have to complain to the SQL > standards committee. Peter -- But PG does short-circuit for evaluation, doesn't it? His question was confusing evaluation versus syntax checking and statement preparation. create function seeme() returns bool as ' begin raise notice ''seeme''; return true; end' language plpgsql; joel@joel=# select false and seeme();?column? ----------f (1 row) joel@joel=# select true and seeme(); NOTICE: seeme?column? ----------t (1 row) It certainly appears to be short circuiting for "select false and seeme()", for instance. It appears that this isn't short-circuiting by order of expressions, however (as Perl and other languages do); for example, "select seeme() or true" doesn't ever get to seeme(). I assume PG can simply see that the statement "true" will evaluate to true (clever, that PG!), and therefore it doesn't have to evaluate seeme() ? - J. Joel BURTON | joel@joelburton.com | joelburton.com | aim: wjoelburton Knowledge Management & Technology Consultant
On Wed, 2002-05-29 at 02:36, Joel Burton wrote: > > -----Original Message----- > joel@joel=# select true and seeme(); > NOTICE: seeme > ?column? > ---------- > t > (1 row) > > > It certainly appears to be short circuiting for "select false and seeme()", > for instance. > > It appears that this isn't short-circuiting by order of expressions, however > (as Perl and other languages do); for example, "select seeme() or true" > doesn't ever get to seeme(). I assume PG can simply see that the statement > "true" will evaluate to true (clever, that PG!), and therefore it doesn't > have to evaluate seeme() ? Are these intricacies of SQL standardised anywhere ? I know that gcc and other ccs can achieve different results depending on optimisation level - usually this is considered a bug. But as PG runs always (?) at the maximum optimisation, should there be such guarantees ? Or is it something that should be ind doc's/faq's (- don't rely on side effects) ? ------------------------ Hannu
Hannu Krosing <hannu@tm.ee> writes: > Are these intricacies of SQL standardised anywhere ? SQL92 section 3.3.4.4, "rule evaluation order" appears to sanction PG's behavior. In particular note the part that says syntax rules and access rules are "effectively applied at the same time" (ie, this checking is done before execution starts --- that legitimizes the error originally complained of) and the parts that say that inessential portions of expressions need not be evaluated and that implementations are not required to perform evaluations strictly left-to-right. 3.3.4.4 Rule evaluation order A conforming implementation is not required to perform the exact sequence of actions defined in the GeneralRules, but shall achieve the same effect on SQL-data and schemas as that sequence. The term effectivelyis used to emphasize actions whose effect might be achieved in other ways by an implementation. The Syntax Rules and Access Rules for contained syntactic elements are effectively applied at the same timeas the Syntax Rules and Access Rules for the containing syntactic elements. The General Rules for containedsyntactic elements are effectively applied be- fore the General Rules for the containing syntactic elements.Where the precedence of operators is determined by the Formats of this International Standard or byparentheses, those operators are ef- fectively applied in the order specified by that precedence. Where theprecedence is not determined by the Formats or by parentheses, effective evaluation of expressions is generallyperformed from left to right. However, it is implementation-dependent whether ex- pressions are actuallyevaluated left to right, particularly when operands or operators might cause conditions to be raised or if the results of the expressions can be determined without completely evaluating all parts of the expression.In general, if some syn- tactic element contains more than one other syntactic element, then theGeneral Rules for contained elements that appear earlier in the production for the containing syntactic elementare applied before the General Rules for contained elements that appear later. For example, in the production: <A> ::= <B> <C> the Syntax Rules and Access Rules for <A>, <B>, and <C> are ef- fectively applied simultaneously. The GeneralRules for <B> are applied before the General Rules for <C>, and the General Rules for <A> are appliedafter the General Rules for both <B> and <C>. If the result of an expression or search condition can be deter- mined without completely evaluating all partsof the expression or search condition, then the parts of the expression or search condi- tion whose evaluationis not necessary are called the inessential parts. If the Access Rules pertaining to inessential parts arenot satisfied, then the syntax error or access rule violation exception condition is raised regardless ofwhether or not the inessential parts are actually evaluated. If evaluation of the inessential parts wouldcause an exception condition to be raised, then it is implementation-dependent whether or not that exceptioncondition is raised. regards, tom lane
On Tue, 2002-05-28 at 16:09, Joel Burton wrote: > Actually, at least in some cases, PG does short-circuit logic: > joel@joel=# select false and seeme(); > joel@joel=# select true and seeme(); If seeme() returns NULL, shouldn't both SELECTs return NULL, and therefore not be short-circuit-able? Sorry, I am a little confused. -- Alessio F. Bragadini alessio@albourne.com APL Financial Services http://village.albourne.com Nicosia, Cyprus phone: +357-22-755750 "It is more complicated than you think" -- The Eighth Networking Truth from RFC 1925
> -----Original Message----- > From: pgsql-hackers-owner@postgresql.org > [mailto:pgsql-hackers-owner@postgresql.org]On Behalf Of Alessio > Bragadini > Sent: Thursday, May 30, 2002 9:04 AM > To: PostgreSQL Hackers > Subject: Re: [HACKERS] wierd AND condition evaluation for plpgsql > > > On Tue, 2002-05-28 at 16:09, Joel Burton wrote: > > > Actually, at least in some cases, PG does short-circuit logic: > > > joel@joel=# select false and seeme(); > > > joel@joel=# select true and seeme(); > > If seeme() returns NULL, shouldn't both SELECTs return NULL, and > therefore not be short-circuit-able? > > Sorry, I am a little confused. In my example, seeme() returns true, not NULL. However, the short-circuiting came from the other part (the simple true or false) being evaluated first. So, regardless of the returned value of seeme(), "SELECT FALSE AND seeme()" would short-circuit, since "FALSE AND ___" can never be true. Of course, if seemme() returns NULL, then the end result would be false. - J. Joel BURTON | joel@joelburton.com | joelburton.com | aim: wjoelburton Knowledge Management & Technology Consultant
"Joel Burton" <joel@joelburton.com> writes: >>> Actually, at least in some cases, PG does short-circuit logic: >>> joel@joel=# select false and seeme(); >>> joel@joel=# select true and seeme(); >> If seeme() returns NULL, shouldn't both SELECTs return NULL, and >> therefore not be short-circuit-able? > In my example, seeme() returns true, not NULL. However, the short-circuiting > came from the other part (the simple true or false) being evaluated first. > So, regardless of the returned value of seeme(), "SELECT FALSE AND seeme()" > would short-circuit, since "FALSE AND ___" can never be true. Yes. Per the SQL standard, some cases involving AND and OR can be simplified without evaluating all the arguments, and PG uses this flexibility to the hilt. You might care to read eval_const_expressions() in src/backend/optimizer/util/clauses.c. Some relevant tidbits: * Reduce any recognizably constant subexpressions of the given* expression tree, for example "2 + 2" => "4". More interestingly,*we can reduce certain boolean expressions even when they contain* non-constant subexpressions: "x OR true"=> "true" no matter what* the subexpression x is. (XXX We assume that no such subexpression* will have important side-effects,which is not necessarily a good* assumption in the presence of user-defined functions; do we need a* pg_procflag that prevents discarding the execution of a function?) * We do understand that certain functions may deliver non-constant* results even with constant inputs, "nextval()" beingthe classic* example. Functions that are not marked "immutable" in pg_proc* will not be pre-evaluated here, althoughwe will reduce their* arguments as far as possible. * OR arguments are handled as follows: * non constant: keep * FALSE: drop (does not affectresult) * TRUE: force result to TRUE * NULL: keep only one * We keep one NULL input becauseExecEvalOr returns NULL * when no input is TRUE and at least one is NULL. * AND arguments are handled as follows: * non constant: keep * TRUE: drop (does not affectresult) * FALSE: force result to FALSE * NULL: keep only one * We keep one NULL inputbecause ExecEvalAnd returns NULL * when no input is FALSE and at least one is NULL. Other relevant manipulations include canonicalize_qual() in src/backend/optimizer/prep/prepqual.c (tries to convert boolean WHERE expressions to normal form by application of DeMorgan's laws) and for that matter the entire planner --- the fact that we have a choice of execution plans at all really comes from the fact that we are allowed to evaluate WHERE clauses in any order. So there's not likely to be much support for any proposal that we constrain the evaluation order or guarantee the evaluation or non-evaluation of specific clauses in WHERE. (The XXX comment above is an idle aside, not something that is likely to really happen.) regards, tom lane
> -----Original Message----- > From: Tom Lane [mailto:tgl@sss.pgh.pa.us] > Sent: Thursday, May 30, 2002 10:44 AM > To: Joel Burton > Cc: Alessio Bragadini; PostgreSQL Hackers > Subject: Re: [HACKERS] wierd AND condition evaluation for plpgsql > > > "Joel Burton" <joel@joelburton.com> writes: > >>> Actually, at least in some cases, PG does short-circuit logic: > >>> joel@joel=# select false and seeme(); > >>> joel@joel=# select true and seeme(); > > >> If seeme() returns NULL, shouldn't both SELECTs return NULL, and > >> therefore not be short-circuit-able? > > > In my example, seeme() returns true, not NULL. However, the > short-circuiting > > came from the other part (the simple true or false) being > evaluated first. > > So, regardless of the returned value of seeme(), "SELECT FALSE > AND seeme()" > > would short-circuit, since "FALSE AND ___" can never be true. > > Yes. Per the SQL standard, some cases involving AND and OR can be > simplified without evaluating all the arguments, and PG uses this > flexibility to the hilt. You might care to read eval_const_expressions() > in src/backend/optimizer/util/clauses.c. Some relevant tidbits: > > * Reduce any recognizably constant subexpressions of the given > * expression tree, for example "2 + 2" => "4". More interestingly, > * we can reduce certain boolean expressions even when they contain > * non-constant subexpressions: "x OR true" => "true" no matter what > * the subexpression x is. (XXX We assume that no such subexpression > * will have important side-effects, which is not necessarily a good > * assumption in the presence of user-defined functions; do we need a > * pg_proc flag that prevents discarding the execution of a function?) > > * We do understand that certain functions may deliver non-constant > * results even with constant inputs, "nextval()" being the classic > * example. Functions that are not marked "immutable" in pg_proc > * will not be pre-evaluated here, although we will reduce their > * arguments as far as possible. > > ... > > Other relevant manipulations include canonicalize_qual() in > src/backend/optimizer/prep/prepqual.c (tries to convert boolean > WHERE expressions to normal form by application of DeMorgan's laws) > and for that matter the entire planner --- the fact that we have > a choice of execution plans at all really comes from the fact that > we are allowed to evaluate WHERE clauses in any order. So there's > not likely to be much support for any proposal that we constrain the > evaluation order or guarantee the evaluation or non-evaluation of > specific clauses in WHERE. (The XXX comment above is an idle aside, > not something that is likely to really happen.) Thanks, Tom, for the pointers to the full story. Is there any generalizable help would could offer to people who write functions that have side effects? Don't use them in WHERE (or ON or HAVING) clauses? Evaluate the function in a earlier db call, then plug the resolved results into the SQL WHERE statement? I've lived without having this bite me; I'd think that side-effect functions would be unusual in a WHERE clause. I'm just wondering if we should work this into the docs somewhere. (Or is it? I took a look, but didn't see anything). - J. Joel BURTON | joel@joelburton.com | joelburton.com | aim: wjoelburton Knowledge Management & Technology Consultant
"Joel Burton" <joel@joelburton.com> writes: > Is there any generalizable help would could offer to people who write > functions that have side effects? Don't use them in WHERE (or ON or HAVING) > clauses? Evaluate the function in a earlier db call, then plug the resolved > results into the SQL WHERE statement? Certainly putting side-effects into WHERE clauses is a recipe for trouble, and it'd not be a bad idea to point that out in the docs. (I don't think it is mentioned at the moment.) When you really need to control order of evaluation, you can do it using CASE or by pushing the whole expression into a function. But these defeat optimization so should be avoided if possible. regards, tom lane
Joel Burton writes: > I've lived without having this bite me; I'd think that side-effect functions > would be unusual in a WHERE clause. I'm just wondering if we should work > this into the docs somewhere. (Or is it? I took a look, but didn't see > anything). I've written up a section about it which I'll check in momentarily. -- Peter Eisentraut peter_e@gmx.net