Thread: Heavy Function Optimisation
Hi, In a projet, I have an heavy fonction that double the time of the query. I was surprised because the function was IMMUTABLE but no cache happens. So I wrote a small test. test.sql --------------------------------------- \timing on CREATE OR REPLACE FUNCTION dum(a int) RETURNS int LANGUAGE SQL STRICT IMMUTABLE AS $$ SELECT pg_sleep(1); SELECT 1000+$1; $$; SELECT dum(a) FROM ( SELECT 1::int AS a UNION ALL SELECT 2::int AS a UNION ALL SELECT 2::int AS a UNION ALL SELECT 3::int AS a UNION ALL SELECT 3::int AS a UNION ALL SELECT 3::int AS a ) t; WITH data AS ( SELECT 1::int AS a UNION ALL SELECT 2::int AS a UNION ALL SELECT 2::int AS a UNION ALL SELECT 3::int AS a UNION ALL SELECT 3::int AS a UNION ALL SELECT 3::int AS a) ,map AS (SELECT a, dum(a) FROM data GROUP BY a) SELECT m.dum FROM data AS d JOIN map AS m ON d.a=m.a; --------------------------------------- test=# \i test.sql Timing is on. CREATE FUNCTION Time: 1.479 ms dum ------ 1001 1002 1002 1003 1003 1003 (6 rows) Time: 6084.172 ms a | dum ---+------ 1 | 1001 2 | 1002 2 | 1002 3 | 1003 3 | 1003 3 | 1003 (6 rows) Time: 3029.617 ms I was expecting the first query takes only 3 seconds, because I was (wrongly) thinking the results of the computation ofthe function computation was cached. So I emulate it with the WITH query to compute only one time by value the function dum. Do you think, this optimisation may be added to the optimizer ? -- Cordialement, Jean-Gérard Pailloncy
Hi, In PostgreSQL 9.2, I have the following behavior, and I found it strange. ps3 is executed or "never executed" ? !!! JG [postgres@]test=# create or replace function ps3(a int) returns int as $$ BEGIN RAISE WARNING 'Call ps3(%)=%',$1,$1; RETURN $1::int; END; $$ LANGUAGE plpgsql STRICT IMMUTABLE; CREATE FUNCTION Temps : 22,632 ms [postgres@]test=# select coalesce( (select ps3(1)), (SELECT ps3(2)) ); WARNING: Call ps3(1)=1 WARNING: Call ps3(2)=2 coalesce ---------- 1 (1 ligne) Temps : 0,692 ms [postgres@]test=# select coalesce( ps3(1), ps3(2) ); WARNING: Call ps3(1)=1 coalesce ---------- 1 (1 ligne) Temps : 0,441 ms [postgres@]test=# explain (analyze, verbose, buffers) select coalesce( (select ps3(1)), (SELECT ps3(2)) ); WARNING: Call ps3(1)=1 WARNING: Call ps3(2)=2 QUERY PLAN -------------------------------------------------------------------------------------------- Result (cost=0.02..0.03 rows=1 width=0) (actual time=0.006..0.006 rows=1 loops=1) Output: COALESCE($0, $1) InitPlan 1 (returns $0) -> Result (cost=0.00..0.01 rows=1 width=0) (actual time=0.001..0.001 rows=1 loops=1) Output: 1 InitPlan 2 (returns $1) -> Result (cost=0.00..0.01 rows=1 width=0) (never executed) Output: 2 Total runtime: 0.024 ms (9 lignes) Temps : 0,819 ms
> -----Original Message----- > From: pgsql-general-owner@postgresql.org [mailto:pgsql-general- > owner@postgresql.org] On Behalf Of jg > Sent: Friday, December 21, 2012 10:04 AM > To: pgsql-general@postgresql.org > Subject: [GENERAL] Coalesce bug ? > > Hi, > > In PostgreSQL 9.2, I have the following behavior, and I found it strange. > > ps3 is executed or "never executed" ? !!! > > JG > > [postgres@]test=# create or replace function ps3(a int) returns int as $$ > BEGIN RAISE WARNING 'Call ps3(%)=%',$1,$1; RETURN $1::int; END; $$ > LANGUAGE plpgsql STRICT IMMUTABLE; CREATE FUNCTION Temps : 22,632 > ms [postgres@]test=# select coalesce( (select ps3(1)), (SELECT ps3(2)) ); > WARNING: Call ps3(1)=1 > WARNING: Call ps3(2)=2 > coalesce > ---------- > 1 > (1 ligne) > > Temps : 0,692 ms > [postgres@]test=# select coalesce( ps3(1), ps3(2) ); > WARNING: Call ps3(1)=1 > coalesce > ---------- > 1 > (1 ligne) > > Temps : 0,441 ms > > [postgres@]test=# explain (analyze, verbose, buffers) select coalesce( > (select ps3(1)), (SELECT ps3(2)) ); > WARNING: Call ps3(1)=1 > WARNING: Call ps3(2)=2 > QUERY PLAN > > -------------------------------------------------------------------------------------------- > Result (cost=0.02..0.03 rows=1 width=0) (actual time=0.006..0.006 rows=1 > loops=1) > Output: COALESCE($0, $1) > InitPlan 1 (returns $0) > -> Result (cost=0.00..0.01 rows=1 width=0) (actual time=0.001..0.001 > rows=1 loops=1) > Output: 1 > InitPlan 2 (returns $1) > -> Result (cost=0.00..0.01 rows=1 width=0) (never executed) > Output: 2 > Total runtime: 0.024 ms > (9 lignes) > > Temps : 0,819 ms > You have defined the function as "IMMUTABLE". The system is allowed to cache the results of a given call (i.e. "ps3(2)")and return the value without actually executing the function ("never executed"). Your second example returns "1"without a warning regarding the "2" invocation due to this. The Query Plan you show also matches this behavior. I am curious as to why the Explain Analyze version has both warnings yet indicates that the cache was used. I would askthat you confirm that query plan shown was generated at the same time as the two warnings and that it is not a copy-and-paste/timingerror. While unusual the contract of IMMUTABLE does not supposedly preclude this mismatch. However,I have to leave it to more knowledgeable people to confirm, research, and explain this behavior. David J.
Hi, > In PostgreSQL 9.2, I have the following behavior, and I found it strange. Sorry the test was with 9.1.6 # psql -V psql (PostgreSQL) 9.1.6 JG
On 12/21/2012 07:03 AM, jg wrote: > Hi, > > In PostgreSQL 9.2, I have the following behavior, and I found it strange. > > ps3 is executed or "never executed" ? !!! > > JG > > [postgres@]test=# create or replace function ps3(a int) returns int as $$ BEGIN > RAISE WARNING 'Call ps3(%)=%',$1,$1; RETURN $1::int; > END; $$ LANGUAGE plpgsql STRICT IMMUTABLE; > CREATE FUNCTION > Temps : 22,632 ms > [postgres@]test=# select coalesce( (select ps3(1)), (SELECT ps3(2)) ); > WARNING: Call ps3(1)=1 > WARNING: Call ps3(2)=2 > coalesce > ---------- > 1 > (1 ligne) > > Temps : 0,692 ms > [postgres@]test=# select coalesce( ps3(1), ps3(2) ); > WARNING: Call ps3(1)=1 > coalesce > ---------- > 1 > (1 ligne) > > Temps : 0,441 ms Well I believe there is no bug. In the first case the nested parentheses mean the SELECT statements are run first and the COALESCE is run on the return results. In the second case COALESCE is working as advertised. Working left to right it finds the first non NULL argument and stops. -- Adrian Klaver adrian.klaver@gmail.com
On Fri, Dec 21, 2012 at 8:55 AM, jg <jg@rilk.com> wrote: > Hi, > > In a projet, I have an heavy fonction that double the time of the query. > I was surprised because the function was IMMUTABLE but no cache happens. > So I wrote a small test. > > test.sql > --------------------------------------- > \timing on > > CREATE OR REPLACE FUNCTION dum(a int) > RETURNS int > LANGUAGE SQL > STRICT IMMUTABLE > AS $$ > SELECT pg_sleep(1); > SELECT 1000+$1; > $$; > > SELECT dum(a) FROM ( > SELECT 1::int AS a UNION ALL > SELECT 2::int AS a UNION ALL > SELECT 2::int AS a UNION ALL > SELECT 3::int AS a UNION ALL > SELECT 3::int AS a UNION ALL > SELECT 3::int AS a > ) t; > > WITH data AS ( > SELECT 1::int AS a UNION ALL > SELECT 2::int AS a UNION ALL > SELECT 2::int AS a UNION ALL > SELECT 3::int AS a UNION ALL > SELECT 3::int AS a UNION ALL > SELECT 3::int AS a) > ,map AS (SELECT a, dum(a) FROM data GROUP BY a) > SELECT m.dum FROM data AS d JOIN map AS m ON d.a=m.a; > --------------------------------------- > > test=# \i test.sql > Timing is on. > CREATE FUNCTION > Time: 1.479 ms > dum > ------ > 1001 > 1002 > 1002 > 1003 > 1003 > 1003 > (6 rows) > > Time: 6084.172 ms > a | dum > ---+------ > 1 | 1001 > 2 | 1002 > 2 | 1002 > 3 | 1003 > 3 | 1003 > 3 | 1003 > (6 rows) > > Time: 3029.617 ms > > I was expecting the first query takes only 3 seconds, because I was (wrongly) thinking the results of the computation ofthe function computation was cached. > So I emulate it with the WITH query to compute only one time by value the function dum. > > Do you think, this optimisation may be added to the optimizer ? Probably not in the sense that you mean. IMMUTABLE functions don't mean the input to output values are cached. What it does mean is that the function can be used in cases where immutable semantics are required (like create index) and that, as with STABLE, the function call can be moved around so that more or less calls are made as long as the final results are the same. IMMUTABLE functions can also in some special cases be resolved at plan time so the results are reused if all the inputs are known. merlin
On Sat, Dec 22, 2012 at 2:25 AM, David Johnston <polobo@yahoo.com> wrote: > You have defined the function as "IMMUTABLE". The system is allowed to cache the results of a given call (i.e. "ps3(2)")and return the value without actually executing the function ("never executed"). Your second example returns "1"without a warning regarding the "2" invocation due to this. The Query Plan you show also matches this behavior. Further reading on this subject: http://www.postgresql.org/docs/9.1/static/xfunc-volatility.html See particularly the second paragraph after the bullet list. ChrisA
Hi, Please test this script on a PostgreSQL 9.1.6, create or replace function ps3(a int) returns int as $$ BEGIN RAISE WARNING 'Call ps3(%)=%',$1,$1; RETURN $1::int; END; $$ LANGUAGE plpgsql STRICT IMMUTABLE; SELECT ps3(1); SELECT ps3(2); select coalesce( (select ps3(1)), (SELECT ps3(2)) ); explain (verbose, analyze, buffers) select coalesce( (select ps3(1)), (SELECT ps3(2)) ); select coalesce( ps3(1), ps3(2) ); The result will be [postgres@]test=# create or replace function ps3(a int) returns int as $$ BEGIN test$# RAISE WARNING 'Call ps3(%)=%',$1,$1; RETURN $1::int; test$# END; $$ LANGUAGE plpgsql STRICT IMMUTABLE; CREATE FUNCTION Temps : 13,232 ms [postgres@]test=# SELECT ps3(1); WARNING: Call ps3(1)=1 ps3 ----- 1 (1 ligne) Temps : 0,975 ms [postgres@]test=# SELECT ps3(2); WARNING: Call ps3(2)=2 ps3 ----- 2 (1 ligne) Temps : 0,473 ms [postgres@]test=# select coalesce( (select ps3(1)), (SELECT ps3(2)) ); WARNING: Call ps3(1)=1 WARNING: Call ps3(2)=2 coalesce ---------- 1 (1 ligne) Temps : 0,681 ms [postgres@]test=# explain (verbose, analyze, buffers) select coalesce( (select ps3(1)), (SELECT ps3(2)) ); WARNING: Call ps3(1)=1 WARNING: Call ps3(2)=2 QUERY PLAN -------------------------------------------------------------------------------- ------------ Result (cost=0.02..0.03 rows=1 width=0) (actual time=0.004..0.004 rows=1 loops =1) Output: COALESCE($0, $1) InitPlan 1 (returns $0) -> Result (cost=0.00..0.01 rows=1 width=0) (actual time=0.000..0.000 rows =1 loops=1) Output: 1 InitPlan 2 (returns $1) -> Result (cost=0.00..0.01 rows=1 width=0) (never executed) Output: 2 Total runtime: 0.022 ms (9 lignes) Temps : 0,774 ms [postgres@]test=# select coalesce( ps3(1), ps3(2) ); WARNING: Call ps3(1)=1 coalesce ---------- 1 (1 ligne) Temps : 0,562 ms [postgres@]test=# There is a bug ! Thank you for the documentation link, but it does not help me. JG
Hi, Test done on PostgreSQL 9.2.1 pgb=# create or replace function ps3(a int) returns int as $$ BEGIN pgb$# RAISE WARNING 'Call ps3(%)=%',$1,$1; RETURN $1::int; pgb$# END; $$ LANGUAGE plpgsql STRICT IMMUTABLE; CREATE FUNCTION pgb=# SELECT ps3(1); WARNING: Call ps3(1)=1 ps3 ----- 1 (1 row) pgb=# SELECT ps3(2); WARNING: Call ps3(2)=2 ps3 ----- 2 (1 row) pgb=# select coalesce( (select ps3(1)), (SELECT ps3(2)) ); WARNING: Call ps3(1)=1 WARNING: Call ps3(2)=2 coalesce ---------- 1 (1 row) pgb=# explain (verbose, analyze, buffers) select coalesce( (select ps3(1)), (SELECT ps3(2)) ); WARNING: Call ps3(1)=1 WARNING: Call ps3(2)=2 QUERY PLAN -------------------------------------------------------------------------------------------- Result (cost=0.02..0.03 rows=1 width=0) (actual time=0.014..0.015 rows=1 loops=1) Output: COALESCE($0, $1) InitPlan 1 (returns $0) -> Result (cost=0.00..0.01 rows=1 width=0) (actual time=0.002..0.003 rows=1 loops=1) Output: 1 InitPlan 2 (returns $1) -> Result (cost=0.00..0.01 rows=1 width=0) (never executed) Output: 2 Total runtime: 0.079 ms (9 rows) pgb=# select coalesce( ps3(1), ps3(2) ); WARNING: Call ps3(1)=1 coalesce ---------- 1 (1 row) There is a bug too. JG
> -----Original Message----- > From: pgsql-general-owner@postgresql.org [mailto:pgsql-general- > owner@postgresql.org] On Behalf Of Adrian Klaver > Sent: Friday, December 21, 2012 10:27 AM > To: jg > Cc: pgsql-general@postgresql.org > Subject: Re: [GENERAL] Coalesce bug ? > > On 12/21/2012 07:03 AM, jg wrote: > > Hi, > > > > In PostgreSQL 9.2, I have the following behavior, and I found it strange. > > > > ps3 is executed or "never executed" ? !!! > > > > JG > > > > [postgres@]test=# create or replace function ps3(a int) returns int as > > $$ BEGIN RAISE WARNING 'Call ps3(%)=%',$1,$1; RETURN $1::int; END; $$ > > LANGUAGE plpgsql STRICT IMMUTABLE; CREATE FUNCTION Temps : 22,632 > ms > > [postgres@]test=# select coalesce( (select ps3(1)), (SELECT ps3(2)) ); > > WARNING: Call ps3(1)=1 > > WARNING: Call ps3(2)=2 > > coalesce > > ---------- > > 1 > > (1 ligne) > > > > Temps : 0,692 ms > > [postgres@]test=# select coalesce( ps3(1), ps3(2) ); > > WARNING: Call ps3(1)=1 > > coalesce > > ---------- > > 1 > > (1 ligne) > > > > Temps : 0,441 ms > > Well I believe there is no bug. > > In the first case the nested parentheses mean the SELECT statements are run > first and the COALESCE is run on the return results. The first case is: SELECT COALESCE( (SELECT), (SELECT) ); I am not sure exactly what the parentheses surrounding the scalar-sub-SELECTs do(turn them into anonymously typed rows?) but if the first scalar-sub-select results in a non-null result then the secondone should not be executed. Also, the Query Plan shown "never executed" the second scalar-sub-SELECT (from the samequery form) yet it knows that the result of the second call was "OUTPUT: 2" > > In the second case COALESCE is working as advertised. Working left to right it > finds the first non NULL argument and stops. > I thought that in order to call the Coalesce function the system would have to know the value of all parameters. There isno lazy instantiation in SQL. Both "SELECT" statements because they have to be run before the COALESCE function call beevaluated. Whether the ps3(?) function has to be executed then only depends on whether enough information exists in memoryto optimize the call away. David J.
On Sat, Dec 22, 2012 at 2:40 AM, jg <jg@rilk.com> wrote: > Thank you for the documentation link, but it does not help me. The documentation link states that a function with side effects *must* to be declared VOLATILE (or if you prefer, *not* declared STRICT or IMMUTABLE). Emitting warnings is a side effect; you're declaring your functions IMMUTABLE. According to the documentation, PostgreSQL is permitted, under the circumstances, to behave any way it likes - up to and including launching nuclear missiles and unleashing Terminators on the world. Does the problem disappear if the functions aren't IMMUTABLE? ChrisA
> -----Original Message----- > From: pgsql-general-owner@postgresql.org [mailto:pgsql-general- > owner@postgresql.org] On Behalf Of jg > Sent: Friday, December 21, 2012 10:40 AM > To: Chris Angelico > Cc: pgsql-general@postgresql.org > Subject: Re: [GENERAL] Coalesce bug ? > > Hi, > > Please test this script on a PostgreSQL 9.1.6, > > create or replace function ps3(a int) returns int as $$ BEGIN RAISE WARNING > 'Call ps3(%)=%',$1,$1; RETURN $1::int; END; $$ LANGUAGE plpgsql STRICT > IMMUTABLE; SELECT ps3(1); SELECT ps3(2); select coalesce( (select ps3(1)), > (SELECT ps3(2)) ); explain (verbose, analyze, buffers) select coalesce( (select > ps3(1)), (SELECT ps3(2)) ); select coalesce( ps3(1), ps3(2) ); > > > The result will be > > [postgres@]test=# create or replace function ps3(a int) returns int as $$ > BEGIN test$# RAISE WARNING 'Call ps3(%)=%',$1,$1; RETURN $1::int; test$# > END; $$ LANGUAGE plpgsql STRICT IMMUTABLE; CREATE FUNCTION Temps : > 13,232 ms [postgres@]test=# SELECT ps3(1); > WARNING: Call ps3(1)=1 > ps3 > ----- > 1 > (1 ligne) > > Temps : 0,975 ms > [postgres@]test=# SELECT ps3(2); > WARNING: Call ps3(2)=2 > ps3 > ----- > 2 > (1 ligne) > > Temps : 0,473 ms > [postgres@]test=# select coalesce( (select ps3(1)), (SELECT ps3(2)) ); > WARNING: Call ps3(1)=1 > WARNING: Call ps3(2)=2 > coalesce > ---------- > 1 > (1 ligne) > > Temps : 0,681 ms > [postgres@]test=# explain (verbose, analyze, buffers) select coalesce( > (select ps3(1)), (SELECT ps3(2)) ); > WARNING: Call ps3(1)=1 > WARNING: Call ps3(2)=2 > QUERY PLAN > > -------------------------------------------------------------------------------- > ------------ > Result (cost=0.02..0.03 rows=1 width=0) (actual time=0.004..0.004 rows=1 > loops > =1) > Output: COALESCE($0, $1) > InitPlan 1 (returns $0) > -> Result (cost=0.00..0.01 rows=1 width=0) (actual time=0.000..0.000 rows > =1 loops=1) > Output: 1 > InitPlan 2 (returns $1) > -> Result (cost=0.00..0.01 rows=1 width=0) (never executed) > Output: 2 > Total runtime: 0.022 ms > (9 lignes) > > Temps : 0,774 ms > [postgres@]test=# select coalesce( ps3(1), ps3(2) ); > WARNING: Call ps3(1)=1 > coalesce > ---------- > 1 > (1 ligne) > > Temps : 0,562 ms > [postgres@]test=# > > > There is a bug ! In words, what behavior in the above do you find "buggy" and what output would you expect to see instead. The use of "RAISE NOTICE" in an IMMUTABLE function is a grey area since it is arguably a side-effect though a benign one. It is not a bug for an IMMUTABLE function to NOT be executed if the result can be known by other means. It is also not anerror for an IMMUTABLE function to be executed even if you believe those "other means" should have been used instead. The presence of IMMUTABLE gives the system a choice of how to proceed - as long as whichever choice it picks does not changethe semantics of the output. If you find the "RAISE NOTICE" to be semantically meaningful then you MUST NOT use IMMUTABLEsince in that case you are explicitly making use of a side-effect. > > Thank you for the documentation link, but it does not help me. > > JG
On 12/21/2012 07:49 AM, David Johnston wrote: >> -----Original Message----- >> >> In the first case the nested parentheses mean the SELECT statements are run >> first and the COALESCE is run on the return results. > > The first case is: > > SELECT COALESCE( (SELECT), (SELECT) ); I am not sure exactly what the parentheses surrounding the scalar-sub-SELECTs do(turn them into anonymously typed rows?) but if the first scalar-sub-select results in a non-null result then the secondone should not be executed. Also, the Query Plan shown "never executed" the second scalar-sub-SELECT (from the samequery form) yet it knows that the result of the second call was "OUTPUT: 2" > >> >> In the second case COALESCE is working as advertised. Working left to right it >> finds the first non NULL argument and stops. >> > > I thought that in order to call the Coalesce function the system would have to know the value of all parameters. Thereis no lazy instantiation in SQL. Both "SELECT" statements because they have to be run before the COALESCE functioncall be evaluated. Whether the ps3(?) function has to be executed then only depends on whether enough informationexists in memory to optimize the call away. I am just going by the docs and what my observations have been:) http://www.postgresql.org/docs/9.2/interactive/functions-conditional.html#FUNCTIONS-COALESCE-NVL-IFNULL "Like a CASE expression, COALESCE only evaluates the arguments that are needed to determine the result; that is, arguments to the right of the first non-null argument are not evaluated. " > > David J. > > > > -- Adrian Klaver adrian.klaver@gmail.com
Hi, Interesting idea. With VOLATILE, the bug disappears. With IMMUTABLE, the EXPLAIN and the execution does not match !!!! That is a bug. Even if the behavior has to be different in VOLATILE and IMMUTABLE, the EXPLAIN and the execution MUST becoherent. JG [postgres@]test=# create or replace function ps3(a int) returns int as $$ BEGIN test$# RAISE WARNING 'Call ps3(%)=%',$1,$1; RETURN $1::int; test$# END; $$ LANGUAGE plpgsql STRICT VOLATILE; CREATE FUNCTION Temps : 127,417 ms [postgres@]test=# SELECT ps3(1); WARNING: Call ps3(1)=1 ps3 ----- 1 (1 ligne) Temps : 0,941 ms [postgres@]test=# SELECT ps3(2); WARNING: Call ps3(2)=2 ps3 ----- 2 (1 ligne) Temps : 0,413 ms [postgres@]test=# select coalesce( (select ps3(1)), (SELECT ps3(2)) ); WARNING: Call ps3(1)=1 coalesce ---------- 1 (1 ligne) Temps : 0,501 ms [postgres@]test=# explain (verbose, analyze, buffers) select coalesce( (select ps3(1)), (SELECT ps3(2)) ); WARNING: Call ps3(1)=1 QUERY PLAN -------------------------------------------------------------------------------- ------------ Result (cost=0.52..0.53 rows=1 width=0) (actual time=0.072..0.072 rows=1 loops =1) Output: COALESCE($0, $1) InitPlan 1 (returns $0) -> Result (cost=0.00..0.26 rows=1 width=0) (actual time=0.067..0.067 rows =1 loops=1) Output: ps3(1) InitPlan 2 (returns $1) -> Result (cost=0.00..0.26 rows=1 width=0) (never executed) Output: ps3(2) Total runtime: 0.095 ms (9 lignes) Temps : 0,630 ms [postgres@]test=# select coalesce( ps3(1), ps3(2) ); WARNING: Call ps3(1)=1 coalesce ---------- 1 (1 ligne) Temps : 0,451 ms [postgres@]test=#
On Sat, Dec 22, 2012 at 2:57 AM, jg <jg@rilk.com> wrote: > Hi, > > Interesting idea. > With VOLATILE, the bug disappears. > With IMMUTABLE, the EXPLAIN and the execution does not match !!!! > That is a bug. Even if the behavior has to be different in VOLATILE and IMMUTABLE, the EXPLAIN and the execution MUST becoherent. > JG Just a word about your wording. Calling this a bug is somewhat poor form; you're sounding hostile and accusatory, rather than looking to learn and understand. It might indeed turn out to be a bug, but from what else has happened in this thread, I'm more inclined to think that the database is fine and your code is what's not working. http://www.catb.org/esr/faqs/smart-questions.html#idp29846432 It's almost as if he read this very thread :) Something to consider: Since you've told Postgres that your function is immutable, it might be remembering the result from the first execution and using it in the second. Try restarting the server between the EXPLAIN and the test. By the way, why do you declare your functions as "STRICT IMMUTABLE" and "STRICT VOLATILE"? ChrisA
"David Johnston" <polobo@yahoo.com> writes: > The first case is: > SELECT COALESCE( (SELECT), (SELECT) ); I am not sure exactly what the parentheses surrounding the scalar-sub-SELECTs do(turn them into anonymously typed rows?) but if the first scalar-sub-select results in a non-null result then the secondone should not be executed. Indeed, COALESCE will not execute the second sub-select at runtime, but that doesn't particularly matter here. What matters is that "ps3(2)" qualifies to be pre-evaluated (folded to a constant) at plan time. So that happens, and the RAISE message comes out, at plan time. What's left at run time is SELECT COALESCE( (SELECT 1), (SELECT 2) ); and indeed the "SELECT 2" is skipped at that point, as is visible in the EXPLAIN ANALYZE measurements. regards, tom lane
> -----Original Message----- > From: Adrian Klaver [mailto:adrian.klaver@gmail.com] > Sent: Friday, December 21, 2012 10:57 AM > To: David Johnston > Cc: 'jg'; pgsql-general@postgresql.org > Subject: Re: [GENERAL] Coalesce bug ? > > On 12/21/2012 07:49 AM, David Johnston wrote: > >> -----Original Message----- > > >> > >> In the first case the nested parentheses mean the SELECT statements > >> are run first and the COALESCE is run on the return results. > > > > The first case is: > > > > SELECT COALESCE( (SELECT), (SELECT) ); I am not sure exactly what the > parentheses surrounding the scalar-sub-SELECTs do (turn them into > anonymously typed rows?) but if the first scalar-sub-select results in a non- > null result then the second one should not be executed. Also, the Query > Plan shown "never executed" the second scalar-sub-SELECT (from the same > query form) yet it knows that the result of the second call was "OUTPUT: 2" > > > >> > >> In the second case COALESCE is working as advertised. Working left to > >> right it finds the first non NULL argument and stops. > >> > > > > I thought that in order to call the Coalesce function the system would have > to know the value of all parameters. There is no lazy instantiation in SQL. > Both "SELECT" statements because they have to be run before the COALESCE > function call be evaluated. Whether the ps3(?) function has to be executed > then only depends on whether enough information exists in memory to > optimize the call away. > > I am just going by the docs and what my observations have been:) > > http://www.postgresql.org/docs/9.2/interactive/functions- > conditional.html#FUNCTIONS-COALESCE-NVL-IFNULL > > "Like a CASE expression, COALESCE only evaluates the arguments that are > needed to determine the result; that is, arguments to the right of the first > non-null argument are not evaluated. " > Which is how I thought things to work but per that the following should be equivalent: select coalesce( (select ps3(1)), (SELECT ps3(2)) ); select coalesce( ps3(1), ps3(2) ); Neither of these should EVER result in the "ps3(2)" function call being evaluated...regardless of the mutability modifier. I guess the addition of SELECT and/or () to the first expression is having an impact but I have no idea where to even lookin the documentation for where that difference would be defined. David J.
> -----Original Message----- > From: Tom Lane [mailto:tgl@sss.pgh.pa.us] > Sent: Friday, December 21, 2012 11:16 AM > To: David Johnston > Cc: 'Adrian Klaver'; 'jg'; pgsql-general@postgresql.org > Subject: Re: [GENERAL] Coalesce bug ? > > "David Johnston" <polobo@yahoo.com> writes: > > The first case is: > > > SELECT COALESCE( (SELECT), (SELECT) ); I am not sure exactly what the > parentheses surrounding the scalar-sub-SELECTs do (turn them into > anonymously typed rows?) but if the first scalar-sub-select results in a non- > null result then the second one should not be executed. > > Indeed, COALESCE will not execute the second sub-select at runtime, but > that doesn't particularly matter here. What matters is that "ps3(2)" > qualifies to be pre-evaluated (folded to a constant) at plan time. So that > happens, and the RAISE message comes out, at plan time. What's left at run > time is > > SELECT COALESCE( (SELECT 1), (SELECT 2) ); > > and indeed the "SELECT 2" is skipped at that point, as is visible in the EXPLAIN > ANALYZE measurements. > > regards, tom lane Understood (I'm guessing there is no "global" cache but simply the plan-level cache that gets populated each time?) However, in the following example the ps3(2) expression should also qualify for this "folding" and thus the RAISE NOTICE should also appear during plan time for the same reason; which, per the OP, it does not. pgb=# select coalesce( ps3(1), ps3(2) ); WARNING: Call ps3(1)=1 coalesce ---------- 1 (1 row) It would seem the addition of the sub-select messes with the COALESCE logic. If the function call is directly a part of the COALESCE statement it can be optimized away by the COALESCE logic but if it is buried within a SELECT statement the planner does not know that the function is indirectly part of a COALESCE input set and so it goes ahead and performs its optimization but pre-executing the function and caching its results. David J.
> Something to consider: Since you've told Postgres that your function is > immutable, it might be remembering the result from the first execution and > using it in the second. Try restarting the server between the EXPLAIN and the > test. > Now that you've said this as well (hopefully you are not just repeating me)... Is this even possible or does an IMMUTABLE function has get evaluated at least one time per query? From Tom's comments I am assuming that, per plan, at minimum the planner will execute a function with constant literals one time and simply reuse the results. It does not sound as if a function with "field sourced values" will ever be "cached". Futhermore, even if a function is always called with the same constants the caching mechanism does not cross the statement boundary (not even to the transaction level let alone global). > By the way, why do you declare your functions as "STRICT IMMUTABLE" > and "STRICT VOLATILE"? Is this a question about the layout of the commands spatially? David J.
"David Johnston" <polobo@yahoo.com> writes: >> Indeed, COALESCE will not execute the second sub-select at runtime, but >> that doesn't particularly matter here. What matters is that "ps3(2)" >> qualifies to be pre-evaluated (folded to a constant) at plan time. > Understood (I'm guessing there is no "global" cache but simply the > plan-level cache that gets populated each time?) There's no cache. Either the function gets evaluated, or it doesn't. > However, in the following example the ps3(2) expression should also qualify > for this "folding" and thus the RAISE NOTICE should also appear during plan > time for the same reason; which, per the OP, it does not. > pgb=# select coalesce( ps3(1), ps3(2) ); Ah. The reason for that is a bit subtle: constant-folding of immutable functions happens in the same pass over the query tree as simplification of simplifiable constructs --- including COALESCE. So what's happening is that eval_const_expressions, working on the COALESCE construct, first calls itself recursively to simplify the first argument. That leads to evaluation of ps3(1), and we get back a constant 1. Now we reach a block of code with this comment: /* * We can remove null constants from the list. For a * non-null constant, if it has not been preceded by any * other non-null-constant expressions then it is the * result. Otherwise, it's the next argument, but we can * drop following arguments since they will never be * reached. */ So at this point we realize that the result of the COALESCE() is 1, and we don't bother to do const-simplification of its remaining arguments. They're just thrown away, and the final command for execution is nothing more than "SELECT 1" (as you can see if you do EXPLAIN VERBOSE). The other example with sub-SELECTs acts differently because the sub-SELECT is something of an optimization fence --- "(SELECT 1)" does not look like a simple Const to eval_const_expressions. As you noted upthread, none of this is a bug. Labeling a function immutable is an explicit statement that it has no side-effects of interest and can be evaluated whenever the system chooses. If you stick in side-effects like a RAISE statement, then that lets you peer into some inner workings of the optimizer, but it's you that's breaking the rules not the optimizer. regards, tom lane
On Sat, Dec 22, 2012 at 3:53 AM, David Johnston <polobo@yahoo.com> wrote: > Chris Angelico wrote, and David dropped the citation (oops!): >> By the way, why do you declare your functions as "STRICT IMMUTABLE" >> and "STRICT VOLATILE"? > > Is this a question about the layout of the commands spatially? As I understand it, there are three keywords: VOLATILE, STRICT, and IMMUTABLE. Putting one of those keywords into the declaration flags the function accordingly; if none is given, VOLATILE is assumed. Declaring the function as "STRICT IMMUTABLE" or "STRICT VOLATILE" seems at best redundant (the word "STRICT" is mere noise), and at worst confusing (the parser might be interpreting it as "STRICT", not as the other keyword). Or is there something I'm missing here? ChrisA
On 12/21/2012 02:22 PM, Chris Angelico wrote: > On Sat, Dec 22, 2012 at 3:53 AM, David Johnston <polobo@yahoo.com> wrote: >> Chris Angelico wrote, and David dropped the citation (oops!): >>> By the way, why do you declare your functions as "STRICT IMMUTABLE" >>> and "STRICT VOLATILE"? >> >> Is this a question about the layout of the commands spatially? > > As I understand it, there are three keywords: VOLATILE, STRICT, and > IMMUTABLE. Putting one of those keywords into the declaration flags > the function accordingly; if none is given, VOLATILE is assumed. > > Declaring the function as "STRICT IMMUTABLE" or "STRICT VOLATILE" > seems at best redundant (the word "STRICT" is mere noise), and at > worst confusing (the parser might be interpreting it as "STRICT", not > as the other keyword). Or is there something I'm missing here? Not sure I am following. IMMUTABLE/VOLATILE/STABLE deal with the query optimizer. CALLED ON NULL INPUT/STRICT deal with NULL input behavior. http://www.postgresql.org/docs/9.2/interactive/sql-createfunction.html > > ChrisA > > -- Adrian Klaver adrian.klaver@gmail.com
On Sat, Dec 22, 2012 at 9:31 AM, Adrian Klaver <adrian.klaver@gmail.com> wrote: > On 12/21/2012 02:22 PM, Chris Angelico wrote: >> As I understand it, there are three keywords: VOLATILE, STRICT, and >> IMMUTABLE. Putting one of those keywords into the declaration flags >> the function accordingly; if none is given, VOLATILE is assumed. >> >> Declaring the function as "STRICT IMMUTABLE" or "STRICT VOLATILE" >> seems at best redundant (the word "STRICT" is mere noise), and at >> worst confusing (the parser might be interpreting it as "STRICT", not >> as the other keyword). Or is there something I'm missing here? > > > Not sure I am following. > > IMMUTABLE/VOLATILE/STABLE deal with the query optimizer. > > CALLED ON NULL INPUT/STRICT deal with NULL input behavior. > > http://www.postgresql.org/docs/9.2/interactive/sql-createfunction.html OOPS! Error's mine, sorry everyone! I thought the keyword was STRICT, but it's actually STABLE that does that. Apologies to anyone who's been confused by my miswording in this thread! STRICT VOLATILE makes perfect sense; STABLE VOLATILE is what doesn't. ChrisA
Hi, > Ah. The reason for that is a bit subtle: constant-folding of immutable > functions happens in the same pass over the query tree as simplification > of simplifiable constructs --- including COALESCE. So what's happening > is that eval_const_expressions, working on the COALESCE construct, first > calls itself recursively to simplify the first argument. That leads to > evaluation of ps3(1), and we get back a constant 1. Now we reach a > block of code with this comment: > > /* > * We can remove null constants from the list. For a > * non-null constant, if it has not been preceded by any > * other non-null-constant expressions then it is the > * result. Otherwise, it's the next argument, but we can > * drop following arguments since they will never be > * reached. > */ > > So at this point we realize that the result of the COALESCE() is 1, and > we don't bother to do const-simplification of its remaining arguments. > They're just thrown away, and the final command for execution is nothing > more than "SELECT 1" (as you can see if you do EXPLAIN VERBOSE). > > The other example with sub-SELECTs acts differently because the > sub-SELECT is something of an optimization fence --- "(SELECT 1)" does > not look like a simple Const to eval_const_expressions. > > As you noted upthread, none of this is a bug. Labeling a function > immutable is an explicit statement that it has no side-effects of > interest and can be evaluated whenever the system chooses. If you stick > in side-effects like a RAISE statement, then that lets you peer into > some inner workings of the optimizer, but it's you that's breaking the > rules not the optimizer. > > regards, tom lane Thank you for the explanation. It was tricky to get it, and I got this question as a side effect of some other optimisation works. -- Cordialement, Jean-Gérard Pailloncy