Thread: Calculated view fields (8.1 != 8.2)
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hi all, I'm observing this behavior in 8.2: kalman=# create table test ( a integer, b integer ); CREATE TABLE kalman=# kalman=# CREATE OR REPLACE FUNCTION sp_test ( INTEGER ) kalman-# RETURNS INTEGER AS' kalman'# DECLARE kalman'# a_id_contract ALIAS FOR $1; kalman'# BEGIN kalman'# RAISE NOTICE ''here''; kalman'# kalman'# return 3; kalman'# END; kalman'# ' LANGUAGE 'plpgsql'; CREATE FUNCTION kalman=# kalman=# kalman=# CREATE OR REPLACE VIEW v_test AS kalman-# SELECT kalman-# a AS a, kalman-# sp_test(b) AS b kalman-# FROM kalman-# test c kalman-# ; CREATE VIEW kalman=# kalman=# insert into test values (2,3); INSERT 0 1 kalman=# kalman=# select * from v_test; NOTICE: herea | b - ---+---2 | 3 (1 row) kalman=# select a from v_test; NOTICE: herea - ---2 (1 row) In version 8.1 the function sp_test is not evaluated in case of "select a from v_test". If I mark the function as STABLE or IMMUTABLE then even with version 8.2 the function is not evaluated. Is this the intended behavior? I didn't see something about it in the release note. Regards Gaetano Mendola -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.5 (MingW32) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFF7Y1D7UpzwH2SGd4RAv//AJ0dcDPyYIndVMs7pEhzXjVNwKqdLQCeJQnL oaZVL2JgS/J9lPf+B80+FuY= =qaCE -----END PGP SIGNATURE-----
Gaetano Mendola <mendola@bigfoot.com> writes: > [ 8.2 evaluates volatile functions in the targetlist of a view ] > If I mark the function as STABLE or IMMUTABLE then even with version > 8.2 the function is not evaluated. Is this the intended behavior? Yes; people complained that we needed to be more careful about the number of times volatile functions get evaluated. > I didn't see something about it in the release note. <listitem> <para> Do not flatten subqueries that contain <literal>volatile</> functions in their targetlists (Jaime Casanova) </para> <para> This prevents surprising behavior due to multiple evaluation of a <literal>volatile</> function(such as <function>random()</> or <function>nextval()</>). It might cause performance degradation inthe presence of functions that are unnecessarily marked as <literal>volatile</>. </para> </listitem> regards, tom lane
Tom Lane wrote: > Gaetano Mendola <mendola@bigfoot.com> writes: >> [ 8.2 evaluates volatile functions in the targetlist of a view ] >> If I mark the function as STABLE or IMMUTABLE then even with version >> 8.2 the function is not evaluated. Is this the intended behavior? > > Yes; people complained that we needed to be more careful about the > number of times volatile functions get evaluated. I suspect that functions are evaluated also for record discarded due to joins. Is that the case? Like: SELECT * FROM ta JOIN tb USING (id) where tb.foo = 4; If ta is a view with some calculated fields are the function on ta evaluated only for record matching the filters or in some case ( like a full scan on ta ) also for the records discarded due to the join? Regards Gaetano Mendola
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Gaetano Mendola wrote: > Tom Lane wrote: >> Gaetano Mendola <mendola@bigfoot.com> writes: >>> [ 8.2 evaluates volatile functions in the targetlist of a view ] >>> If I mark the function as STABLE or IMMUTABLE then even with version >>> 8.2 the function is not evaluated. Is this the intended behavior? >> Yes; people complained that we needed to be more careful about the >> number of times volatile functions get evaluated. > > I suspect that functions are evaluated also for record discarded due to > joins. Is that the case? > > Like: > > SELECT * FROM ta JOIN tb USING (id) where tb.foo = 4; > > If ta is a view with some calculated fields are the function on ta > evaluated only for record matching the filters or in some case ( > like a full scan on ta ) also for the records discarded due to the join? > I did a check on a 8.2 and I can confirm my suspects: kalman=# create table ta ( a integer, b integer ); CREATE TABLE kalman=# create table tb ( b integer, c integer ); CREATE TABLE kalman=# kalman=# CREATE OR REPLACE FUNCTION sp_test ( INTEGER ) kalman-# RETURNS INTEGER AS' kalman'# DECLARE kalman'# a_id ALIAS FOR $1; kalman'# BEGIN kalman'# RAISE NOTICE ''here''; kalman'# kalman'# return 3; kalman'# END; kalman'# ' LANGUAGE 'plpgsql'; CREATE FUNCTION kalman=# kalman=# CREATE OR REPLACE VIEW v_ta AS kalman-# SELECT kalman-# sp_test(a) AS a, kalman-# b AS b kalman-# FROM kalman-# ta c kalman-# ; CREATE VIEW kalman=# kalman=# insert into ta values (2,3); INSERT 0 1 kalman=# insert into ta values (3,4); INSERT 0 1 kalman=# insert into tb values (4,5); INSERT 0 1 kalman=# kalman=# select * from v_ta join tb using (b) where c = 5; NOTICE: here NOTICE: hereb | a | c - ---+---+---4 | 3 | 5 (1 row) Is really this what we want? I did a migration 8.0.x => 8.2.3 and I had on first hour of service up lot of queries "blocked" due to this, consider in my case I have on v_ta milions of records and usually that join extracts 1 row. Is there a way to set till I don't check all my huge schema to disable this behaviour? Regards Gaetano Mendola -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.5 (MingW32) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFF8TAc7UpzwH2SGd4RAgajAKCvIxLH9JSBk4gxSbuaq4WE2y7v2wCfbnRa jWDV3hlEq/Loye6G+E2S9Ew= =LR5T -----END PGP SIGNATURE-----
On Fri, Mar 09, 2007 at 10:59:56AM +0100, Gaetano Mendola wrote: > Is really this what we want? I did a migration 8.0.x => 8.2.3 and I had on first hour of service up > lot of queries "blocked" due to this, consider in my case I have on v_ta milions of records and usually > that join extracts 1 row. Is there a way to set till I don't check all my huge schema to disable this > behaviour? Most people figured it was a improvment. It's configured per function now, which wasn't the case before. I dont't think there was ever any discussion about having a global switch. Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > From each according to his ability. To each according to his ability to litigate.
Martijn van Oosterhout <kleptog@svana.org> writes: > Most people figured it was a improvment. It's configured per function > now, which wasn't the case before. I dont't think there was ever any > discussion about having a global switch. Volatile functions that are not at the top level of a query are *always* going to be a risk factor, in that you don't know quite where the planner is going to evaluate them. While I'm not by any means wedded to the 8.2 no-flattening patch, it seems to me to be reasonable because it reduces that uncertainty a bit. The fact that Gaetano's code depended on the uncertainty being resolved in a different direction is unfortunate, but I think his code is really to blame, because postponing the function eval like that couldn't be guaranteed anyway across all queries. regards, tom lane
Martijn van Oosterhout wrote: > On Fri, Mar 09, 2007 at 10:59:56AM +0100, Gaetano Mendola wrote: >> Is really this what we want? I did a migration 8.0.x => 8.2.3 and I had on first hour of service up >> lot of queries "blocked" due to this, consider in my case I have on v_ta milions of records and usually >> that join extracts 1 row. Is there a way to set till I don't check all my huge schema to disable this >> behaviour? > > Most people figured it was a improvment. It's configured per function > now, which wasn't the case before. I dont't think there was ever any > discussion about having a global switch. If your function is already marked immutable or stable, then nothing changes for you. If you *did* call volatile functions inside your select, then you now get consistens behaviour. Since you don't want your function to be evaluated in all cases, I assume that it shouldn't be marked volatile in the first place. I think a lot of people forget to mark their functions volatile/stable/immutable correctly, or don't know about the implications of these flags. Maybe there should be a guc force_explicit_sideeffeect_declaration (defaulting to on) which makes specifying either volatile, stable or immutable mandatory. Then people would (hopefully) read the relevant part of the docs before creating a function, and probably get the declaration right in the first place. greetings, Florian Pflug
Martijn van Oosterhout wrote: > On Fri, Mar 09, 2007 at 10:59:56AM +0100, Gaetano Mendola wrote: >> Is really this what we want? I did a migration 8.0.x => 8.2.3 and I had on first hour of service up >> lot of queries "blocked" due to this, consider in my case I have on v_ta milions of records and usually >> that join extracts 1 row. Is there a way to set till I don't check all my huge schema to disable this >> behaviour? > > Most people figured it was a improvment. It's configured per function > now, which wasn't the case before. I dont't think there was ever any > discussion about having a global switch. Well it's not an improvement in term of performances but a performance degradation in the best case and in the worst can be devastating: create table ta ( a integer, b integer ); CREATE TABLE create table tb ( b integer, c integer ); CREATE TABLE CREATE OR REPLACE FUNCTION sp_delete_selected_row ( INTEGER ) RETURNS INTEGER AS' DECLARE a_id ALIAS FOR $1; BEGIN DELETE FROM ta where a = a_id; return 0; END; ' LANGUAGE 'plpgsql'; CREATE FUNCTION CREATE OR REPLACE VIEW v_ta AS SELECT sp_delete_selected_row(a) AS a, b AS b FROM ta ; CREATE VIEW insert into ta values (2,3); INSERT 0 1 insert into ta values (3,4); INSERT 0 1 insert into tb values (4,5); INSERT 0 1 select * from v_ta join tb using (b) where c = 5; b | a | c ---+---+---4 | 0 | 5 (1 row) select * from ta;a | b ---+--- (0 rows) All rows are gone instead of the only one extracted from that query. IMHO is a undesired side effect. In my case I destroyed my application statistics on how many time a certain row was extracted. Regards Gaetano Mendola
Gaetano Mendola wrote: > Martijn van Oosterhout wrote: >> On Fri, Mar 09, 2007 at 10:59:56AM +0100, Gaetano Mendola wrote: >>> Is really this what we want? I did a migration 8.0.x => 8.2.3 and I had on first hour of service up >>> lot of queries "blocked" due to this, consider in my case I have on v_ta milions of records and usually >>> that join extracts 1 row. Is there a way to set till I don't check all my huge schema to disable this >>> behaviour? >> Most people figured it was a improvment. It's configured per function >> now, which wasn't the case before. I dont't think there was ever any >> discussion about having a global switch. > > Well it's not an improvement in term of performances but a performance degradation in the best case and > in the worst can be devastating: > > create table ta ( a integer, b integer ); > CREATE TABLE > create table tb ( b integer, c integer ); > CREATE TABLE > > CREATE OR REPLACE FUNCTION sp_delete_selected_row ( INTEGER ) > RETURNS INTEGER AS' > DECLARE > a_id ALIAS FOR $1; > BEGIN > DELETE FROM ta where a = a_id; > return 0; > END; > ' LANGUAGE 'plpgsql'; > CREATE FUNCTION > > CREATE OR REPLACE VIEW v_ta AS > SELECT > sp_delete_selected_row(a) AS a, > b AS b > FROM > ta > ; > CREATE VIEW > > insert into ta values (2,3); > INSERT 0 1 > insert into ta values (3,4); > INSERT 0 1 > insert into tb values (4,5); > INSERT 0 1 > > select * from v_ta join tb using (b) where c = 5; > > b | a | c > ---+---+--- > 4 | 0 | 5 > (1 row) > > select * from ta; > a | b > ---+--- > (0 rows) > > > All rows are gone instead of the only one extracted from that query. IMHO is a undesired side effect. > In my case I destroyed my application statistics on how many time a certain row was extracted. This is insane. Whoever creates a view like that on a production system should *immediatly* be carried away from his keyboard, to prevent further damage. Imagine someone using "View Data" on this view in pgadmin.. I don't wanna be near him when he clicks "Refresh", and suddenly all data is gone... Maybe calling volatile functions in selects and views should be forbidden entirely, except for volatile functions in the top-level select clause, to support things like "select ..., nextval('seq') from ...". But it's probably not worth the effort - there will always be creative ways to shoot yourself into your foot. greetings, Florian Pflug
Florian G. Pflug wrote: > Gaetano Mendola wrote: >> Martijn van Oosterhout wrote: >>> On Fri, Mar 09, 2007 at 10:59:56AM +0100, Gaetano Mendola wrote: >>>> Is really this what we want? I did a migration 8.0.x => 8.2.3 and I >>>> had on first hour of service up >>>> lot of queries "blocked" due to this, consider in my case I have on >>>> v_ta milions of records and usually >>>> that join extracts 1 row. Is there a way to set till I don't check >>>> all my huge schema to disable this >>>> behaviour? >>> Most people figured it was a improvment. It's configured per function >>> now, which wasn't the case before. I dont't think there was ever any >>> discussion about having a global switch. >> >> Well it's not an improvement in term of performances but a performance >> degradation in the best case and >> in the worst can be devastating: >> >> create table ta ( a integer, b integer ); >> CREATE TABLE >> create table tb ( b integer, c integer ); >> CREATE TABLE >> >> CREATE OR REPLACE FUNCTION sp_delete_selected_row ( INTEGER ) >> RETURNS INTEGER AS' >> DECLARE >> a_id ALIAS FOR $1; >> BEGIN >> DELETE FROM ta where a = a_id; >> return 0; >> END; >> ' LANGUAGE 'plpgsql'; >> CREATE FUNCTION >> >> CREATE OR REPLACE VIEW v_ta AS >> SELECT >> sp_delete_selected_row(a) AS a, >> b AS b >> FROM >> ta >> ; >> CREATE VIEW >> >> insert into ta values (2,3); >> INSERT 0 1 >> insert into ta values (3,4); >> INSERT 0 1 >> insert into tb values (4,5); >> INSERT 0 1 >> >> select * from v_ta join tb using (b) where c = 5; >> >> b | a | c >> ---+---+--- >> 4 | 0 | 5 >> (1 row) >> >> select * from ta; >> a | b >> ---+--- >> (0 rows) >> >> >> All rows are gone instead of the only one extracted from that query. >> IMHO is a undesired side effect. >> In my case I destroyed my application statistics on how many time a >> certain row was extracted. > > This is insane. Whoever creates a view like that on a production system > should *immediatly* be carried away from his keyboard, to prevent > further damage. Imagine someone using "View Data" on this view in > pgadmin.. I don't wanna be near him when he clicks "Refresh", and > suddenly all data is gone... > > Maybe calling volatile functions in selects and views should be > forbidden entirely, except for volatile functions in the top-level > select clause, > to support things like "select ..., nextval('seq') from ...". > > But it's probably not worth the effort - there will always be creative > ways to shoot yourself into your foot. I full agree with this, that was just an extreme example of an hidden undesired call. In my framework I don't have by "coding rule" any function with side effects applied at view fields, however I have some functions not marked correctly as STABLE ( mea culpa ) that degraded the performances until I realized what was going on; I'm in the opinion that is not sane call a function not marked as stable/immutable for discarded column (I can in some way accept this ) and most of all on discarded rows. Regards Gaetano Mendola
Florian G. Pflug wrote: > Martijn van Oosterhout wrote: >> On Fri, Mar 09, 2007 at 10:59:56AM +0100, Gaetano Mendola wrote: >>> Is really this what we want? I did a migration 8.0.x => 8.2.3 and I >>> had on first hour of service up >>> lot of queries "blocked" due to this, consider in my case I have on >>> v_ta milions of records and usually >>> that join extracts 1 row. Is there a way to set till I don't check >>> all my huge schema to disable this >>> behaviour? >> >> Most people figured it was a improvment. It's configured per function >> now, which wasn't the case before. I dont't think there was ever any >> discussion about having a global switch. > > If your function is already marked immutable or stable, then nothing > changes for you. If you *did* call volatile functions inside your > select, then you now get > consistens behaviour. Since you don't want your function to be evaluated > in all > cases, I assume that it shouldn't be marked volatile in the first place. Well some function are volatile and can not be marked as stable. We develop our applications layering the modules, we didn't have any reason to forbid as "coding rule" to put function call on view in low level layers. After all views are there also to build up your schema layering the info. I can immagine a case when a lower module exports a view to upper layer stating the interface as list of fields: first_name, last_name, .... with an *hidden* field that is a function call that updates the statistics on how many time a given record was selected, then this technique can not be used anymore starting with 8.2.x. The above is not my case but it can be a possible scenario (I admit not a sane one ). Regards Gaetano Mendola
Gaetano Mendola <gmendola@mbigroup.it> writes: > I can immagine a case when a lower module exports a view to upper layer stating > the interface as list of fields: > first_name, last_name, .... > with an *hidden* field that is a function call that updates the statistics on > how many time a given record was selected, then this technique can not be used > anymore starting with 8.2.x. You're living in a dream world if you think that works reliably in *any* version of Postgres. But for starters, what is your definition of "selected" --- pulled from the physical table? Accumulated into an aggregate? Delivered as a recognizable row to the client? Delivered N times to the client due to joining N times to some other table? regards, tom lane
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Tom Lane wrote: > Gaetano Mendola <gmendola@mbigroup.it> writes: >> I can immagine a case when a lower module exports a view to upper layer stating >> the interface as list of fields: > >> first_name, last_name, .... > >> with an *hidden* field that is a function call that updates the statistics on >> how many time a given record was selected, then this technique can not be used >> anymore starting with 8.2.x. > > You're living in a dream world if you think that works reliably in *any* > version of Postgres. But for starters, what is your definition of > "selected" --- pulled from the physical table? Accumulated into an > aggregate? Delivered as a recognizable row to the client? Delivered N > times to the client due to joining N times to some other table? Well that was a not good example, I don't have any problem in mark from now on all my function as stable/immutable (the one I use on views) but still I believe is source of bad performance evaluate a function on rows discarded and at same time this break the principle of least surprise. Regards Gaetano Mendola -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.5 (MingW32) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFF9U8R7UpzwH2SGd4RAhoGAKDSpUSQ3lGEdIdFWLwQjxoZXUAS1ACdGtht TZg9BKScbzGO0MzpHy0Gr80= =auwk -----END PGP SIGNATURE-----