Thread: Repeat execution of stable expressions
I've complained many times that select (f()).*; will execute f() once for each returned field of f() since the server essentially expands that into: select f().a, f().b; try it yourself, see: create function f(a out text, b out text) returns record as $$ begin perform pg_sleep(1); a := 'a'; b := 'b'; end; $$ language plpgsql immutable; If f returns a,b etc. This is true if the function f() is marked stable or immutable. That it does this for immutable functions is pretty awful but it's the stable case that I find much more interesting -- most non-trivial functions that read from the database are stable. Shouldn't the server be able to detect that function only needs to be run once? By the way, this isn't just happening with function calls. I've noticed the same behavior in queries like this: create view v as select (select foo from foo where ...) as foo_1, (select foo from foo where ...) as foo_2, from complicated_query; that when you query from v, you can sometimes see exploding subplans such that when you pull a field from foo_1, it reruns the lookup on foo. So my question is this: Can stable functions and other similar query expressions be optimized so that they are not repeat evaluated like that without breaking anything? merlin
On Mon, Mar 5, 2012 at 3:15 PM, Merlin Moncure <mmoncure@gmail.com> wrote: > I've complained many times that > select (f()).*; > > will execute f() once for each returned field of f() since the server > essentially expands that into: > > select f().a, f().b; > oh, this is why we expand rows inside a WITH statement. it should probably be fixed, but you should find something like WITH fn AS SELECT f(), SELECT (fn).a, (fn).b will make your life better -- Peter van Hardenberg San Francisco, California "Everything was beautiful, and nothing hurt." -- Kurt Vonnegut
On Mon, Mar 5, 2012 at 6:41 PM, Peter van Hardenberg <pvh@pvh.ca> wrote: > On Mon, Mar 5, 2012 at 3:15 PM, Merlin Moncure <mmoncure@gmail.com> wrote: >> I've complained many times that >> select (f()).*; >> >> will execute f() once for each returned field of f() since the server >> essentially expands that into: >> >> select f().a, f().b; >> > > oh, this is why we expand rows inside a WITH statement. > > it should probably be fixed, but you should find something like > > WITH fn AS SELECT f(), > SELECT (fn).a, (fn).b > > will make your life better sure, but WITH is an optimization fence. I use a lot of views, and if you wrap your view with WITH, then your quals won't get pushed through. ditto if you use the 'OFFSET 0' hack to keep the subquery from being flattened out. merlin
hi, > I've complained many times that > select (f()).*; > > will execute f() once for each returned field of f() since the server > essentially expands that into: > > select f().a, f().b; > > try it yourself, see: > create function f(a out text, b out text) returns record as $$ > begin > perform pg_sleep(1); > a := 'a'; b := 'b'; end; > $$ language plpgsql immutable; i ran into this regularly too. when f() is expensive then i try to rewrite the query so that the function only get called once per row. # explain analyze select (f()).*; QUERY PLAN ------------------------------------------------------------------------------------------ Result (cost=0.00..0.51 rows=1 width=0) (actual time=2001.116..2001.117 rows=1 loops=1) Total runtime: 2001.123 ms # explain analyze select f.* from f() as f; QUERY PLAN ------------------------------------------------------------------------------------------------------- Function Scan on f (cost=0.25..0.26 rows=1 width=64) (actual time=1000.928..1000.928 rows=1 loops=1) Total runtime: 1000.937 ms regards, jan
On Tue, Mar 6, 2012 at 10:21 AM, Jan Otto <asche@me.com> wrote: > hi, > >> I've complained many times that >> select (f()).*; >> >> will execute f() once for each returned field of f() since the server >> essentially expands that into: >> >> select f().a, f().b; >> >> try it yourself, see: >> create function f(a out text, b out text) returns record as $$ >> begin >> perform pg_sleep(1); >> a := 'a'; b := 'b'; end; >> $$ language plpgsql immutable; > > > i ran into this regularly too. when f() is expensive then i try to rewrite the query so that the > function only get called once per row. > > # explain analyze select (f()).*; > QUERY PLAN > ------------------------------------------------------------------------------------------ > Result (cost=0.00..0.51 rows=1 width=0) (actual time=2001.116..2001.117 rows=1 loops=1) > Total runtime: 2001.123 ms > > # explain analyze select f.* from f() as f; > QUERY PLAN > ------------------------------------------------------------------------------------------------------- > Function Scan on f (cost=0.25..0.26 rows=1 width=64) (actual time=1000.928..1000.928 rows=1 loops=1) > Total runtime: 1000.937 ms yeah -- that's pretty neat, but doesn't seem fit a lot of the cases I bump into. In particular, when stuffing composite types in the field list. You need the type to come back as a scalar so you can expand it a wrapper (especially when layering views). merlin