Thread: Important speed difference between a query and a function with the same query
Important speed difference between a query and a function with the same query
From
Frederic Jolliton
Date:
Hi, (PostgreSQL 7.3.2 on i386-portbld-freebsd4.7, compiled by GCC 2.95.4) I've a curious performance problem with a function returning set of rows. The query alone is very fast, but not when called from the function. To "emulate" a parametred view¹, I created a function as follow: CREATE FUNCTION get_info (integer) RETURNS SETOF type_get_info AS '...' <- here the query show below, where 'LIMIT $1' is used instead of 'LIMIT 10' LANGUAGE sql; The table table1 have 330K rows, and table2 have 3K rows. When I run the following query (prefixed with SELECT * to try to get the same behavior that the second query), I obtain very good time. database=# SELECT * FROM ( (SELECT a.field1,a.field2,a.field3,b.field3,b.field4,a.field5 FROM table1 AS a, table1 AS b WHERE a.field6=b.field4 ORDER BY a.field6 DESC LIMIT 10) UNION (SELECT a.field1,a.field2,b.field3,a.field3,a.field4,b.field5 FROM table2 AS a, table1 AS b WHERE a.field4=b.field6 ORDER BY a.field4 DESC LIMIT 10) ORDER BY field4 DESC LIMIT 10 ) AS type_get_info; [...] (10 rows) Time: 185.86 ms But, when I run the function (with 10 as parameter, but even 1 is slow) I get poor time: database=# SELECT * FROM get_info(10); [...] (10 rows) Time: 32782.26 ms database=# (even after a VACUUM FULL ANALYZE, and REINDEX of indexes used in the queries) What is curious is that I remember that the function was fast at a time.. What is the difference between the two case ? [1] Is there another solution to this 'hack' ? I can't simply create a view and use 'LIMIT 10' because intermediate SELECT have be limited too (to avoid UNION with 300K rows where only the first 10 are of interest to me.) -- Frédéric Jolliton
> (PostgreSQL 7.3.2 on i386-portbld-freebsd4.7, compiled by GCC 2.95.4) > > I've a curious performance problem with a function returning set of > rows. The query alone is very fast, but not when called from the > function. > > To "emulate" a parametred view, I created a function as follow: > > CREATE FUNCTION get_info (integer) RETURNS SETOF type_get_info > AS '...' <- here the query show below, where 'LIMIT $1' is used instead of 'LIMIT 10' > LANGUAGE sql; Setting enable_seqscan to off give same result speed between the query and the function ! So, the query in the function is not using index but the exact same query alone does ! Is there an explanation ? -- Frédéric Jolliton
Frederic Jolliton <fred-pg@jolliton.com> writes: >> To "emulate" a parametred view, I created a function as follow: >> >> CREATE FUNCTION get_info (integer) RETURNS SETOF type_get_info >> AS '...' <- here the query show below, where 'LIMIT $1' is used instead of 'LIMIT 10' >> LANGUAGE sql; > So, the query in the function is not using index but the exact same > query alone does ! But it's not the same query, is it? With "LIMIT $1" the planner can't know what the limit value is exactly, so it has to generate a plan that won't be too unreasonable for either a small or a large limit. regards, tom lane
On Thu, 24 Apr 2003, Frederic Jolliton wrote: > > (PostgreSQL 7.3.2 on i386-portbld-freebsd4.7, compiled by GCC 2.95.4) > > > > I've a curious performance problem with a function returning set of > > rows. The query alone is very fast, but not when called from the > > function. > > > > To "emulate" a parametred view, I created a function as follow: > > > > CREATE FUNCTION get_info (integer) RETURNS SETOF type_get_info > > AS '...' <- here the query show below, where 'LIMIT $1' is used instead of 'LIMIT 10' > > LANGUAGE sql; > > Setting enable_seqscan to off give same result speed between the query > and the function ! > > So, the query in the function is not using index but the exact same > query alone does ! > > Is there an explanation ? My guess is that limit $1 is assuming a larger number of rows when planning the queries, large enough that it expects seqscan to be better (assuming the limit is what it expects). It's probably not going to plan that query each time the function is called so it's not going to know whether you're calling with a small number (index scan may be better) or a large number (seq scan may be better). For example, if you sent 100000, the index scan might be a loser. Perhaps plpgsql with EXECUTE would work better for that, although it's likely to have some general overhead.
> Frederic Jolliton <fred-pg@jolliton.com> writes: >>> To "emulate" a parametred view, I created a function as follow: >>> >>> CREATE FUNCTION get_info (integer) RETURNS SETOF type_get_info >>> AS '...' <- here the query show below, where 'LIMIT $1' is used instead of 'LIMIT 10' >>> LANGUAGE sql; > >> So, the query in the function is not using index but the exact same >> query alone does ! Tom Lane <tgl@sss.pgh.pa.us> writes: > But it's not the same query, is it? With "LIMIT $1" the planner can't > know what the limit value is exactly, so it has to generate a plan that > won't be too unreasonable for either a small or a large limit. Ok. So the query is optimized once and not each time.. I understand now. But, since I "know" better that PostgreSQL that query must use index in most of case, can I force in some manner the function when declaring it to take this in account ? I suppose (not tested) that setting enable_seqscan just before will probably do it, but what about dump/restore of the database when recreating the function and keep this "fix" automatically ?
> On Thu, 24 Apr 2003, Frederic Jolliton wrote: >> > CREATE FUNCTION get_info (integer) RETURNS SETOF type_get_info >> > AS '...' <- here the query show below, where 'LIMIT $1' is used instead of 'LIMIT 10' >> > LANGUAGE sql; >> >> Setting enable_seqscan to off give same result speed between the query >> and the function ! >> >> So, the query in the function is not using index but the exact same >> query alone does ! >> >> Is there an explanation ? Stephan Szabo <sszabo@megazone23.bigpanda.com> writes: > My guess is that limit $1 is assuming a larger number of rows when > planning the queries, large enough that it expects seqscan to be > better (assuming the limit is what it expects). It's probably not > going to plan that query each time the function is called so it's > not going to know whether you're calling with a small number (index > scan may be better) or a large number (seq scan may be better). For > example, if you sent 100000, the index scan might be a loser. > > Perhaps plpgsql with EXECUTE would work better for that, although > it's likely to have some general overhead. The server is rather fast, and the query return 10 to 50 rows in most case. So, this is probably a solution, even if it's not very clean. (Well, I have to find an example to RETURN the result of EXECUTE..) But, what I don't understand is why enable_seqscan change something if the query is already planed. -- Frédéric Jolliton
On Thu, 24 Apr 2003, Frederic Jolliton wrote: > > On Thu, 24 Apr 2003, Frederic Jolliton wrote: > > Perhaps plpgsql with EXECUTE would work better for that, although > > it's likely to have some general overhead. > > The server is rather fast, and the query return 10 to 50 rows in most > case. So, this is probably a solution, even if it's not very > clean. (Well, I have to find an example to RETURN the result of > EXECUTE..) Check out http://techdocs.postgresql.org/guides/SetReturningFunctions specifically the GetRows() function for an example of using for in execute with set returning functions.
> On Thu, 24 Apr 2003, Frederic Jolliton wrote: [...] >> The server is rather fast, and the query return 10 to 50 rows in >> most case. So, this is probably a solution, even if it's not very >> clean. (Well, I have to find an example to RETURN the result of >> EXECUTE..) Stephan Szabo <sszabo@megazone23.bigpanda.com> writes: > Check out > http://techdocs.postgresql.org/guides/SetReturningFunctions > > specifically the GetRows() function for an example of using for in > execute with set returning functions. Oh right. Thanks you for pointing out this. -- Frédéric Jolliton
Frederic Jolliton kirjutas N, 24.04.2003 kell 19:33: > > Frederic Jolliton <fred-pg@jolliton.com> writes: > >>> To "emulate" a parametred view, I created a function as follow: > >>> > >>> CREATE FUNCTION get_info (integer) RETURNS SETOF type_get_info > >>> AS '...' <- here the query show below, where 'LIMIT $1' is used instead of 'LIMIT 10' > >>> LANGUAGE sql; > > > >> So, the query in the function is not using index but the exact same > >> query alone does ! > > Tom Lane <tgl@sss.pgh.pa.us> writes: > > But it's not the same query, is it? With "LIMIT $1" the planner can't > > know what the limit value is exactly, so it has to generate a plan that > > won't be too unreasonable for either a small or a large limit. > > Ok. So the query is optimized once and not each time.. I understand > now. > > But, since I "know" better that PostgreSQL that query must use index > in most of case, can I force in some manner the function when > declaring it to take this in account ? You could define two functions - one for small sets with constant LIMITs (maybe 50) in UNION parts, and another with $1. Then use accordingly. > I suppose (not tested) that > setting enable_seqscan just before will probably do it, but what about > dump/restore of the database when recreating the function and keep > this "fix" automatically ? ------------- Hannu
Re: Important speed difference between a query and a function with the same query
From
"Jim C. Nasby"
Date:
On Wed, Apr 23, 2003 at 07:53:55PM +0200, Frederic Jolliton wrote: > CREATE FUNCTION get_info (integer) RETURNS SETOF type_get_info > AS '...' <- here the query show below, where 'LIMIT $1' is used instead of 'LIMIT 10' > LANGUAGE sql; You should probably define the function to be STABLE. LANGUAGE sql STABLE; See http://www.postgresql.org/docs/view.php?version=7.3&idoc=1&file=sql-createfunction.html for more info. -- Jim C. Nasby (aka Decibel!) jim@nasby.net Member: Triangle Fraternity, Sports Car Club of America Give your computer some brain candy! www.distributed.net Team #1828 Windows: "Where do you want to go today?" Linux: "Where do you want to go tomorrow?" FreeBSD: "Are you guys coming, or what?"