Thread: Query very slow when in plpgsql function
Hi everyone. I am using postgresql 8.3.7 on Fedora Core 10. I have 1 table called evaluation which contains about 1 million records, and another called evaluationentry which contains about 9 million records. evaluationentry.veto and evaluation.relevancedt both have indexes on them. I have run ANALYZE against the tables to ensure I have stats. I execute query: SELECT T1.evaluationid, MIN(T2.evalresult) FROM sharemgr.evaluation T1 INNER JOIN sharemgr.evaluationentry T2 ON (T1.evaluationid = T2.evaluationid) WHERE T1.relevancedt BETWEEN CAST('2009-06-15 00:00:00' AS TIMESTAMP) AND CAST('2009-06-15 23:59:59' AS TIMESTAMP) AND T2.veto = 'Y' GROUP BY T1.evaluationid HAVING MIN(T2.evalresult) = 100 and it returns about 10 results (correctly) in about 4 seconds - which I am more than happy with given the underlying hardware and virtualization layer. QUERY PLAN: === HashAggregate (cost=197446.95..197454.58 rows=436 width=9) (actual time=386.877..387.193 rows=10 loops=1) Filter: (min(t2.evalresult) = 100::numeric) -> Nested Loop (cost=0.00..197423.83 rows=3082 width=9) (actual time=0.319..302.310 rows=4438 loops=1) -> Index Scan using evaluation_i3 on evaluation t1 (cost=0.00..249.97 rows=436 width=4) (actual time=0.130..12.633 rows=634 loops=1) Index Cond: ((relevancedt >= '2009-12-14 00:00:00'::timestamp without time zone) AND (relevancedt <= '2009-12-14 23:59:59'::timestamp without time zone)) -> Index Scan using evaluationentry_i1 on evaluationentry t2 (cost=0.00..440.57 rows=933 width=9) (actual time=0.031..0.172 rows=7 loops=634) Index Cond: (t2.evaluationid = t1.evaluationid) Filter: (t2.veto = 'Y'::bpchar) Total runtime: 387.669 ms" === I then made this query into a function so I can pass in the 2 timestamps, and return T1.evaluationid as a SETOF INT by doing FOR matchRecord IN same query as above LOOP RETURN NEXT matchRecord.evaluationid; END LOOP; And when I execute the function with the same parameters it takes well over 5 minutes to execute. It seems as though inside a function, the optimizer wants to tablescan my 8M row table. Is there a way that I can see the query plans that my functions are using? Any suggestions welcome... chris
On 1/01/2010 12:05 AM, Chris McDonald wrote: > FOR matchRecord IN > same query as above > LOOP > RETURN NEXT matchRecord.evaluationid; > END LOOP; > > And when I execute the function with the same parameters it takes well > over 5 minutes to execute. It's as if you PREPAREd the query once, and each time you run the function it gets EXECUTEd. The query plan is cached. Unfortunately, when PostgreSQL builds a prepared statement (or query in a function) it doesn't have knowledge of exact parameter values, which limit its use of statistics for query optimisation. Currently there is no way to ask PostgreSQL to re-plan such queries at each execution. You have to force it by using a query that cannot be cached. In PL/PgSQL the usual method is to use EXECUTE ... USING to provide the query as text that is parsed and executed each time the function gets invoked. > It seems as though inside a function, the optimizer wants to tablescan > my 8M row table. Is there a way that I can see the query plans that my > functions are using? Not directly. However, if you PREPARE your query, then EXPLAIN ANALYZE EXECUTE it with the parameters you use, you'll see the same effects. (Hmm, this needs to be a FAQ) -- Craig Ringer
-----BEGIN PGP SIGNED MESSAGE----- Hash: RIPEMD160 > And when I execute the function with the same parameters it takes well > over 5 minutes to execute. > > It seems as though inside a function, the optimizer wants to tablescan > my 8M row table. Is there a way that I can see the query plans that my > functions are using? See my blog post on this: http://blog.endpoint.com/2008/12/why-is-my-function-slow.html It's also in the first page of hits when you google the subject line of this thread. :) - -- Greg Sabino Mullane greg@turnstep.com End Point Corporation http://www.endpoint.com/ PGP Key: 0x14964AC8 200912311751 http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8 -----BEGIN PGP SIGNATURE----- iEYEAREDAAYFAks9Kz8ACgkQvJuQZxSWSsj72QCgvrcUOTF8cXyVkIZ29ky3YOmp p+IAoOmgyTICVXBmndHADua3ypNC0ctK =dZtE -----END PGP SIGNATURE-----
Thanks for your assistance guys, and special thanks to Greg for his blog page. This is exactly my problem which is resolved using dynamic SQL. I can now see how the function was behaving badly, and have been able to fix the query. Once again, thanks guys - and Happy New Year from a snow-covered part of the UK. chris