Thread: Slow queries in PL/PGSQL function
I have a couple of queries in a PL/pgSQL function which execute very slowly (around one minute each) but execute in 0.5 seconds when not run from within the function. Is there any way to determine why this is happening? I couldn't figure out how to run EXPLAIN ANALYZE from within the function.

The queries aren't very complicated. Here is an example query and plan. About 240K rows in x_rbl_ips, 7.5M rows in filter_ips.

explain analyze SELECT DISTINCT i_ip
FROM x_rbl_ips
LEFT JOIN filter_ips ON x_rbl_ips.i_ip = filter_ips.i_filter_ip
WHERE x_rbl_ips.dts_last_modified > '2004-02-18 22:24:15.901689+00'
AND filter_ips.i_filter_ip IS NOT NULL
AND (i_filter_type_flags & X'02000000'::integer) <> X'02000000'::integer

QUERY PLAN
Unique  (cost=5703.70..5703.81 rows=23 width=4) (actual time=524.439..524.439 rows=0 loops=1)
  ->  Sort  (cost=5703.70..5703.75 rows=23 width=4) (actual time=524.438..524.438 rows=0 loops=1)
        Sort Key: x_rbl_ips.i_ip
        ->  Nested Loop  (cost=0.00..5703.18 rows=23 width=4) (actual time=524.028..524.028 rows=0 loops=1)
              ->  Seq Scan on x_rbl_ips  (cost=0.00..5616.56 rows=23 width=4) (actual time=412.738..415.842 rows=1738 loops=1)
                    Filter: (dts_last_modified > '2004-02-18 22:44:15.901689'::timestamp without time zone)
              ->  Index Scan using filter_ips_source_ip on filter_ips  (cost=0.00..3.75 rows=1 width=4) (actual time=0.061..0.061 rows=0 loops=1738)
                    Index Cond: ("outer".i_ip = filter_ips.i_filter_ip)
                    Filter: ((i_filter_ip IS NOT NULL) AND ((i_filter_type_flags & 33554432) <> 33554432))
Total runtime: 524.868 ms
On Thursday 19 February 2004 23:00, Jim Crate wrote:
> I have a couple queries in a PL/PGSQL function which execute very slowly
> (around one minute each) which execute in .5 second when not executed from
> within the function. Is there any way to determine why this is happening?
> I couldn't figure out how to run EXPLAIN ANALYZE from within the function.

You can't - hmm, looking here:
http://developer.postgresql.org/todo.php
I can't even see a TODO. I'll suggest it on the hackers list.

> explain analyze SELECT DISTINCT i_ip
> FROM x_rbl_ips
> LEFT JOIN filter_ips ON x_rbl_ips.i_ip = filter_ips.i_filter_ip
> WHERE x_rbl_ips.dts_last_modified > '2004-02-18 22:24:15.901689+00'
> AND filter_ips.i_filter_ip IS NOT NULL
> AND (i_filter_type_flags & X'02000000'::integer) <> X'02000000'::integer

I'm guessing that the values in your query are variables/parameters in the
plpgsql function? The problem is that the plan is compiled when the function
is first run, so it doesn't know what values you will use. You might tend to
use values that make sense to index, but it can't tell.

Try rephrasing this query as an EXECUTE ''query-string'' and see if that
makes the problem go away.

--
Richard Huxton
Archonet Ltd
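[Editor's note: Richard's EXECUTE suggestion, sketched as a complete function. This is an illustrative reconstruction, not Jim's actual function — the function name, return value, and loop body are invented, and pre-8.0 plpgsql syntax (quoted body, ALIAS FOR $1) is assumed to match the era of the thread.]

```sql
CREATE OR REPLACE FUNCTION count_recent_ips(timestamptz) RETURNS integer AS '
DECLARE
    since ALIAS FOR $1;
    rec   RECORD;
    n     integer := 0;
BEGIN
    -- Building the statement as a string forces it to be re-planned on
    -- each call, with the real value of "since" visible to the planner,
    -- instead of using the generic plan cached at first execution.
    FOR rec IN EXECUTE
        ''SELECT DISTINCT i_ip FROM x_rbl_ips ''
        || ''LEFT JOIN filter_ips ON x_rbl_ips.i_ip = filter_ips.i_filter_ip ''
        || ''WHERE x_rbl_ips.dts_last_modified > '' || quote_literal(since)
        || '' AND filter_ips.i_filter_ip IS NOT NULL''
    LOOP
        n := n + 1;
    END LOOP;
    RETURN n;
END;
' LANGUAGE plpgsql;
```

Note the use of quote_literal() to embed the timestamp safely in the query string; hand-concatenated quotes are a common source of bugs with this pattern.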
Richard Huxton <dev@archonet.com> writes:
> On Thursday 19 February 2004 23:00, Jim Crate wrote:
>> explain analyze SELECT DISTINCT i_ip
>> FROM x_rbl_ips
>> LEFT JOIN filter_ips ON x_rbl_ips.i_ip = filter_ips.i_filter_ip
>> WHERE x_rbl_ips.dts_last_modified > '2004-02-18 22:24:15.901689+00'
>> AND filter_ips.i_filter_ip IS NOT NULL
>> AND (i_filter_type_flags & X'02000000'::integer) <> X'02000000'::integer

> I'm guessing that the values in your query are variables/parameters in the
> plpgsql function? The problem is that the plan is compiled when the function
> is first run, so it doesn't know what values you will use. You might tend to
> use values that make sense to index, but it can't tell.

Specifically, the only part of that that looks indexable is the
dts_last_modified constraint. If it's always "dts_last_modified >
some-time-in-the-recent-past" then the range of values scanned is going
to be small enough to make an indexscan worthwhile. But if what the
planner sees is "dts_last_modified > variable" then it isn't gonna risk
an indexscan, because in the general case that could mean scanning a
large part of the table, and the indexscan would be tremendously slow.

What you can do to work around this (I'm assuming dts_last_modified
never contains values in the future) is add a dummy constraint:

	WHERE x_rbl_ips.dts_last_modified > variable
	AND x_rbl_ips.dts_last_modified <= now()
	AND other-stuff

Now what the planner sees is dts_last_modified being constrained to a
range, rather than an open-ended interval. It still has to guess about
how much of the index will be scanned, but its guess in this scenario
is much smaller and it should usually pick the indexscan instead.

BTW, in recent releases you can investigate planner choices involving
queries with variables by using PREPARE and EXPLAIN EXECUTE. For
example

	PREPARE myq(timestamptz) AS
	...
	WHERE x_rbl_ips.dts_last_modified > $1
	...

	EXPLAIN EXECUTE myq('2004-02-18 22:24:15.901689+00');

This allows you to exactly reproduce the conditions that the planner has
to work under when planning a query from a plpgsql function.

			regards, tom lane
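[Editor's note: Tom's two suggestions can be combined into one diagnostic session against the query from this thread. The statement below is my own assembly of his fragments, not taken verbatim from the post:]

```sql
PREPARE myq(timestamptz) AS
SELECT DISTINCT i_ip
FROM x_rbl_ips
LEFT JOIN filter_ips ON x_rbl_ips.i_ip = filter_ips.i_filter_ip
WHERE x_rbl_ips.dts_last_modified > $1
  AND x_rbl_ips.dts_last_modified <= now()  -- dummy upper bound closes the range
  AND filter_ips.i_filter_ip IS NOT NULL;

-- Shows the plan the backend would use with an unknown parameter value,
-- i.e. the same conditions a plpgsql function is planned under.
EXPLAIN EXECUTE myq('2004-02-18 22:24:15.901689+00');

DEALLOCATE myq;
```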
I have a very similar problem. I put the following SQL into a function:

SELECT VS.*, VL.TEL1, SC.CONTRACT_ID, SC.CONTRACT_REF, SC.MAX_HOURS, SC.MIN_HOURS,
  (SELECT COUNT(*) FROM TIMESHEET_DETAIL
   JOIN MAIN_ORDER ON (MAIN_ORDER.ORDER_ID = TIMESHEET_DETAIL.ORDER_ID
     AND MAIN_ORDER.CLIENT_ID = $3)
   WHERE TIMESHEET_DETAIL.CONTRACT_ID = SC.CONTRACT_ID) AS VISITS,
  (SELECT (SUM(R.DURATION+1))/60.0 FROM ORDER_REQT R
   JOIN STAFF_BOOKING B ON (B.REQT_ID = R.REQT_ID)
   JOIN BOOKING_PLAN BP ON (BP.BOOKING_ID = B.BOOKING_ID)
   WHERE B.CONTRACT_ID = SC.CONTRACT_ID
     AND BP.BOOKING_DATE BETWEEN $1 AND $2) AS RHOURS
FROM VSTAFF VS
JOIN STAFF_CONTRACT SC ON (SC.STAFF_ID = VS.STAFF_ID)
JOIN VLOCATION VL ON (VL.LOCATION_ID = VS.LOCATION_ID)
JOIN SEARCH_REQT_RESULT SR ON (SR.STAFF_ID = VS.STAFF_ID)
WHERE SR.SEARCH_ID = $4
  AND SC.CONTRACT_ID IN
    (SELECT C.CONTRACT_ID FROM STAFF_PRODUCT P, STAFF_CONTRACT C
     WHERE P.CONTRACT_ID = C.CONTRACT_ID AND C.STAFF_ID = VS.STAFF_ID
       AND P.PRODUCT_ID IN
         (SELECT PRODUCT_ID FROM SEARCH_ORDER_REQT WHERE SEARCH_ID = $4)
       AND C.AVAIL_DATE_FROM <= $1 AND C.AVAIL_DATE_TO >= $2
     GROUP BY C.CONTRACT_ID
     HAVING (COUNT(C.CONTRACT_ID) =
       (SELECT COUNT(DISTINCT PRODUCT_ID) FROM SEARCH_ORDER_REQT
        WHERE SEARCH_ID = $4)));

When executed from the client, substituting constants for the variables
($1, $2, $3, $4), it takes 1 second. When executed from within a function
using exactly the same values it takes 30 seconds!

Using your method to get the execution plan with variables I can see it
has switched to a sequential scan on the BOOKING_PLAN table (250K records)
instead of the index it uses otherwise. I have spent a while optimising
this and tested with quite a range of date values. It always uses the
indexes when executed from the client. The date ranges given are usually
quite small.

I am trying to migrate from MS SQL Server to PG and this has now stopped
me for a while. I could transfer all the stored procedures into client
code, but that seems very inefficient.
Is there some way to force the use of an index? Or at least get the
backend to substitute the parameters in a function before doing the first
query plan, so it has more typical values to work with?

Regards,
Gary.

On Fri, 20 Feb 2004 09:20:24 -0500, tgl@sss.pgh.pa.us (Tom Lane) wrote:

>Richard Huxton <dev@archonet.com> writes:
>> On Thursday 19 February 2004 23:00, Jim Crate wrote:
>>> explain analyze SELECT DISTINCT i_ip
>>> FROM x_rbl_ips
>>> LEFT JOIN filter_ips ON x_rbl_ips.i_ip = filter_ips.i_filter_ip
>>> WHERE x_rbl_ips.dts_last_modified > '2004-02-18 22:24:15.901689+00'
>>> AND filter_ips.i_filter_ip IS NOT NULL
>>> AND (i_filter_type_flags & X'02000000'::integer) <> X'02000000'::integer
>
>> I'm guessing that the values in your query are variables/parameters in the
>> plpgsql function? The problem is that the plan is compiled when the function
>> is first run, so it doesn't know what values you will use. You might tend to
>> use values that make sense to index, but it can't tell.
>
>Specifically, the only part of that that looks indexable is the
>dts_last_modified constraint. If it's always "dts_last_modified >
>some-time-in-the-recent-past" then the range of values scanned is going
>to be small enough to make an indexscan worthwhile. But if what the
>planner sees is "dts_last_modified > variable" then it isn't gonna risk
>an indexscan, because in the general case that could mean scanning a
>large part of the table, and the indexscan would be tremendously slow.
>
>What you can do to work around this (I'm assuming dts_last_modified
>never contains values in the future) is add a dummy constraint:
>
>	WHERE x_rbl_ips.dts_last_modified > variable
>	AND x_rbl_ips.dts_last_modified <= now()
>	AND other-stuff
>
>Now what the planner sees is dts_last_modified being constrained to a
>range, rather than an open-ended interval. It still has to guess about
>how much of the index will be scanned, but its guess in this scenario
>is much smaller and it should usually pick the indexscan instead.
>
>BTW, in recent releases you can investigate planner choices involving
>queries with variables by using PREPARE and EXPLAIN EXECUTE. For
>example
>
>	PREPARE myq(timestamptz) AS
>	...
>	WHERE x_rbl_ips.dts_last_modified > $1
>	...
>
>	EXPLAIN EXECUTE myq('2004-02-18 22:24:15.901689+00');
>
>This allows you to exactly reproduce the conditions that the planner has
>to work under when planning a query from a plpgsql function.
>
>			regards, tom lane
Gary Doades <gpd@cwcom.net> writes:
> Is there someway to force the use of an index. Or at least get the
> backend to substitue the parameters in a function before doing the
> first query plan so it has more typical values to work with?

Could we see the EXPLAIN ANALYZE output for your problem query?
Table schemas (column data types and available indexes) are necessary
background for this type of question as well.

You might want to take the question to pgsql-performance, too ...
it's a tad off topic for -general.

			regards, tom lane
Thanks for the help with this. I followed Richard's advice and changed
the function to EXECUTE the query, so it would be planned with the actual
values at the time the query is run, and it worked as expected.

>Specifically, the only part of that that looks indexable is the
>dts_last_modified constraint. If it's always "dts_last_modified >
>some-time-in-the-recent-past" then the range of values scanned is going
>to be small enough to make an indexscan worthwhile. But if what the
>planner sees is "dts_last_modified > variable" then it isn't gonna risk
>an indexscan, because in the general case that could mean scanning a
>large part of the table, and the indexscan would be tremendously slow.

Actually, dts_last_modified was not indexed. A sequential scan on
dts_last_modified in x_rbl_ips was very fast even with 250K records,
around 0.5 seconds for the entire query. The only reason I could think
that the query would take up to several minutes when run from the
function is that it was joining the entire 250K rows of x_rbl_ips with
the 7M rows of filter_ips before filtering on
x_rbl_ips.dts_last_modified.

>What you can do to work around this (I'm assuming dts_last_modified
>never contains values in the future) is add a dummy constraint:
>
>	WHERE x_rbl_ips.dts_last_modified > variable
>	AND x_rbl_ips.dts_last_modified <= now()
>	AND other-stuff
>
>Now what the planner sees is dts_last_modified being constrained to a
>range, rather than an open-ended interval. It still has to guess about
>how much of the index will be scanned, but its guess in this scenario
>is much smaller and it should usually pick the indexscan instead.

Running the query with FOR loop_rec IN EXECUTE turned out to be so easy
and fast that I didn't try this. However, as I mentioned,
x_rbl_ips.dts_last_modified wasn't indexed in the first place, so I
don't know if it would have helped.

>BTW, in recent releases you can investigate planner choices involving
>queries with variables by using PREPARE and EXPLAIN EXECUTE.

This will be very useful for future testing. I spent quite a bit of time
looking through the docs and hadn't found this yet.

Thanks for the help.

--
Jim Crate
Deep Sky Technologies, Inc.
Thanks for the reply. I will move this onto the performance group, except
that for some odd reason I can't see a performance group on usenet. I
will subscribe to the mailing list.

I will gladly upload the schema (script) or anything else that anyone
wants, but it may not be appropriate for a newsgroup or mailing list. If
you want anything else please let me know where to send it.

This is not the only query that is causing me a problem. I have about 30
stored procedures to move from SQL Server. 15 of the 21 so far moved over
to PG have similar problems: very fast when submitted as a complete SQL
statement with constants, very slow when used as functions.

Here is the explain for this query using constants:

QUERY PLAN
Hash Join  (cost=629.34..213908.73 rows=34 width=151) (actual time=110.000..1518.000 rows=98 loops=1)
  Hash Cond: ("outer".staff_id = "inner".staff_id)
  Join Filter: (subplan)
  InitPlan
    ->  Seq Scan on wruserarea  (cost=1.46..3.29 rows=1 width=4) (actual time=0.000..0.000 rows=1 loops=1)
          Filter: ((uid = $4) AND (area_id = 1))
          InitPlan
            ->  Seq Scan on wruser  (cost=0.00..1.46 rows=1 width=4) (actual time=0.000..0.000 rows=1 loops=1)
                  Filter: ((username)::name = "current_user"())
  ->  Seq Scan on staff_contract sc  (cost=0.00..10.35 rows=335 width=34) (actual time=0.000..1.000 rows=335 loops=1)
  ->  Hash  (cost=625.88..625.88 rows=66 width=125) (actual time=10.000..10.000 rows=0 loops=1)
        ->  Nested Loop  (cost=56.55..625.88 rows=66 width=125) (actual time=6.000..10.000 rows=98 loops=1)
              ->  Merge Join  (cost=56.55..73.06 rows=101 width=111) (actual time=6.000..7.000 rows=98 loops=1)
                    Merge Cond: ("outer".staff_id = "inner".staff_id)
                    ->  Index Scan using staff_pkey on staff  (cost=7.74..21.90 rows=332 width=107) (actual time=4.000..4.000 rows=332 loops=1)
                          Filter: ((hashed subplan) OR $5)
                          SubPlan
                            ->  Seq Scan on staff_area  (cost=3.16..7.52 rows=88 width=4) (actual time=0.000..0.000 rows=18 loops=1)
                                  Filter: ((hashed subplan) OR (area_id = 1))
                                  SubPlan
                                    ->  Seq Scan on wruserarea  (cost=1.46..3.16 rows=3 width=4) (actual time=0.000..0.000 rows=1 loops=1)
                                          Filter: (uid = $1)
                                          InitPlan
                                            ->  Seq Scan on wruser  (cost=0.00..1.46 rows=1 width=4) (actual time=0.000..0.000 rows=1 loops=1)
                                                  Filter: ((username)::name = "current_user"())
                    ->  Sort  (cost=48.81..49.06 rows=101 width=4) (actual time=2.000..2.000 rows=98 loops=1)
                          Sort Key: sr.staff_id
                          ->  Seq Scan on search_reqt_result sr  (cost=0.00..45.45 rows=101 width=4) (actual time=0.000..2.000 rows=98 loops=1)
                                Filter: (search_id = 143)
              ->  Index Scan using location_pkey on "location"  (cost=0.00..5.46 rows=1 width=18) (actual time=0.000..0.000 rows=1 loops=98)
                    Index Cond: ("location".location_id = "outer".location_id)
                    Filter: ((area_id = 1) OR (subplan))
                    SubPlan
                      ->  Seq Scan on wruserarea  (cost=1.46..3.44 rows=2 width=4) (never executed)
                            Filter: ((uid = $6) AND ((area_id = 1) OR (area_id = $7)))
                            InitPlan
                              ->  Seq Scan on wruser  (cost=0.00..1.46 rows=1 width=4) (never executed)
                                    Filter: ((username)::name = "current_user"())
  SubPlan
    ->  GroupAggregate  (cost=3.10..639.23 rows=1 width=4) (actual time=1.765..1.765 rows=1 loops=98)
          Filter: (count(contract_id) = $9)
          InitPlan
            ->  Aggregate  (cost=1.55..1.55 rows=1 width=4) (actual time=0.000..0.000 rows=1 loops=1)
                  ->  Seq Scan on search_order_reqt  (cost=0.00..1.55 rows=1 width=4) (actual time=0.000..0.000 rows=1 loops=1)
                        Filter: (search_id = 143)
          ->  Nested Loop IN Join  (cost=1.55..637.67 rows=1 width=4) (actual time=1.439..1.765 rows=1 loops=98)
                Join Filter: ("outer".product_id = "inner".product_id)
                ->  Nested Loop  (cost=0.00..631.93 rows=186 width=8) (actual time=0.347..1.378 rows=245 loops=98)
                      ->  Index Scan using staff_contract_pkey on staff_contract c  (cost=0.00..15.77 rows=1 width=4) (actual time=0.255..0.449 rows=1 loops=98)
                            Filter: ((staff_id = $8) AND (avail_date_from <= '2003-06-12'::date) AND (avail_date_to >= '2003-06-18'::date))
                      ->  Index Scan using staff_product_contract_id_key on staff_product p  (cost=0.00..613.80 rows=189 width=8) (actual time=0.061..0.571 rows=245 loops=98)
                            Index Cond: (p.contract_id = "outer".contract_id)
                ->  Materialize  (cost=1.55..1.56 rows=1 width=4) (actual time=0.000..0.000 rows=1 loops=23972)
                      ->  Seq Scan on search_order_reqt  (cost=0.00..1.55 rows=1 width=4) (actual time=0.000..0.000 rows=1 loops=1)
                            Filter: (search_id = 143)
    ->  Aggregate  (cost=2252.27..2252.27 rows=1 width=2) (actual time=12.673..12.673 rows=1 loops=98)
          ->  Nested Loop  (cost=2049.93..2252.19 rows=28 width=2) (actual time=8.959..12.612 rows=13 loops=98)
                ->  Hash Join  (cost=2049.93..2167.47 rows=28 width=4) (actual time=8.908..12.337 rows=13 loops=98)
                      Hash Cond: ("outer".booking_id = "inner".booking_id)
                      ->  Index Scan using booking_plan_idx2 on booking_plan bp  (cost=0.00..98.15 rows=3822 width=4) (actual time=0.755..6.520 rows=4693 loops=98)
                            Index Cond: ((booking_date >= '2003-06-12'::date) AND (booking_date <= '2003-06-18'::date))
                      ->  Hash  (cost=2048.37..2048.37 rows=624 width=8) (actual time=2.694..2.694 rows=0 loops=98)
                            ->  Index Scan using staff_book_idx5 on staff_booking b  (cost=0.00..2048.37 rows=624 width=8) (actual time=0.041..2.031 rows=358 loops=98)
                                  Index Cond: (contract_id = $0)
                ->  Index Scan using order_reqt_pkey on order_reqt r  (cost=0.00..3.01 rows=1 width=6) (actual time=0.014..0.014 rows=1 loops=1312)
                      Index Cond: ("outer".reqt_id = r.reqt_id)
    ->  Aggregate  (cost=3387.38..3387.38 rows=1 width=0) (actual time=0.806..0.806 rows=1 loops=98)
          ->  Nested Loop  (cost=0.00..3387.37 rows=2 width=0) (actual time=0.765..0.786 rows=3 loops=98)
                ->  Index Scan using main_order_idx on main_order  (cost=0.00..5.86 rows=2 width=4) (actual time=0.041..0.051 rows=1 loops=98)
                      Index Cond: (client_id = 1011001947)
                ->  Index Scan using timesheet_detail_idx on timesheet_detail  (cost=0.00..1690.72 rows=3 width=4) (actual time=0.714..0.735 rows=3 loops=98)
                      Index Cond: ("outer".order_id = timesheet_detail.order_id)
                      Filter: (contract_id = $0)
Total runtime: 1530.000 ms
And here is the explain for the same query with parameters:

QUERY PLAN
Hash Join  (cost=651.46..450445.01 rows=35 width=151) (actual time=304.000..23814.000 rows=98 loops=1)
  Hash Cond: ("outer".staff_id = "inner".staff_id)
  Join Filter: (subplan)
  InitPlan
    ->  Seq Scan on wruserarea  (cost=1.46..3.29 rows=1 width=4) (actual time=0.000..0.000 rows=1 loops=1)
          Filter: ((uid = $4) AND (area_id = 1))
          InitPlan
            ->  Seq Scan on wruser  (cost=0.00..1.46 rows=1 width=4) (actual time=0.000..0.000 rows=1 loops=1)
                  Filter: ((username)::name = "current_user"())
  ->  Seq Scan on staff_contract sc  (cost=0.00..10.35 rows=335 width=34) (actual time=0.000..1.000 rows=335 loops=1)
  ->  Hash  (cost=648.00..648.00 rows=68 width=125) (actual time=8.000..8.000 rows=0 loops=1)
        ->  Nested Loop  (cost=56.71..648.00 rows=68 width=125) (actual time=4.000..8.000 rows=98 loops=1)
              ->  Merge Join  (cost=56.71..73.28 rows=105 width=111) (actual time=3.000..6.000 rows=98 loops=1)
                    Merge Cond: ("outer".staff_id = "inner".staff_id)
                    ->  Index Scan using staff_pkey on staff  (cost=7.74..21.90 rows=332 width=107) (actual time=1.000..2.000 rows=332 loops=1)
                          Filter: ((hashed subplan) OR $5)
                          SubPlan
                            ->  Seq Scan on staff_area  (cost=3.16..7.52 rows=88 width=4) (actual time=0.000..0.000 rows=18 loops=1)
                                  Filter: ((hashed subplan) OR (area_id = 1))
                                  SubPlan
                                    ->  Seq Scan on wruserarea  (cost=1.46..3.16 rows=3 width=4) (actual time=0.000..0.000 rows=1 loops=1)
                                          Filter: (uid = $1)
                                          InitPlan
                                            ->  Seq Scan on wruser  (cost=0.00..1.46 rows=1 width=4) (actual time=0.000..0.000 rows=1 loops=1)
                                                  Filter: ((username)::name = "current_user"())
                    ->  Sort  (cost=48.97..49.24 rows=105 width=4) (actual time=2.000..2.000 rows=98 loops=1)
                          Sort Key: sr.staff_id
                          ->  Seq Scan on search_reqt_result sr  (cost=0.00..45.45 rows=105 width=4) (actual time=0.000..2.000 rows=98 loops=1)
                                Filter: (search_id = $4)
              ->  Index Scan using location_pkey on "location"  (cost=0.00..5.46 rows=1 width=18) (actual time=0.010..0.010 rows=1 loops=98)
                    Index Cond: ("location".location_id = "outer".location_id)
                    Filter: ((area_id = 1) OR (subplan))
                    SubPlan
                      ->  Seq Scan on wruserarea  (cost=1.46..3.44 rows=2 width=4) (never executed)
                            Filter: ((uid = $6) AND ((area_id = 1) OR (area_id = $7)))
                            InitPlan
                              ->  Seq Scan on wruser  (cost=0.00..1.46 rows=1 width=4) (never executed)
                                    Filter: ((username)::name = "current_user"())
  SubPlan
    ->  GroupAggregate  (cost=3.10..652.99 rows=1 width=4) (actual time=1.847..1.847 rows=1 loops=98)
          Filter: (count(contract_id) = $9)
          InitPlan
            ->  Aggregate  (cost=1.55..1.55 rows=1 width=4) (actual time=0.000..0.000 rows=1 loops=1)
                  ->  Seq Scan on search_order_reqt  (cost=0.00..1.55 rows=1 width=4) (actual time=0.000..0.000 rows=1 loops=1)
                        Filter: (search_id = $4)
          ->  Nested Loop IN Join  (cost=1.55..651.43 rows=1 width=4) (actual time=1.449..1.837 rows=1 loops=98)
                Join Filter: ("outer".product_id = "inner".product_id)
                ->  Nested Loop  (cost=0.00..645.69 rows=186 width=8) (actual time=0.429..1.306 rows=245 loops=98)
                      ->  Index Scan using staff_contract_pkey on staff_contract c  (cost=0.00..17.45 rows=1 width=4) (actual time=0.276..0.398 rows=1 loops=98)
                            Filter: ((staff_id = $8) AND ((avail_date_from)::timestamp without time zone <= $1) AND ((avail_date_to)::timestamp without time zone >= $1))
                      ->  Index Scan using staff_product_contract_id_key on staff_product p  (cost=0.00..625.85 rows=191 width=8) (actual time=0.133..0.582 rows=245 loops=98)
                            Index Cond: (p.contract_id = "outer".contract_id)
                ->  Materialize  (cost=1.55..1.56 rows=1 width=4) (actual time=0.000..0.001 rows=1 loops=23972)
                      ->  Seq Scan on search_order_reqt  (cost=0.00..1.55 rows=1 width=4) (actual time=0.000..0.000 rows=1 loops=1)
                            Filter: (search_id = $4)
    ->  Aggregate  (cost=8760.23..8760.24 rows=1 width=2) (actual time=236.245..236.245 rows=1 loops=98)
          ->  Nested Loop  (cost=2009.53..8760.21 rows=9 width=2) (actual time=226.378..236.194 rows=14 loops=98)
                ->  Hash Join  (cost=2009.53..8732.97 rows=9 width=4) (actual time=226.286..235.755 rows=14 loops=98)
                      Hash Cond: ("outer".booking_id = "inner".booking_id)
                      ->  Seq Scan on booking_plan bp  (cost=0.00..6716.98 rows=1274 width=4) (actual time=209.684..229.684 rows=4704 loops=98)
                            Filter: (((booking_date)::timestamp without time zone >= $1) AND ((booking_date)::timestamp without time zone <= $2))
                      ->  Hash  (cost=2008.02..2008.02 rows=606 width=8) (actual time=3.357..3.357 rows=0 loops=98)
                            ->  Index Scan using staff_book_idx5 on staff_booking b  (cost=0.00..2008.02 rows=606 width=8) (actual time=0.163..3.061 rows=358 loops=98)
                                  Index Cond: (contract_id = $0)
                ->  Index Scan using order_reqt_pkey on order_reqt r  (cost=0.00..3.01 rows=1 width=6) (actual time=0.024..0.026 rows=1 loops=1326)
                      Index Cond: ("outer".reqt_id = r.reqt_id)
    ->  Aggregate  (cost=3443.91..3443.91 rows=1 width=0) (actual time=4.745..4.745 rows=1 loops=98)
          ->  Nested Loop  (cost=0.00..3443.90 rows=2 width=0) (actual time=4.531..4.724 rows=3 loops=98)
                ->  Index Scan using main_order_idx on main_order  (cost=0.00..5.87 rows=2 width=4) (actual time=0.143..0.153 rows=1 loops=98)
                      Index Cond: (client_id = $3)
                ->  Index Scan using timesheet_detail_idx on timesheet_detail  (cost=0.00..1718.97 rows=3 width=4) (actual time=4.378..4.571 rows=3 loops=98)
                      Index Cond: ("outer".order_id = timesheet_detail.order_id)
                      Filter: (contract_id = $0)
Total runtime: 23853.000 ms

The only real difference I can see is the booking_plan table using a
sequential scan. Any help appreciated.

Thanks,
Gary.

On Sat, 21 Feb 2004 11:15:50 -0500, tgl@sss.pgh.pa.us (Tom Lane) wrote:

>Gary Doades <gpd@cwcom.net> writes:
>> Is there someway to force the use of an index. Or at least get the
>> backend to substitue the parameters in a function before doing the
>> first query plan so it has more typical values to work with?
>
>Could we see the EXPLAIN ANALYZE output for your problem query?
>Table schemas (column data types and available indexes) are necessary
>background for this type of question as well.
>
>You might want to take the question to pgsql-performance, too ...
>it's a tad off topic for -general.
>
>			regards, tom lane
It's OK, I'm an idiot. I read soooo many times that you need to get the
data types in the query the same as the column types or indexes won't
work. So I go and get it wrong!

I had defined the parameter to the function as timestamp (equivalent to
SQL Server datetime), but the column data type is date. SQL Server does
not have a date type, so the datetime parameters work there, but the
straight conversion causes PG not to use the indexes.

I guess that many might be put off PG because of this without bothering
to find out why. I think I read somewhere that there are plans to improve
PG's behaviour in this area; is this so?

Cheers,
Gary.

On Sat, 21 Feb 2004 11:15:50 -0500, tgl@sss.pgh.pa.us (Tom Lane) wrote:

>Gary Doades <gpd@cwcom.net> writes:
>> Is there someway to force the use of an index. Or at least get the
>> backend to substitue the parameters in a function before doing the
>> first query plan so it has more typical values to work with?
>
>Could we see the EXPLAIN ANALYZE output for your problem query?
>Table schemas (column data types and available indexes) are necessary
>background for this type of question as well.
>
>You might want to take the question to pgsql-performance, too ...
>it's a tad off topic for -general.
>
>			regards, tom lane
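[Editor's note: the fix Gary describes can be sketched as follows. The prepared-statement name is illustrative, not from the thread. Declaring the parameters as date, matching booking_plan.booking_date, removes the (booking_date)::timestamp cast that was defeating booking_plan_idx2 in the parameterized plan above:]

```sql
-- With date parameters the planner sees a plain date range and can use
-- the index on booking_date. With timestamp parameters, the plan showed
-- Filter: ((booking_date)::timestamp without time zone >= $1), which is
-- not indexable, hence the sequential scan on booking_plan.
PREPARE rhours_check(date, date) AS
SELECT SUM(R.DURATION + 1) / 60.0
FROM ORDER_REQT R
JOIN STAFF_BOOKING B ON (B.REQT_ID = R.REQT_ID)
JOIN BOOKING_PLAN BP ON (BP.BOOKING_ID = B.BOOKING_ID)
WHERE BP.BOOKING_DATE BETWEEN $1 AND $2;

-- Should now show an Index Cond on booking_date rather than a Filter.
EXPLAIN EXECUTE rhours_check('2003-06-12', '2003-06-18');
```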