Thread: slow query using sub select
Hi, I am having a problem with a sub select query being kinda slow. The query is as follows: select batterycode, batterydescription, observationdate from Battery t1 where patientidentifier=611802158 and observationdate = (select max(observationdate) from Battery t2 where t2.batterycode=t1.batterycode and patientidentifier=611802158) order by batterydescription. explain analyze: 'Sort (cost=1697.16..1697.16 rows=1 width=31) (actual time=910.721..910.729 rows=22 loops=1)' ' Sort Key: batterydescription' ' -> Index Scan using ix_battery_patient on battery t1 (cost=0.00..1697.15 rows=1 width=31) (actual time=241.836..910.580 rows=22 loops=1)' ' Index Cond: (patientidentifier = 611802158)' ' Filter: (observationdate = (subplan))' ' SubPlan' ' -> Aggregate (cost=26.25..26.26 rows=1 width=8) (actual time=9.666..9.667 rows=1 loops=94)' ' -> Bitmap Heap Scan on battery t2 (cost=22.23..26.25 rows=1 width=8) (actual time=9.606..9.620 rows=7 loops=94)' ' Recheck Cond: ((patientidentifier = 611802158) AND ((batterycode)::text = ($0)::text))' ' -> BitmapAnd (cost=22.23..22.23 rows=1 width=0) (actual time=9.596..9.596 rows=0 loops=94)' ' -> Bitmap Index Scan on ix_battery_patient (cost=0.00..2.20 rows=58 width=0) (actual time=0.039..0.039 rows=94 loops=94)' ' Index Cond: (patientidentifier = 611802158)' ' -> Bitmap Index Scan on ix_battery_code (cost=0.00..19.78 rows=2794 width=0) (actual time=9.514..9.514 rows=27323 loops=94)' ' Index Cond: ((batterycode)::text = ($0)::text)' 'Total runtime: 910.897 ms' Basically I am just trying to display the batterycode with its most recent date. Is there a better way to do this query ? thanks Tim Jones Healthcare Project Manager Optio Software, Inc. (770) 576-3555
"Tim Jones" <TJones@optio.com> writes: > I am having a problem with a sub select query being kinda slow. The > query is as follows: > select batterycode, batterydescription, observationdate from Battery t1 > where patientidentifier=611802158 and observationdate = (select > max(observationdate) from Battery t2 where t2.batterycode=t1.batterycode > and patientidentifier=611802158) order by batterydescription. Yeah, this is essentially impossible for the planner to optimize, because it doesn't see any way to de-correlate the subselect, so it does it over again for every row. You might find it works better if you cast the thing as a SELECT DISTINCT ON problem (look at the "weather report" example in the SELECT reference page). regards, tom lane
> -----Original Message----- > From: Tim Jones [mailto:TJones@optio.com] > Sent: Tuesday, May 23, 2006 12:11 AM > To: pgsql-performance@postgresql.org > Subject: [PERFORM] slow query using sub select > > Hi, > I am having a problem with a sub select query being kinda slow. The > query is as follows: > > select batterycode, batterydescription, observationdate from Battery t1 > where patientidentifier=611802158 and observationdate = (select > max(observationdate) from Battery t2 where t2.batterycode=t1.batterycode > and patientidentifier=611802158) order by batterydescription. How about changing it into a standard join: select t1.batterycode, t1.batterydescription, t2.observationdate from Battery t1, (Select batterycode ,max(observationdate) from Battery t2 where patientidentifier=611802158 group by batterycode) AS T2 where t1. batterycode = t2. batterycode Jonathan Blitz AnyKey Limited Israel -- No virus found in this outgoing message. Checked by AVG Free Edition. Version: 7.1.392 / Virus Database: 268.6.1/344 - Release Date: 05/19/2006
that worked like a champ nice call as always! thanks Tim Jones Healthcare Project Manager Optio Software, Inc. (770) 576-3555 -----Original Message----- From: Tom Lane [mailto:tgl@sss.pgh.pa.us] Sent: Monday, May 22, 2006 7:07 PM To: Tim Jones Cc: pgsql-performance@postgresql.org Subject: Re: [PERFORM] slow query using sub select "Tim Jones" <TJones@optio.com> writes: > I am having a problem with a sub select query being kinda slow. The > query is as follows: > select batterycode, batterydescription, observationdate from Battery > t1 where patientidentifier=611802158 and observationdate = (select > max(observationdate) from Battery t2 where > t2.batterycode=t1.batterycode and patientidentifier=611802158) order by batterydescription. Yeah, this is essentially impossible for the planner to optimize, because it doesn't see any way to de-correlate the subselect, so it does it over again for every row. You might find it works better if you cast the thing as a SELECT DISTINCT ON problem (look at the "weather report" example in the SELECT reference page). regards, tom lane