Thread: slow query using sub select

slow query using sub select

From
"Tim Jones"
Date:
Hi,
  I am having a problem with a sub select query being kinda slow.  The
query is as follows:

select batterycode, batterydescription, observationdate from Battery t1
where patientidentifier=611802158 and observationdate = (select
max(observationdate) from Battery t2 where t2.batterycode=t1.batterycode
and patientidentifier=611802158) order by batterydescription.

explain analyze:


'Sort (cost=1697.16..1697.16 rows=1 width=31) (actual
time=910.721..910.729 rows=22 loops=1)'
' Sort Key: batterydescription'
' -> Index Scan using ix_battery_patient on battery t1
(cost=0.00..1697.15 rows=1 width=31) (actual time=241.836..910.580
rows=22 loops=1)'
' Index Cond: (patientidentifier = 611802158)'
' Filter: (observationdate = (subplan))'
' SubPlan'
' -> Aggregate (cost=26.25..26.26 rows=1 width=8) (actual
time=9.666..9.667 rows=1 loops=94)'
' -> Bitmap Heap Scan on battery t2 (cost=22.23..26.25 rows=1 width=8)
(actual time=9.606..9.620 rows=7 loops=94)'
' Recheck Cond: ((patientidentifier = 611802158) AND
((batterycode)::text = ($0)::text))'
' -> BitmapAnd (cost=22.23..22.23 rows=1 width=0) (actual
time=9.596..9.596 rows=0 loops=94)'
' -> Bitmap Index Scan on ix_battery_patient (cost=0.00..2.20 rows=58
width=0) (actual time=0.039..0.039 rows=94 loops=94)'
' Index Cond: (patientidentifier = 611802158)'
' -> Bitmap Index Scan on ix_battery_code (cost=0.00..19.78 rows=2794
width=0) (actual time=9.514..9.514 rows=27323 loops=94)'
' Index Cond: ((batterycode)::text = ($0)::text)'
'Total runtime: 910.897 ms'

Basically I am just trying to display the batterycode with its most
recent date.  Is there a better way to do this query ?

thanks


Tim Jones
Healthcare Project Manager
Optio Software, Inc.
(770) 576-3555

Re: slow query using sub select

From
Tom Lane
Date:
"Tim Jones" <TJones@optio.com> writes:
>   I am having a problem with a sub select query being kinda slow.  The
> query is as follows:

> select batterycode, batterydescription, observationdate from Battery t1
> where patientidentifier=611802158 and observationdate = (select
> max(observationdate) from Battery t2 where t2.batterycode=t1.batterycode
> and patientidentifier=611802158) order by batterydescription.

Yeah, this is essentially impossible for the planner to optimize,
because it doesn't see any way to de-correlate the subselect, so it does
it over again for every row.  You might find it works better if you cast
the thing as a SELECT DISTINCT ON problem (look at the "weather report"
example in the SELECT reference page).

            regards, tom lane

Re: slow query using sub select

From
"Jonathan Blitz"
Date:

> -----Original Message-----
> From: Tim Jones [mailto:TJones@optio.com]
> Sent: Tuesday, May 23, 2006 12:11 AM
> To: pgsql-performance@postgresql.org
> Subject: [PERFORM] slow query using sub select
>
> Hi,
>   I am having a problem with a sub select query being kinda slow.  The
> query is as follows:
>
> select batterycode, batterydescription, observationdate from Battery t1
> where patientidentifier=611802158 and observationdate = (select
> max(observationdate) from Battery t2 where t2.batterycode=t1.batterycode
> and patientidentifier=611802158) order by batterydescription.


How about changing it into a standard join:


select t1.batterycode, t1.batterydescription, t2.observationdate
from Battery t1,
(Select batterycode ,max(observationdate) from Battery t2 where
patientidentifier=611802158 group by batterycode) AS T2
where t1. batterycode = t2. batterycode

Jonathan Blitz
AnyKey Limited
Israel

--
No virus found in this outgoing message.
Checked by AVG Free Edition.
Version: 7.1.392 / Virus Database: 268.6.1/344 - Release Date: 05/19/2006



Re: slow query using sub select

From
"Tim Jones"
Date:
that worked like a champ nice call as always!

thanks

Tim Jones
Healthcare Project Manager
Optio Software, Inc.
(770) 576-3555

-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: Monday, May 22, 2006 7:07 PM
To: Tim Jones
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] slow query using sub select

"Tim Jones" <TJones@optio.com> writes:
>   I am having a problem with a sub select query being kinda slow.  The

> query is as follows:

> select batterycode, batterydescription, observationdate from Battery
> t1 where patientidentifier=611802158 and observationdate = (select
> max(observationdate) from Battery t2 where
> t2.batterycode=t1.batterycode and patientidentifier=611802158) order
by batterydescription.

Yeah, this is essentially impossible for the planner to optimize,
because it doesn't see any way to de-correlate the subselect, so it does
it over again for every row.  You might find it works better if you cast
the thing as a SELECT DISTINCT ON problem (look at the "weather report"
example in the SELECT reference page).

            regards, tom lane