On Aug 16, 2006, at 3:51 PM, Tom Lane wrote:
>> /* Select all sheep who's most recent transfer was into the
>> subject flock */
>> SELECT DISTINCT f1.regn_no, f1.transfer_date as date_in
>> FROM SHEEP_FLOCK f1 JOIN
>> /* The last transfer date for each sheep */
>> (SELECT f.regn_no, MAX(f.transfer_date) as last_xfer_date
>> FROM SHEEP_FLOCK f
>> GROUP BY f.regn_no) f2
>> ON f1.regn_no = f2.regn_no
>> WHERE f1.flock_no = '1359'
>> AND f1.transfer_date = f2.last_xfer_date
>
> This seems pretty closely related to this recent thread:
> http://archives.postgresql.org/pgsql-performance/2006-08/msg00220.php
> in which the OP is doing a very similar kind of query in almost
> exactly
> the same way.
>
> I can't help thinking that there's probably a better way to phrase
> this
> type of query in SQL, though it's not jumping out at me what that is.
I don't know about better, but I tend to phrase these in a quite
different way that's (hopefully) equivalent:
select latest.regn_no,
latest.transfer_date as date_in
from sheep_flock latest
where not exists (
select 'x'
from sheep_flock even_later
where latest.regn_no = even_later.regn_no
and latest.transfer_date < even_later.transfer_date)
and latest.flock_no = '1359'
There's no MAX() or DISTINCT here, so maybe this is easier to optimize?
--
Scott Lamb <http://www.slamb.org/>