I took John's advice and tried to work with sub-selects. I tried this
variation, which actually seems like it would make a difference conceptually
since it drives on the person table quickly. But to my surprise, the query
runs at about 375 milliseconds. I think it's because it's going over that
shipment table multiple times, which is where the results are coming from.
select s.*, ss.*
from shipment s
inner join shipment_status ss on s.current_status_id=ss.id
inner join release_code rc on ss.release_code_id=rc.id
left outer join driver d on s.driver_id=d.id
left outer join carrier_code cc on s.carrier_code_id=cc.id
where s.id in (
select s.id
from person p
inner join carrier_to_person ctp on p.id=ctp.person_id
inner join carrier c on ctp.carrier_id=c.id
inner join carrier_code cc on cc.carrier_id = c.id
inner join shipment s on s.carrier_code_id = cc.id
where p.id = 355
and s.current_status_id is not null
and s.is_purged=false
)
and(rc.number='9' )
and(ss.date>=current_date-31 )
order by ss.date desc
*** Musing 1
Also, "s.current_status_id is not null" is an important filter that I forgot
to mention. In this example where p.id = 355, it filters out 90% of the
rows. In general, that filter ratio is 0.46 though, which is not quite so
high. However, this filter gets better over time because more and more
users will use a filter that will make this value null. It's still not as
strong as person though and probably never will be. But I thought I'd
mention it nonetheless.
*** Musing 2
I do think that the filter "ss.date>=current_date-31" is slowing this query
down. I don't think it's the mention of "current_date" or even that it's
dynamic instead of static. I think the range is just too big. For example,
if I use:
and ss.date between '2005-02-01 00:00:00' and '2005-02-28 23:59:59'
The query still results in 250 milliseconds. But if I make the range very
small - say Feb 22nd of 2005:
and ss.date between '2005-02-22 00:00:00' and '2005-02-22 23:59:59'
Now the entire query runs in 47 milliseconds on average. If I can't make
this query perform any better, should I change the user interface to select
the date instead of showing the last 31 days to benefit from this single-day
filter? This causes more clicks to select the day (like from a calendar),
but most users probably aren't interested in seeing the entire listing
anyway. However, it's a very important requirement that users know that
shipment enteries exist in the last 31 days (because they are usually
sure-fire problems if they are still in this query after a few days).
I guess I'm wondering if tuning the query is futile and I should get the
requirements changed, or is there something I can do to really speed it up?
Thanks again,
Ken