Help optimizing a slow index scan - Mailing list pgsql-performance

From: Dan Harris
Subject: Help optimizing a slow index scan
Date:
Msg-id: 4419DB50.10404@drivefaster.net
List: pgsql-performance
explain analyze
select distinct eventmain.incidentid, eventmain.entrydate,
eventgeo.long, eventgeo.lat, eventgeo.geox, eventgeo.geoy
from eventmain, eventgeo
where
    eventmain.incidentid = eventgeo.incidentid and
    ( long > -104.998027962962 and long < -104.985957781349 ) and
    ( lat > 39.7075542720006 and lat < 39.7186195832938 ) and
    eventmain.entrydate > '2006-1-1 00:00' and
    eventmain.entrydate <= '2006-3-17 00:00'
order by
    eventmain.entrydate;

                                                                 QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------
 Unique  (cost=121313.81..121330.72 rows=451 width=178) (actual time=723719.761..723726.875 rows=1408 loops=1)
   ->  Sort  (cost=121313.81..121314.94 rows=451 width=178) (actual time=723719.755..723721.807 rows=1408 loops=1)
         Sort Key: eventmain.entrydate, eventmain.disposition, eventmain.incidentid, eventgeo.reportingarea, eventgeo.beatid, eventmain.finaltype, eventmain.casenumber, eventgeo.eventlocation, eventmain.insertdate, eventmain.priority, eventgeo.long, eventgeo.lat, eventgeo.geox, eventgeo.geoy
         ->  Nested Loop  (cost=0.00..121293.93 rows=451 width=178) (actual time=1916.230..723712.900 rows=1408 loops=1)
               ->  Index Scan using eventgeo_lat_idx on eventgeo  (cost=0.00..85488.05 rows=10149 width=76) (actual time=0.402..393376.129 rows=22937 loops=1)
                     Index Cond: ((lat > 39.7075542720006::double precision) AND (lat < 39.7186195832938::double precision))
                     Filter: ((long > -104.998027962962::double precision) AND (long < -104.985957781349::double precision))
               ->  Index Scan using eventmain_incidentid_idx on eventmain  (cost=0.00..3.52 rows=1 width=119) (actual time=14.384..14.392 rows=0 loops=22937)
                     Index Cond: ((eventmain.incidentid)::text = ("outer".incidentid)::text)
                     Filter: ((entrydate > '2006-01-01 00:00:00'::timestamp without time zone) AND (entrydate <= '2006-03-17 00:00:00'::timestamp without time zone))

 Total runtime:  >>> 723729.238 ms(!) <<<



I'm trying to figure out why it's consuming so much time on the index
scan for eventgeo_lat_idx.  Also, I have an index on "long" that the
planner does not appear to find helpful.
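
For reference, the relevant indexes are roughly the following ( reconstructed from the plan above; the name of the index on "long" is a guess ):

create index eventgeo_lat_idx on eventgeo (lat);
create index eventgeo_long_idx on eventgeo (long);  -- name assumed; this is the index the planner ignores

One experiment I'm considering is a composite index, so that both range conditions could be evaluated as index conditions instead of leaving "long" as a filter, e.g.:

create index eventgeo_lat_long_idx on eventgeo (lat, long);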

There are 3.3 million records in eventmain and eventgeo.  The server has
a reasonably fast RAID10 setup with 16x 15k RPM drives and 12GB of RAM
( 11GB listed as "cache" by vmstat ).  It's running PostgreSQL 8.0.2 on
Linux kernel 2.6.12.

I have just VACUUM ANALYZEd both tables, rebuilt the eventgeo_lat_idx
index, and rerun the query multiple times to see whether caching helped
( it didn't help much ).  The server seems fine when querying on other
fields from this table, but filtering on "long" and "lat" seems to drag
it down significantly.
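
For completeness, the maintenance commands were roughly these ( a sketch; REINDEX shown as one way to rebuild the index, dropping and recreating it amounts to the same thing ):

vacuum analyze eventmain;
vacuum analyze eventgeo;
reindex index eventgeo_lat_idx;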

  Is it because there are such slight differences between the records,
since they are all within a few hundredths of a degree of each other?
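
In case it's relevant, the column statistics can be checked with something like this ( a quick sketch against pg_stats; n_distinct and correlation are what I'd look at to see how distinct the values are and how closely they follow on-disk order ):

select attname, n_distinct, correlation
from pg_stats
where tablename = 'eventgeo'
    and attname in ('lat', 'long');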

Thanks for your time and ideas.

-Dan
