Chad,
If you take a look at the Explain content, you'll see where the slow-down is:
> Limit (cost=96929.63..97822.45 rows=35713 width=36) (actual
time=42477.82..43491.69 rows=100000 loops=1)
> -> Unique (cost=96929.63..97822.46 rows=35713 width=36) (actual
time=42477.82..43325.87 rows=100001 loops=1)
> -> Sort (cost=96929.63..96929.63 rows=357132 width=36) (actual
time=42477.80..42735.18 rows=102151 loops=1)
These three lines cover selecting the unique values, sorting, and limiting
the result set. This is where most of your computing time goes; note the
"actual time=42477".
> -> Hash Join (cost=9.59..41831.14 rows=357132 width=36)
(actual time=25.29..16456.26 rows=352194 loops=1)
> -> Seq Scan on lists l (cost=0.00..32881.18
rows=895244 width=29) (actual time=0.10..11342.50 rows=878098 loops=1)
> -> Hash (cost=9.22..9.22 rows=148 width=7) (actual
time=23.80..23.80 rows=0 loops=1)
> -> Seq Scan on timezone tz (cost=0.00..9.22
rows=148 width=7) (actual time=21.72..23.45 rows=148 loops=1)
This is your join to the area codes. It's ignoring the indexes because the
number of records in timezone is so small compared to the number in lists.
That isn't a problem, though, because as you can see the join itself is cheap
relative to the sort, starting up in only a few milliseconds.
> Total runtime: 46247.79 msec
The way I read this, 95% of the time is being spent on the DISTINCT. Tom, am
I reading this right?
Try:
1) Indexing lists.full_phone.
2) Checking the speed without the DISTINCT as a benchmark.
3) Increasing the amount of memory available to your queries by altering the
postgresql.conf settings, and possibly adding more RAM or improving your disk
access speed.
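As a rough sketch of (1) and (3) -- the index name is just illustrative, and
this assumes an older PostgreSQL where the per-sort memory setting is
sort_mem (it was renamed work_mem in 8.0), set in kilobytes:

```sql
-- (1) Index the phone column; pick whatever name fits your conventions.
CREATE INDEX idx_lists_full_phone ON lists (full_phone);

-- (3) Give sorts more memory for this session (here ~32 MB), then
-- re-run EXPLAIN ANALYZE to see whether the Sort step improves.
SET sort_mem = 32768;
```

If the session-level SET helps, you can make it permanent in
postgresql.conf, keeping in mind each concurrent sort can use that much.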
--
-Josh Berkus
Aglio Database Solutions
San Francisco