Poor query performance - Mailing list pgsql-performance

From Alex
Subject Poor query performance
Date
Msg-id 7345caea-3915-449a-927b-00db408550bd@x5g2000prf.googlegroups.com
Whole thread Raw
Responses Re: Poor query performance  (Scott Marlowe <scott.marlowe@gmail.com>)
List pgsql-performance
Below is a query that takes 16 seconds on the first run.  I am having
generally poor performance for queries in uncached areas of the data
and often mediocre (500ms-2s+) performance generallly, although
sometimes it's very fast.  All the queries are pretty similar and use
the indexes this way.

I've been trying to tune this thing with little luck.  There are about
1.5M records.  It's using the index properly.  Settings are:
work_mem=20MB, shared_buffers=128MB, effective_cache_size=1024MB.

I have run ANALYZE and VACUUM FULL recently.

The whole database is around 16GB.   The server is an ec2 instance
with 5 compute units in two cores (1 unit is one 2Ghz processor) and
1.7Gb of RAM.

Swapping seems to be minimal.

Note that the ANALYZE is from my slow query logger, so the numbers
don't match the time the uncached query took.

There are 118K rows in this select.  It is possible the sort is the
issue, but that's why I have 20M working memory.  Do I really need
more than that?

Slow query: (16.852746963501) [0] SELECT id FROM "source_listings"
WHERE (post_time BETWEEN '2009-07-02 14:19:29.520886' AND '2009-07-11
14:19:29.520930' AND ((geo_lon BETWEEN 10879358 AND 10909241 AND
geo_lat BETWEEN 13229080 AND 13242719)) AND city = 'boston')  ORDER BY
post_time DESC LIMIT 108 OFFSET 0
Limit  (cost=30396.63..30396.90 rows=108 width=12) (actual
time=1044.575..1044.764 rows=108 loops=1)
  ->  Sort  (cost=30396.63..30401.47 rows=1939 width=12) (actual
time=1044.573..1044.630 rows=108 loops=1)
        Sort Key: post_time
        Sort Method:  top-N heapsort  Memory: 21kB
        ->  Bitmap Heap Scan on source_listings
(cost=23080.81..30321.44 rows=1939 width=12) (actual
time=321.111..952.704 rows=118212 loops=1)
              Recheck Cond: ((city = 'boston'::text) AND (post_time >=
'2009-07-02 14:19:29.520886'::timestamp without time zone) AND
(post_time <= '2009-07-11 14:19:29.52093'::timestamp without time
zone) AND (geo_lat >= 13229080) AND (geo_lat <= 13242719) AND (geo_lon
>= 10879358) AND (geo_lon <= 10909241))
              ->  Bitmap Index Scan on sl_city_etc
(cost=0.00..23080.33 rows=1939 width=0) (actual time=309.007..309.007
rows=118212 loops=1)
                    Index Cond: ((city = 'boston'::text) AND
(post_time >= '2009-07-02 14:19:29.520886'::timestamp without time
zone) AND (post_time <= '2009-07-11 14:19:29.52093'::timestamp without
time zone) AND (geo_lat >= 13229080) AND (geo_lat <= 13242719) AND
(geo_lon >= 10879358) AND (geo_lon <= 10909241))
Total runtime: 1045.683 ms



Even without the sort performance is poor:

cribq=# EXPLAIN ANALYZE SELECT count(id) FROM "source_listings" WHERE
(post_time BETWEEN '2009-07-02 14:19:29.520886' AND '2009-07-11
14:19:29.520930' AND ((geo_lon BETWEEN 10879358 AND 10909241 AND
geo_lat BETWEEN 13229080 AND 13242719)) AND city = 'boston');

QUERY
PLAN

----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=30326.29..30326.30 rows=1 width=4) (actual
time=847.967..847.968 rows=1 loops=1)
   ->  Bitmap Heap Scan on source_listings  (cost=23080.81..30321.44
rows=1939 width=4) (actual time=219.505..769.878 rows=118212 loops=1)
         Recheck Cond: ((city = 'boston'::text) AND (post_time >=
'2009-07-02 14:19:29.520886'::timestamp without time zone) AND
(post_time <= '2009-07-11 14:19:29.52093'::timestamp without time
zone) AND (geo_lat >= 13229080) AND (geo_lat <= 13242719) AND (geo_lon
>= 10879358) AND (geo_lon <= 10909241))
         ->  Bitmap Index Scan on sl_city_etc  (cost=0.00..23080.33
rows=1939 width=0) (actual time=206.981..206.981 rows=118212 loops=1)
               Index Cond: ((city = 'boston'::text) AND (post_time >=
'2009-07-02 14:19:29.520886'::timestamp without time zone) AND
(post_time <= '2009-07-11 14:19:29.52093'::timestamp without time
zone) AND (geo_lat >= 13229080) AND (geo_lat <= 13242719) AND (geo_lon
>= 10879358) AND (geo_lon <= 10909241))
 Total runtime: 848.816 ms

pgsql-performance by date:

Previous
From: Alex
Date:
Subject: Re: Poor query performance
Next
From: Suvankar Roy
Date:
Subject: Performance comparison between Postgres and Greenplum