Thread: Poor query performance

Poor query performance

From: Alex
Date:
Below is a query that takes 16 seconds on the first run.  I am seeing
generally poor performance for queries in uncached areas of the data,
and often mediocre (500ms-2s+) performance in general, although
sometimes it's very fast.  All the queries are pretty similar and use
the indexes in this way.

I've been trying to tune this thing with little luck.  There are about
1.5M records.  It's using the index properly.  Settings are:
work_mem=20MB, shared_buffers=128MB, effective_cache_size=1024MB.

I have run ANALYZE and VACUUM FULL recently.

The whole database is around 16GB.  The server is an EC2 instance
with 5 compute units in two cores (1 unit is one 2GHz processor) and
1.7GB of RAM.

Swapping seems to be minimal.

Note that the EXPLAIN ANALYZE output below is from my slow query logger, so
the numbers don't match the time the original uncached run took.

There are 118K rows in this select.  It is possible the sort is the
issue, but that's why I have work_mem set to 20MB.  Do I really need
more than that?

Slow query: (16.852746963501) [0]
SELECT id FROM "source_listings"
WHERE (post_time BETWEEN '2009-07-02 14:19:29.520886' AND '2009-07-11 14:19:29.520930'
  AND ((geo_lon BETWEEN 10879358 AND 10909241 AND geo_lat BETWEEN 13229080 AND 13242719))
  AND city = 'boston')
ORDER BY post_time DESC LIMIT 108 OFFSET 0

Limit  (cost=30396.63..30396.90 rows=108 width=12) (actual time=1044.575..1044.764 rows=108 loops=1)
  ->  Sort  (cost=30396.63..30401.47 rows=1939 width=12) (actual time=1044.573..1044.630 rows=108 loops=1)
        Sort Key: post_time
        Sort Method:  top-N heapsort  Memory: 21kB
        ->  Bitmap Heap Scan on source_listings  (cost=23080.81..30321.44 rows=1939 width=12) (actual time=321.111..952.704 rows=118212 loops=1)
              Recheck Cond: ((city = 'boston'::text) AND (post_time >= '2009-07-02 14:19:29.520886'::timestamp without time zone) AND (post_time <= '2009-07-11 14:19:29.52093'::timestamp without time zone) AND (geo_lat >= 13229080) AND (geo_lat <= 13242719) AND (geo_lon >= 10879358) AND (geo_lon <= 10909241))
              ->  Bitmap Index Scan on sl_city_etc  (cost=0.00..23080.33 rows=1939 width=0) (actual time=309.007..309.007 rows=118212 loops=1)
                    Index Cond: ((city = 'boston'::text) AND (post_time >= '2009-07-02 14:19:29.520886'::timestamp without time zone) AND (post_time <= '2009-07-11 14:19:29.52093'::timestamp without time zone) AND (geo_lat >= 13229080) AND (geo_lat <= 13242719) AND (geo_lon >= 10879358) AND (geo_lon <= 10909241))
Total runtime: 1045.683 ms
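
(An aside on the work_mem question: the Sort Method line above shows a
top-N heapsort using only 21kB, which suggests the sort itself is not
memory-constrained at 20MB.  One way to confirm is to raise work_mem for a
single session and re-run the query; a minimal sketch, using an arbitrary
64MB value:

SET work_mem = '64MB';   -- session-local override, just for this test

EXPLAIN ANALYZE
SELECT id FROM "source_listings"
WHERE (post_time BETWEEN '2009-07-02 14:19:29.520886' AND '2009-07-11 14:19:29.520930'
  AND ((geo_lon BETWEEN 10879358 AND 10909241 AND geo_lat BETWEEN 13229080 AND 13242719))
  AND city = 'boston')
ORDER BY post_time DESC LIMIT 108 OFFSET 0;

RESET work_mem;          -- back to the configured 20MB

If the plan and timing barely change, most of the time is going into the
bitmap heap scan over ~118K rows (952 ms of the 1045 ms above) rather than
the sort.)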



Even without the sort, performance is poor:

cribq=# EXPLAIN ANALYZE SELECT count(id) FROM "source_listings"
WHERE (post_time BETWEEN '2009-07-02 14:19:29.520886' AND '2009-07-11 14:19:29.520930'
  AND ((geo_lon BETWEEN 10879358 AND 10909241 AND geo_lat BETWEEN 13229080 AND 13242719))
  AND city = 'boston');

                                  QUERY PLAN
--------------------------------------------------------------------------------
 Aggregate  (cost=30326.29..30326.30 rows=1 width=4) (actual time=847.967..847.968 rows=1 loops=1)
   ->  Bitmap Heap Scan on source_listings  (cost=23080.81..30321.44 rows=1939 width=4) (actual time=219.505..769.878 rows=118212 loops=1)
         Recheck Cond: ((city = 'boston'::text) AND (post_time >= '2009-07-02 14:19:29.520886'::timestamp without time zone) AND (post_time <= '2009-07-11 14:19:29.52093'::timestamp without time zone) AND (geo_lat >= 13229080) AND (geo_lat <= 13242719) AND (geo_lon >= 10879358) AND (geo_lon <= 10909241))
         ->  Bitmap Index Scan on sl_city_etc  (cost=0.00..23080.33 rows=1939 width=0) (actual time=206.981..206.981 rows=118212 loops=1)
               Index Cond: ((city = 'boston'::text) AND (post_time >= '2009-07-02 14:19:29.520886'::timestamp without time zone) AND (post_time <= '2009-07-11 14:19:29.52093'::timestamp without time zone) AND (geo_lat >= 13229080) AND (geo_lat <= 13242719) AND (geo_lon >= 10879358) AND (geo_lon <= 10909241))
 Total runtime: 848.816 ms

Re: Poor query performance

From: Scott Marlowe
Date:
On Thu, Jul 9, 2009 at 3:34 PM, Alex<alex@liivid.com> wrote:
> Below is a query that takes 16 seconds on the first run.  I am seeing
> generally poor performance for queries in uncached areas of the data,
> and often mediocre (500ms-2s+) performance in general, although
> sometimes it's very fast.  All the queries are pretty similar and use
> the indexes in this way.
>
> I've been trying to tune this thing with little luck.  There are about
> 1.5M records.  It's using the index properly.  Settings are:
> work_mem=20MB, shared_buffers=128MB, effective_cache_size=1024MB.
>
> I have run ANALYZE and VACUUM FULL recently.

Note that VACUUM FULL can bloat indexes; it's a good idea to REINDEX after
a VACUUM FULL.
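
A minimal sketch, using the table and index names from the plan above
(note that REINDEX locks out writes to the table while it runs):

REINDEX INDEX sl_city_etc;        -- rebuild just the index used in the plan
REINDEX TABLE source_listings;    -- or rebuild every index on the table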

Trigger on column

From: "ramasubramanian"
Date:
Dear all,
    Can we create a trigger on a particular column of a table?

Regards,
Ram


Re: Trigger on column

From: "A. Kretschmer"
Date:
In response to ramasubramanian :
> Dear all,
>    Can we create a trigger on a particular column of a table?

No, but you can compare OLD.column and NEW.column and return from the
function if NEW.column = OLD.column.
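
A minimal sketch of that approach, using a hypothetical table t with a
column of interest named col (all names here are placeholders):

CREATE OR REPLACE FUNCTION on_col_change() RETURNS trigger AS $$
BEGIN
    -- Skip the work when the column of interest did not change.
    -- (Use IS NOT DISTINCT FROM instead of = if col can be NULL.)
    IF NEW.col = OLD.col THEN
        RETURN NEW;
    END IF;

    -- ... the real per-column work goes here ...
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER t_col_change
    BEFORE UPDATE ON t
    FOR EACH ROW EXECUTE PROCEDURE on_col_change();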


Andreas
--
Andreas Kretschmer
Contact:  Heynitz: 035242/47150,   D1: 0160/7141639 (more: see header)
GnuPG-ID:   0x3FFF606C, private 0x7F4584DA   http://wwwkeys.de.pgp.net

Performance of query or procedure going down when we are taking the backup

From: "ramasubramanian"
Date:
    Dear all,
        The performance of a query or procedure goes down while we are taking a
backup of that schema (which is expected), but how can we increase the performance?

Regards,
Ram.



Re: Performance of query or procedure going down when we are taking the backup

From: Robert
Date:

On Mon, Jul 20, 2009 at 6:15 AM,
ramasubramanian<ramasubramanian.g@renaissance-it.com> wrote:
>   Dear all,
>       The performance of a query or procedure goes down while we are taking a
> backup of that schema (which is expected), but how can we increase the performance?
>
> Regards,
> Ram.

You're going to need to provide an awful lot more detail than this if
you want to have much chance of getting a useful answer, I think.  At
a high level, you want to find out which part of your system is the
bottleneck and then look for ways to remove the bottleneck.
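
For example, while the backup is running you can look at what the active
sessions are doing and whether anything is stuck waiting on locks; a
minimal sketch, assuming you can connect with psql:

SELECT * FROM pg_stat_activity;            -- what each backend is running
SELECT * FROM pg_locks WHERE NOT granted;  -- any ungranted lock requests

OS-level tools such as vmstat or iostat will show whether CPU or disk I/O
is the saturated resource.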

...Robert