performance problem with LIMIT (order BY in DESC order). Wrong index used? - Mailing list pgsql-performance

From Dieter Rehbein
Subject performance problem with LIMIT (order BY in DESC order). Wrong index used?
Date
Msg-id 152FFD9B-A288-4329-9C96-0FF0C6A2D418@skiline.cc
Whole thread Raw
Responses Re: performance problem with LIMIT (order BY in DESC order). Wrong index used?  (Claudio Freire <klaussfreire@gmail.com>)
List pgsql-performance
Hi everybody,

I have a performance-problem with a query using a LIMIT. There are other threads rergading performance issues with
LIMIT,but I didn't find useful hints for our problem and it might 
be interesting for other postgres-users.


There are only 2 simple tables:

CREATE TABLE newsfeed
(
id             varchar(32) PRIMARY KEY,
version        int4        NOT NULL,
newsfeed_type  varchar(20) NOT NULL,
new_item_count int4        NOT NULL
);
CREATE  INDEX IDX_NEWSFEED_TYPE ON newsfeed (newsfeed_type);


CREATE TABLE newsfeed_item
(
id            varchar(32)   PRIMARY NOT NULL,
item_type     varchar(35)   NOT NULL,
version       int4          NOT NULL,
category      varchar(25)   NULL,
data1         bytea         NULL,
data2         bytea         NULL,
date_time     timestamp     NOT NULL,
guid1         varchar(32)   NULL,
guid2         varchar(32)   NULL,
guid3         varchar(32)   NULL,
id1           int8          NULL,
id2           int8          NULL,
long_value1   int8          NULL,
long_value2   int8          NULL,
long_value3   int8          NULL,
string_value1 varchar(4000) NULL,
string_value2 varchar(500)  NULL,
string_value3 varchar(500)  NULL,
string_value4 varchar(500)  NULL,
string_value5 varchar(500)  NULL,
string_value6 varchar(500)  NULL,
newsfeed      varchar(32)   NOT NULL
);
CREATE UNIQUE INDEX newsfeed_item_pkey ON newsfeed_item (id);
CREATE  INDEX idx_nfi_guid1 ON newsfeed_item (guid1);
CREATE  INDEX idx_nfi_guid2 ON newsfeed_item (guid2);
CREATE  INDEX idx_nfi_guid3 ON newsfeed_item (guid3);
CREATE  INDEX idx_nfi_id1 ON newsfeed_item (id1);
CREATE  INDEX idx_nfi_id2 ON newsfeed_item (id2);
CREATE  INDEX idx_nfi_newsfeed ON newsfeed_item (newsfeed);
CREATE  INDEX idx_nfi_type ON newsfeed_item (item_type);
CREATE  INDEX idx_nfi_datetime ON newsfeed_item (date_time);

newsfeed contains 457036 rows
newsweed_item contains 5169727 rows

postgres version: 9.0.2
OS: CentOS release 5.5 (Final)


The following query took 4.2 seconds:

-------------------------
select *
from newsfeed_item
where newsfeed in
          (
          '173ee4dcec0d11de9f4f12313c0018c1','10dabde0f70211df816612313b02054e',

'17841c9af70211df874b12313b02054e','1783fce2f70211df814412313b02054e','1783fdd2f70211df8c1d12313b02054e','178405a2f70211df829212313b02054e',

'178440c6f70211df97c812313b02054e','178416e6f70211dfac3412313b02054e','1783e4aaf70211df9acd12313b02054e','178437e8f70211df8b8512313b02054e',
          '1783f54ef70211df81e012313b02054e','178415c4f70211df8f8112313b02054e'
          )
order by date_time desc

limit 25
-------------------------

If the LIMIT was removed, the query took 60 milliseconds!  If the sorting order was changed to ASC, the query took
44ms,even with the LIMIT. 

Then I tried to create the index on date_time in DESC order (because the result is sorted in descending order), but
thatdid not change anything.   


Then I removed the index on date_time with the following results:

query with the limit:     40 ms
query without the limit:  60 ms

=> the optimizer seems to use a wrong index (I did perform an ANALYZE on newsfeed_item and a REINDEX before I did the
test).Since I currently don't need  
the index on date_time (but will need it in the near future), I removed the index on date_time, which is ok for now.

------------------------

here are the explain analyze results:

1) the query in descending order with the limit and index on date_time (the slow one):

Limit  (cost=0.00..980.09 rows=25 width=963) (actual time=48.592..4060.779 rows=25 loops=1)
  ->  Index Scan Backward using "IDX_NFI_DATETIME" on newsfeed_item  (cost=0.00..409365.16 rows=10442 width=963)
(actualtime=48.581..4060.542 rows=25 loops=1) 
        Filter: ((newsfeed)::text = ANY
('{173ee4dcec0d11de9f4f12313c0018c1,10dabde0f70211df816612313b02054e,17841c9af70211df874b12313b02054e,1783fce2f70211df814412313b02054e,1783fdd2f70211df8c1d12313b02054e,178405a2f70211df829212313b02054e,178440c6f70211df97c812313b02054e,178416e6f70211dfac3412313b02054e,1783e4aaf70211df9acd12313b02054e,178437e8f70211df8b8512313b02054e,1783f54ef70211df81e012313b02054e,178415c4f70211df8f8112313b02054e}'::text[]))
Total runtime: 4060.959 ms


2) the query in descending order without the limit (which is much faster):

Sort  (cost=39575.23..39601.33 rows=10442 width=963) (actual time=15.014..17.038 rows=477 loops=1)
  Sort Key: date_time
  Sort Method:  quicksort  Memory: 287kB
  ->  Bitmap Heap Scan on newsfeed_item  (cost=421.41..34450.72 rows=10442 width=963) (actual time=0.644..12.601
rows=477loops=1) 
        Recheck Cond: ((newsfeed)::text = ANY
('{173ee4dcec0d11de9f4f12313c0018c1,10dabde0f70211df816612313b02054e,17841c9af70211df874b12313b02054e,1783fce2f70211df814412313b02054e,1783fdd2f70211df8c1d12313b02054e,178405a2f70211df829212313b02054e,178440c6f70211df97c812313b02054e,178416e6f70211dfac3412313b02054e,1783e4aaf70211df9acd12313b02054e,178437e8f70211df8b8512313b02054e,1783f54ef70211df81e012313b02054e,178415c4f70211df8f8112313b02054e}'::text[]))
        ->  Bitmap Index Scan on idx_nfi_newsfeed  (cost=0.00..418.80 rows=10442 width=0) (actual time=0.555..0.555
rows=477loops=1) 
              Index Cond: ((newsfeed)::text = ANY
('{173ee4dcec0d11de9f4f12313c0018c1,10dabde0f70211df816612313b02054e,17841c9af70211df874b12313b02054e,1783fce2f70211df814412313b02054e,1783fdd2f70211df8c1d12313b02054e,178405a2f70211df829212313b02054e,178440c6f70211df97c812313b02054e,178416e6f70211dfac3412313b02054e,1783e4aaf70211df9acd12313b02054e,178437e8f70211df8b8512313b02054e,1783f54ef70211df81e012313b02054e,178415c4f70211df8f8112313b02054e}'::text[]))
Total runtime: 19.065 ms

3) the query in ascending order with the limit (which is fast):

Limit  (cost=0.00..980.09 rows=25 width=963) (actual time=0.261..3.704 rows=25 loops=1)
  ->  Index Scan using "IDX_NFI_DATETIME" on newsfeed_item  (cost=0.00..409365.16 rows=10442 width=963) (actual
time=0.250..3.495rows=25 loops=1) 
        Filter: ((newsfeed)::text = ANY
('{173ee4dcec0d11de9f4f12313c0018c1,10dabde0f70211df816612313b02054e,17841c9af70211df874b12313b02054e,1783fce2f70211df814412313b02054e,1783fdd2f70211df8c1d12313b02054e,178405a2f70211df829212313b02054e,178440c6f70211df97c812313b02054e,178416e6f70211dfac3412313b02054e,1783e4aaf70211df9acd12313b02054e,178437e8f70211df8b8512313b02054e,1783f54ef70211df81e012313b02054e,178415c4f70211df8f8112313b02054e}'::text[]))
Total runtime: 3.854 ms


4) The query after removing the index on date_time, in descending order with the LIMIT (which is fast as well).

Limit  (cost=34745.39..34745.45 rows=25 width=963) (actual time=12.855..13.143 rows=25 loops=1)
  ->  Sort  (cost=34745.39..34771.49 rows=10442 width=963) (actual time=12.846..12.946 rows=25 loops=1)
        Sort Key: date_time
        Sort Method:  top-N heapsort  Memory: 40kB
        ->  Bitmap Heap Scan on newsfeed_item  (cost=421.41..34450.72 rows=10442 width=963) (actual time=0.622..9.936
rows=477loops=1) 
              Recheck Cond: ((newsfeed)::text = ANY
('{173ee4dcec0d11de9f4f12313c0018c1,10dabde0f70211df816612313b02054e,17841c9af70211df874b12313b02054e,1783fce2f70211df814412313b02054e,1783fdd2f70211df8c1d12313b02054e,178405a2f70211df829212313b02054e,178440c6f70211df97c812313b02054e,178416e6f70211dfac3412313b02054e,1783e4aaf70211df9acd12313b02054e,178437e8f70211df8b8512313b02054e,1783f54ef70211df81e012313b02054e,178415c4f70211df8f8112313b02054e}'::text[]))
              ->  Bitmap Index Scan on idx_nfi_newsfeed  (cost=0.00..418.80 rows=10442 width=0) (actual
time=0.543..0.543rows=477 loops=1) 
                    Index Cond: ((newsfeed)::text = ANY
('{173ee4dcec0d11de9f4f12313c0018c1,10dabde0f70211df816612313b02054e,17841c9af70211df874b12313b02054e,1783fce2f70211df814412313b02054e,1783fdd2f70211df8c1d12313b02054e,178405a2f70211df829212313b02054e,178440c6f70211df97c812313b02054e,178416e6f70211dfac3412313b02054e,1783e4aaf70211df9acd12313b02054e,178437e8f70211df8b8512313b02054e,1783f54ef70211df81e012313b02054e,178415c4f70211df8f8112313b02054e}'::text[]))

Total runtime: 13.318 ms

Is there anything I can do to add the index on date_time without the performance problem?

regards
Dieter


pgsql-performance by date:

Previous
From: Jesper Krogh
Date:
Subject: Re: Linux: more cores = less concurrency.
Next
From: Scott Marlowe
Date:
Subject: Re: Linux: more cores = less concurrency.