Thread: Simple select, but takes long time
Hi,
I am having a really hard time trying to figure out why my simple count(*) query is taking so long. I have a table with 1,296,070 rows in it. There are 2 different types of information that each row has that I care about:
status : character(1)
source_id : bigint
Then, I have the following index on the table:
“this_index” (status, source_id, <another_column>)
Now when I do the following select, it takes a really long time:
stingray_4_4_d=# explain analyze select count(*) from listings where insert_status = '1' and data_source_id = 52;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=15222.83..15222.84 rows=1 width=0) (actual time=87050.129..87050.130 rows=1 loops=1)
-> Index Scan using listing_status_idx on listings (cost=0.00..15211.20 rows=4649 width=0) (actual time=31.118..87031.776 rows=17209 loops=1)
Index Cond: ((insert_status = '1'::bpchar) AND (data_source_id = 52))
Total runtime: 87050.213 ms
(4 rows)
I actually have the same exact data over on a Mysql box, with the same exact index, and that runs in 0.10 seconds.
Clearly something is wrong. Here are a couple of the parameters I have set on my Postgres box:
stingray_4_4_d=# show shared_buffers;
shared_buffers
----------------
1900MB
(1 row)
stingray_4_4_d=# show max_fsm_pages;
max_fsm_pages
---------------
5000000
(1 row)
Any help would be much appreciated. This is really frustrating. Thanks.
"James DeMichele" <James.DeMichele@redfin.com> writes: > I am having a really hard time trying to figure out why my simple > count(*) query is taking so long. I have a table with 1,296,070 rows in > it. There are 2 different types of information that each row has that I > care about: Hmm, the EXPLAIN output works out to about 5 msec per row, which is not too out of line for a lot of random-access disk fetches. I'm surprised the planner bothered with an indexscan for this --- I'd bet a seqscan might be faster, seeing you're having to read about 1% of the rows which will likely touch most pages of the table anyway. Or a bitmap indexscan might be even better. What do you get if you try the EXPLAIN ANALYZE with enable_indexscan = off? regards, tom lane
"James DeMichele" <James.DeMichele@redfin.com> wrote .. > Then, I have the following index on the table: > "this_index" (status, source_id, <another_column>) If you have many queries of this type, do CLUSTER this_index ON tablename; and retry the SELECT. MySQL is using some sort of auto-clustering ISAM on the other box mayhaps? In Postgres you will have to re-CLUSTER periodically if there are INSERTs and UPDATEs.
andrew@pillette.com writes: > MySQL is using some sort of auto-clustering ISAM on the other box mayhaps? I think they probably count the index entries without ever visiting the table proper. Works great for mainly-static data ... regards, tom lane