Thread: Simple select, but takes long time

Simple select, but takes long time

From
"James DeMichele"
Date:

Hi,

I am having a really hard time trying to figure out why my simple count(*) query is taking so long. I have a table with 1,296,070 rows in it. There are 2 different types of information that each row has that I care about:

 

status : character(1)

source_id : bigint

 

Then, I have the following index on the table:

 

“this_index” (status, source_id, <another_column>)

 

Now when I do the following select, it takes a really long time:

 

stingray_4_4_d=# explain analyze select count(*) from listings where insert_status = '1' and data_source_id = 52;

                                                                    QUERY PLAN

---------------------------------------------------------------------------------------------------------------------------------------------------

 Aggregate  (cost=15222.83..15222.84 rows=1 width=0) (actual time=87050.129..87050.130 rows=1 loops=1)

   ->  Index Scan using listing_status_idx on listings  (cost=0.00..15211.20 rows=4649 width=0) (actual time=31.118..87031.776 rows=17209 loops=1)

         Index Cond: ((insert_status = '1'::bpchar) AND (data_source_id = 52))

 Total runtime: 87050.213 ms

(4 rows)

 

I actually have the same exact data over on a Mysql box, with the same exact index, and that runs in 0.10 seconds.

 

Clearly something is wrong. Here are a couple of the parameters I have set on my Postgres box:

 

stingray_4_4_d=# show shared_buffers;

 shared_buffers

----------------

 1900MB

(1 row)

 

stingray_4_4_d=# show max_fsm_pages;

 max_fsm_pages

---------------

 5000000

(1 row)

 

Any help would be much appreciated. This is really frustrating. Thanks.

Re: Simple select, but takes long time

From
Tom Lane
Date:
"James DeMichele" <James.DeMichele@redfin.com> writes:
> I am having a really hard time trying to figure out why my simple
> count(*) query is taking so long. I have a table with 1,296,070 rows in
> it. There are 2 different types of information that each row has that I
> care about:

Hmm, the EXPLAIN output works out to about 5 msec per row, which is not
too out of line for a lot of random-access disk fetches.  I'm surprised
the planner bothered with an indexscan for this --- I'd bet a seqscan
might be faster, seeing you're having to read about 1% of the rows which
will likely touch most pages of the table anyway.  Or a bitmap indexscan
might be even better.  What do you get if you try the EXPLAIN ANALYZE
with enable_indexscan = off?

            regards, tom lane

Re: Simple select, but takes long time

From
andrew@pillette.com
Date:
"James DeMichele" <James.DeMichele@redfin.com> wrote ..

> Then, I have the following index on the table:

> "this_index" (status, source_id, <another_column>)

If you have many queries of this type, do

CLUSTER this_index ON tablename;

and retry the SELECT.

MySQL is using some sort of auto-clustering ISAM on the other box mayhaps?

In Postgres you will have to re-CLUSTER periodically if there are INSERTs and UPDATEs.

Re: Simple select, but takes long time

From
Tom Lane
Date:
andrew@pillette.com writes:
> MySQL is using some sort of auto-clustering ISAM on the other box mayhaps?

I think they probably count the index entries without ever visiting the
table proper.  Works great for mainly-static data ...

            regards, tom lane