Why should such a simple query over indexed columns be so slow? - Mailing list pgsql-performance

From	Alessandro Gagliardi
Subject	Why should such a simple query over indexed columns be so slow?
Date	January 30, 2012 15:13:28
Msg-id	CAAB3BBLmzQvP0rREYJveHo=3OO8zOJJ7eL7pW5StuZKe9kVC-g@mail.gmail.com
Responses	Re: Why should such a simple query over indexed columns be so slow?
List	pgsql-performance

So, here's the query:

SELECT private, COUNT(block_id) FROM blocks WHERE created > 'yesterday' AND shared IS FALSE GROUP BY private

What confuses me is that, although this is a largish table (millions of rows) with constant writes, the query filters only on indexed columns of types timestamp and boolean, so I would expect it to be very fast. The clause created > 'yesterday' is there mostly to speed it up, but apparently it doesn't help much.
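For reference, the 'yesterday' literal resolves to midnight at the start of the previous day, not to 24 hours ago, which matches the '2012-01-29 00:00:00+00' cutoff that appears in the EXPLAIN output further down. A minimal Python sketch of that arithmetic follows; the function name `yesterday_cutoff` is made up for illustration, and it assumes the session time zone is UTC and that the query ran at about the time this message was sent.

```python
from datetime import datetime, timedelta, timezone


def yesterday_cutoff(now: datetime) -> datetime:
    """Return midnight (UTC) at the start of the day before `now`.

    Hypothetical helper: it mirrors how the 'yesterday' literal is
    resolved, assuming the session time zone is UTC.
    """
    midnight_today = now.astimezone(timezone.utc).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    return midnight_today - timedelta(days=1)


# Assumption: the query ran around the time the message was sent, in UTC.
now = datetime(2012, 1, 30, 15, 13, 28, tzinfo=timezone.utc)
cutoff = yesterday_cutoff(now)
window = now - cutoff  # how far back the filter actually reaches

print("cutoff:", cutoff.isoformat())
print("window:", window)
```

Under these assumptions the filter reaches back more than a full day, so the clause is less selective than "the last 24 hours" would be.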

Here's the Full Table and Index Schema:

CREATE TABLE blocks
(
  block_id character(24) NOT NULL,
  user_id character(24) NOT NULL,
  created timestamp with time zone,
  locale character varying,
  shared boolean,
  private boolean,
  moment_type character varying NOT NULL,
  user_agent character varying,
  inserted timestamp without time zone NOT NULL DEFAULT now(),
  networks character varying[],
  lnglat point,
  CONSTRAINT blocks_pkey PRIMARY KEY (block_id)
)
WITH (
  OIDS=FALSE
);

CREATE INDEX blocks_created_idx
  ON blocks
  USING btree
  (created DESC NULLS LAST);

CREATE INDEX blocks_lnglat_idx
  ON blocks
  USING gist
  (lnglat);

CREATE INDEX blocks_networks_idx
  ON blocks
  USING btree
  (networks);

CREATE INDEX blocks_private_idx
  ON blocks
  USING btree
  (private);

CREATE INDEX blocks_shared_idx
  ON blocks
  USING btree
  (shared);

Here are the results from EXPLAIN ANALYZE:

HashAggregate  (cost=156619.01..156619.02 rows=2 width=26) (actual time=43131.154..43131.156 rows=2 loops=1)
  ->  Seq Scan on blocks  (cost=0.00..156146.14 rows=472871 width=26) (actual time=274.881..42124.505 rows=562888 loops=1)
        Filter: ((shared IS FALSE) AND (created > '2012-01-29 00:00:00+00'::timestamp with time zone))
Total runtime: 43131.221 ms
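As a rough way to read the plan above, here is a small Python sketch that pulls the estimated and actual row counts and the elapsed time out of each node. The names `parse_nodes`, `NODE_RE`, and `PLAN` are made up for illustration and are not part of any Postgres tooling; the regular expression only covers the text format shown here.

```python
import re

# EXPLAIN ANALYZE text copied from this message (pgAdmin quoting removed).
PLAN = """\
HashAggregate (cost=156619.01..156619.02 rows=2 width=26) (actual time=43131.154..43131.156 rows=2 loops=1)
 -> Seq Scan on blocks (cost=0.00..156146.14 rows=472871 width=26) (actual time=274.881..42124.505 rows=562888 loops=1)
 Filter: ((shared IS FALSE) AND (created > '2012-01-29 00:00:00+00'::timestamp with time zone))
Total runtime: 43131.221 ms
"""

# Hypothetical pattern for one plan node in the text format above.
NODE_RE = re.compile(
    r"(?P<name>[A-Za-z ]+?)\s+\(cost=\d+\.\d+\.\.\d+\.\d+ rows=(?P<est>\d+) width=\d+\) "
    r"\(actual time=\d+\.\d+\.\.(?P<end>\d+\.\d+) rows=(?P<act>\d+) loops=\d+\)"
)


def parse_nodes(plan: str) -> list:
    """Return one dict per plan node: name, estimated rows, actual rows, end time in ms."""
    return [
        {
            "name": m.group("name").strip(),
            "est_rows": int(m.group("est")),
            "actual_rows": int(m.group("act")),
            "end_ms": float(m.group("end")),
        }
        for m in NODE_RE.finditer(plan)
    ]


nodes = parse_nodes(PLAN)
total_ms = float(re.search(r"Total runtime: (\d+\.\d+) ms", PLAN).group(1))
scan = next(n for n in nodes if n["name"].startswith("Seq Scan"))

scan_share = scan["end_ms"] / total_ms                 # fraction of runtime spent by the time the scan finished
estimate_ratio = scan["actual_rows"] / scan["est_rows"]  # actual rows / planner's estimate

print(f"{scan['name']}: {scan_share:.1%} of runtime, actual/estimated rows = {estimate_ratio:.2f}")
```

Read this way, nearly all of the 43 seconds is spent in the sequential scan itself rather than in the aggregate, and the planner's row estimate is within about 20% of the actual count.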

I'm using Postgres version: 9.0.5 (courtesy of Heroku)

As for History: I've only recently started using this query, so there really isn't any.

As for Hardware: I'm using Heroku's "Ronin" setup, which comes with a 1.7 GB cache. Beyond that I don't really know.

As for Maintenance Setup: I let Heroku handle that, so again, I don't really know. FWIW, vacuuming should not really be an issue (as I understand it), since I don't really do any updates or deletions. It's pretty much all inserts and selects.

As for WAL Configuration: I'm afraid I don't even know what that is. The query is normally run from a Python web server; the EXPLAIN above was run using pgAdmin3, though I doubt that's relevant.

As for GUC Settings: Again, I don't know what this is. Whatever Heroku defaults to is what I'm using.

Thank you in advance!

-Alessandro Gagliardi
