Why should such a simple query over indexed columns be so slow? - Mailing list pgsql-performance

From Alessandro Gagliardi
Subject Why should such a simple query over indexed columns be so slow?
Msg-id CAAB3BBLmzQvP0rREYJveHo=3OO8zOJJ7eL7pW5StuZKe9kVC-g@mail.gmail.com
Responses Re: Why should such a simple query over indexed columns be so slow?  (Claudio Freire <klaussfreire@gmail.com>)
List pgsql-performance
So, here's the query:

SELECT private, COUNT(block_id) FROM blocks WHERE created > 'yesterday' AND shared IS FALSE GROUP BY private

What confuses me is that, although this is a largish table (millions of rows) with constant writes, the query filters only on indexed columns of types timestamp and boolean, so I would expect it to be very fast. The condition created > 'yesterday' is there mostly to speed it up, but apparently it doesn't help much.

Here's the Full Table and Index Schema:

CREATE TABLE blocks
(
  block_id character(24) NOT NULL,
  user_id character(24) NOT NULL,
  created timestamp with time zone,
  locale character varying,
  shared boolean,
  private boolean,
  moment_type character varying NOT NULL,
  user_agent character varying,
  inserted timestamp without time zone NOT NULL DEFAULT now(),
  networks character varying[],
  lnglat point,
  CONSTRAINT blocks_pkey PRIMARY KEY (block_id )
)

WITH (
  OIDS=FALSE
);

CREATE INDEX blocks_created_idx
  ON blocks
  USING btree
  (created  DESC NULLS LAST);

CREATE INDEX blocks_lnglat_idx
  ON blocks
  USING gist
  (lnglat );

CREATE INDEX blocks_networks_idx
  ON blocks
  USING btree
  (networks );

CREATE INDEX blocks_private_idx
  ON blocks
  USING btree
  (private );

CREATE INDEX blocks_shared_idx
  ON blocks
  USING btree
  (shared );
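
For comparison with the single-column indexes above, here is a sketch of a partial index that would cover both conditions of the query at once. I have not tried this: the index name is made up for illustration, and whether the planner would prefer it to a sequential scan depends on how much of the table the filter actually matches.

```sql
-- Hypothetical index; the name is an assumption, not part of the current schema.
-- Only rows with shared IS FALSE are indexed, ordered by created, so both
-- conditions in the WHERE clause can be served by a single index.
-- CONCURRENTLY avoids blocking inserts while the index builds; it cannot be
-- run inside a transaction block.
CREATE INDEX CONCURRENTLY blocks_unshared_created_idx
  ON blocks
  USING btree
  (created)
  WHERE shared IS FALSE;
```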

Here are the results from EXPLAIN ANALYZE:

"HashAggregate  (cost=156619.01..156619.02 rows=2 width=26) (actual time=43131.154..43131.156 rows=2 loops=1)"
"  ->  Seq Scan on blocks  (cost=0.00..156146.14 rows=472871 width=26) (actual time=274.881..42124.505 rows=562888 loops=1)"
"        Filter: ((shared IS FALSE) AND (created > '2012-01-29 00:00:00+00'::timestamp with time zone))"
"Total runtime: 43131.221 ms"

I'm using Postgres version: 9.0.5 (courtesy of Heroku)
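
Since this is 9.0, a couple of follow-up checks should be available. They are sketched here as guesses about what would be informative, not things I have already run. The BUFFERS option reports shared-buffer hits versus reads for each plan node, and turning off enable_seqscan for one session shows what the planner thinks an index-based plan would cost.

```sql
-- Same query, with buffer usage reported alongside the timings.
EXPLAIN (ANALYZE, BUFFERS)
SELECT private, COUNT(block_id)
FROM blocks
WHERE created > 'yesterday' AND shared IS FALSE
GROUP BY private;

-- Discourage sequential scans for this session only, to compare plans.
SET enable_seqscan = off;

EXPLAIN ANALYZE
SELECT private, COUNT(block_id)
FROM blocks
WHERE created > 'yesterday' AND shared IS FALSE
GROUP BY private;

-- Restore the default planner behaviour.
RESET enable_seqscan;
```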

As for History: I've only recently started using this query, so there really isn't any.

As for Hardware: I'm using Heroku's "Ronin" setup, which includes a 1.7 GB cache. Beyond that I don't really know.

As for Maintenance Setup: I let Heroku handle that, so again, I don't really know. FWIW, though, vacuuming should not really be an issue (as I understand it), since I don't really do any updates or deletions. It's pretty much all inserts and selects.
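
If it helps, the statistics view below should show when the table was last vacuumed and analyzed, and roughly how many live and dead rows it holds. This is a sketch of a check I could run, assuming Heroku gives me access to the standard statistics views.

```sql
-- Maintenance history and row estimates for the blocks table.
SELECT relname,
       n_live_tup,
       n_dead_tup,
       last_vacuum,
       last_autovacuum,
       last_analyze,
       last_autoanalyze
FROM pg_stat_user_tables
WHERE relname = 'blocks';
```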

As for WAL Configuration: I'm afraid I don't even know what that is. The query is normally run from a Python web server; the EXPLAIN above was run from pgAdmin3, though I doubt that's relevant.

As for GUC Settings: Again, I don't know what this is. Whatever Heroku defaults to is what I'm using.
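
If "GUC settings" means the server configuration parameters, then a query along these lines should list the ones that differ from the built-in defaults. This is a sketch, assuming the pg_settings view is readable on Heroku; I have not included its output here.

```sql
-- Configuration parameters whose values do not come from the built-in defaults.
SELECT name, current_setting(name), source
FROM pg_settings
WHERE source NOT IN ('default', 'override');
```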

Thank you in advance!
-Alessandro Gagliardi
