Re: Why should such a simple query over indexed columns be so slow? - Mailing list pgsql-performance

From Claudio Freire
Subject Re: Why should such a simple query over indexed columns be so slow?
Date
Msg-id CAGTBQpb-R3XvWqVN4-3FhS5zO+tsC9e9WLHmpm9GDAzuDYT+LA@mail.gmail.com
Whole thread Raw
In response to Why should such a simple query over indexed columns be so slow?  (Alessandro Gagliardi <alessandro@path.com>)
Responses Re: Why should such a simple query over indexed columns be so slow?
List pgsql-performance
On Mon, Jan 30, 2012 at 4:13 PM, Alessandro Gagliardi
<alessandro@path.com> wrote:
> So, here's the query:
>
> SELECT private, COUNT(block_id) FROM blocks WHERE created > 'yesterday' AND
> shared IS FALSE GROUP BY private
>
> What confuses me is that though this is a largish table (millions of rows)
> with constant writes, the query is over indexed columns of types timestamp
> and boolean so I would expect it to be very fast. The clause where created >
> 'yesterday' is there mostly to speed it up, but apparently it doesn't help
> much.

The number of rows touched is ~0.5M, and is correctly estimated, which
would lead me to believe PG estimates the index plan to be slower.

You could try by executing first "set enable_seqscan=false;" and then
your query with explain analyze again. You'll probably get an index
scan, and you'll see both how it performs and how PG thought it would
perform. Any mismatch between the two probably means you'll have to
change the planner tunables (the x_tuple_cost ones) to better match
your hardware.


> As for Hardware: I'm using Heroku's "Ronin" setup which involves 1.7 GB
> Cache. Beyond that I don't really know.
snip
> As for GUC Settings: Again, I don't know what this is. Whatever Heroku
> defaults to is what I'm using.

And there's your problem. Without knowing/understanding those, you
won't get anywhere. I don't know what Heroku is, but you should find
out both hardware details and PG configuration details.

> As for Maintenance Setup: I let Heroku handle that, so I again, I don't
> really know. FWIW though, vacuuming should not really be an issue (as I
> understand it) since I don't really do any updates or deletions. It's pretty
> much all inserts and selects.

Maintainance also includes analyzing the table, to gather stats that
feed the optimizer, and it's very important to keep the stats
accurate. You can do it manually - just perform an ANALYZE. However,
the plan doesn't show any serious mismatch between expected and actual
rowcounts, which suggests stats aren't your problem.

pgsql-performance by date:

Previous
From: Alessandro Gagliardi
Date:
Subject: Why should such a simple query over indexed columns be so slow?
Next
From: Jeff Janes
Date:
Subject: Re: How to improve insert speed with index on text column