Greetings,
* Chris Withers (chris@withers.org) wrote:
> We have an app that deals with a lot of queries, and we've been slowly
> seeing performance issues emerge. We take a lot of free form queries from
> users and stumbled upon a very surprising optimisation.
>
> So, we have a 'state' column which is a 3 character string column with an
> index on it. Despite being a string, this column is only used to store one
> of three values: 'NEW', 'ACK', or 'RSV'.
Sounds like a horrible field to have an index on.
> One of our most common queries clauses is "state!='RSV'" and we've found
> that by substituting this clause with "state='ACK' or state='NEW'" wherever
> it was used, we've dropped the postgres server's load average from 20 down
> to 4 and the CPU usage from 60% in user space down to <5%.
You've changed the question you're asking the database. PG doesn't
*know* that there's only those three values, but it probably has a
pretty good guess about how many records are ACK and how many are NEW
thanks to those being in the MCV list.
> This seems counter-intuitive to me, so thought I'd ask here. Why would this
> be likely to make such a difference? We're currently on 9.4, is this
> something that's likely to be different (better? worse?) if we got all the
> way up to 10 or 11?
When you change what you're asking, PG is going to change how it gives
you the answer and sometimes that'll be faster and other times it won't
be.
Really though, if you want something more than wild speculation, posting
the 'explain analyze' of each query along with the actual table
definitions and sizes and such would be the best way to get it.
I'd suggest you check out the wiki article written about this kind of
question:
https://wiki.postgresql.org/wiki/Slow_Query_Questions
Thanks!
Stephen