Mark Aufflick <mark@pumptheory.com> writes:
> Obviously Q2 is faster than Q1,
That's not really obvious at all. If lots of records are being returned, an
index scan might not be faster than a sequential scan.
> My assumption is that the sequential scan is blowing the index from any cache
> it might live in, and simultaneously stealing all the disk IO that is needed to
> access the index on disk (the table has 200,000 rows).
It kind of sounds to me like you've lowered random_page_cost to reflect the
fact that your indexes are nearly always completely cached. But when they're
not, that unrealistically low random_page_cost causes index scans to be chosen
when they're no longer faster.
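If you want to experiment, you can change it per-session; 4 is the stock
default and closer to true uncached disk behaviour (the exact number here is
just illustrative):

    SHOW random_page_cost;       -- see what it's currently set to
    SET random_page_cost = 4;    -- session-only; the stock default
    -- then re-run Q2 and see whether the planner still picks the index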
Perhaps you should post an "EXPLAIN ANALYZE" of your Q1 and Q2 (the latter
preferably with and without enable_indexscan, though since it's a join you may
not be able to get a plan that differs by just that one index scan.)
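Roughly like this (the queries themselves are placeholders for your actual
SQL, obviously):

    EXPLAIN ANALYZE <your Q2 here>;

    SET enable_indexscan = off;   -- session-only; steers the planner away from index scans
    EXPLAIN ANALYZE <your Q2 here>;
    SET enable_indexscan = on;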
> 2) change the way the server allocates/prioritizes different caches - i don't
> know enough about how postgres caches work to do this (if it's possible)
Postgres keeps one set of shared buffers, not separate pools. Normally you
allocate only a small fraction of your memory to Postgres and let the OS
handle disk caching.
What is your shared_buffers set to and how much memory do you have?
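You can check from psql; note that on 7.x shared_buffers is a count of 8kB
buffer pages rather than a byte size:

    SHOW shared_buffers;    -- on 7.x, a count of 8kB pages, not bytes

and correspondingly in postgresql.conf (the number here is purely
illustrative):

    # 10000 pages * 8kB = ~80MB of shared buffers
    shared_buffers = 10000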
> 3) try it on postgres 7.4 - possible, but migrating the system to 7.4 in
> production will be hard because the above code that I am not responsible for
> has a lot of (slightly wacky) implicit date casts
I can't think of any 7.4 changes that would affect this directly, but there
were certainly plenty of changes with broad effects. You never know.
8.0, on the other hand, has a new algorithm that specifically tries to protect
against the shared buffers being blown out by a sequential scan. But that will
only help if it's the shared buffers being thrashed that's hurting you, not
the entire OS file system cache.
--
greg