I'm kind of curious where the value of 4 for random_page_cost came from.
IIRC, Tom, you mentioned it came from tests you ran -- were those raw i/o
tests or Postgres cost estimates compared to execution times?
The reason I'm wondering about this is that it seems out of line with raw i/o
numbers. Typical values for consumer drives are about a sustained throughput
of 60MB/s (i.e. ~0.13ms per 8k page) and seek latency of 4ms. That gives a ratio of about 30.
Server-class drives have an even higher ratio, since a 15kRPM drive can have a
sustained bandwidth of 110-170 MB/s (48us-75us) and an average seek latency of
2ms giving a ratio of 27-42. And of course that doesn't include the effects of
a RAID array which magnifies that ratio.
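To make the arithmetic above explicit, here's a small sketch (not anything from the PostgreSQL source; the function name and spec numbers are just the figures quoted above) that derives the raw random-to-sequential ratio from a drive's sustained bandwidth and seek latency:

```python
# Illustrative sketch: derive the raw random-vs-sequential read ratio
# from the drive specs quoted above. Not PostgreSQL planner code.

PAGE_BYTES = 8192  # Postgres block size

def raw_ratio(seq_mb_per_s, seek_ms):
    """Cost of one random 8k read relative to one sequential 8k read."""
    # Time to transfer one 8k page sequentially, in milliseconds
    seq_ms = PAGE_BYTES / (seq_mb_per_s * 1e6) * 1e3
    return seek_ms / seq_ms

print(raw_ratio(60, 4.0))    # consumer drive: ~30
print(raw_ratio(110, 2.0))   # 15kRPM drive, slow end: ~27
print(raw_ratio(170, 2.0))   # 15kRPM drive, fast end: ~42
```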
I'm concerned that if we start recommending random_page_cost values as large
as these, it'll produce plans very different from what people have grown used
to. And the value of 4 seems to work well in practice.
I suspect the root of all this is that random_page_cost is encoding within it
the effects of caching. If that's true, shouldn't we find a way to model
caching using effective_cache_size instead, so that people can set
random_page_cost realistically based on their hardware?
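As a back-of-the-envelope illustration of that suspicion (this is purely hypothetical; the function, the cache-hit fraction, and the cached-read cost are my assumptions, not the planner's actual model), blending a raw hardware ratio with near-free cached reads lands close to the traditional default:

```python
# Hypothetical sketch of folding cache hits into the page-cost model.
# The hit fraction would come from something like effective_cache_size;
# all constants here are illustrative assumptions.

def effective_random_cost(raw_ratio, cache_hit_fraction):
    """Blend the raw hardware ratio with (near-free) cached page reads."""
    CACHED_COST = 0.01  # a cached page read is ~free compared to disk
    return (cache_hit_fraction * CACHED_COST
            + (1 - cache_hit_fraction) * raw_ratio)

# A raw ratio of 30 with ~87% of random reads served from cache
# comes out near the familiar default of 4:
print(effective_random_cost(30, 0.87))  # ~3.9
```

In other words, a random_page_cost of 4 looks less like a hardware number and more like a hardware ratio discounted by a large cache-hit rate.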
-- 
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's PostGIS support!