Martijn van Oosterhout <kleptog@svana.org> writes:
> But I know something that postgres doesn't. The data is clustered somewhat
> around the id we're comparing on. There is a "runlength" involved. Thus,
> when doing an index scan, once it has read in the first tuple of a run there
> will be more just sitting in the cache at basically zero cost.
Hm. There is code in development sources that will notice data
clustering --- actually what it looks at is the correlation between
physical tuple order and logical order of the values. I'm not sure
whether it would do the right thing if you have runs of identical keys
but no overall ordering of the keys, however.
> Currently I work around this by fiddling enable_seqscan is strategic places
> but that's blunt instrument.
Yup, it sure is. Can you propose a statistical measurement that would
cope with this situation?
regards, tom lane