Robert Haas <robertmhaas@gmail.com> writes:
> On Wed, May 7, 2014 at 3:18 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> If we believe that 25% of shared_buffers worth of heap blocks would
>> flush the cache doing a SeqScan, why should we allow 400% of
>> shared_buffers worth of index blocks?
> I think you're comparing apples and oranges. The 25% threshold is
> answering the question "How big does a sequential scan have to be
> before it's likely to flush so much unrelated data out of
> shared_buffers that it hurts the performance of other things running
> on the system?". So it's not really about whether or not things will
> *fit* in the cache, but rather a judgement about at what point caching
> that stuff is going to be of less value than continuing to cache other
> things. Also, it's specifically a judgement about shared_buffers, not
> system memory.
> But effective_cache_size is used to estimate the likelihood that an
> index scan which accesses the same heap or index block twice will
> still be in cache on the second hit, and thus need to be faulted in
> only once. So this *is* a judgment about what will fit - generally
> over a very short time scale. And, since bringing a page into
> shared_buffers from the OS cache is much less expensive than bringing
> a page into memory from disk, it's really about what will fit in
> overall system memory, not just shared_buffers.
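For concreteness, here is a rough sketch of the kind of estimate being described. It follows the Mackert-Lohman approximation that the planner's cost code uses to guess how many page fetches an index scan will need, given the table size and the cache size. The function and parameter names are illustrative only, and the sketch leaves out details of the real code (such as prorating the cache among the tables in a query and rounding the result), so treat it as an approximation of the idea rather than the actual implementation.

```python
def index_pages_fetched(tuples_fetched: float,
                        table_pages: int,
                        cache_pages: int) -> float:
    """Estimate page fetches for an index scan (Mackert-Lohman style).

    tuples_fetched: expected number of tuples visited (N * s)
    table_pages:    pages in the table (T)
    cache_pages:    pages assumed available for caching (b),
                    i.e. effective_cache_size expressed in pages
    """
    T = max(float(table_pages), 1.0)
    b = max(float(cache_pages), 1.0)
    Ns = tuples_fetched

    if T <= b:
        # Whole table fits: no page is ever read more than once.
        return min(2.0 * T * Ns / (2.0 * T + Ns), T)

    # Table is larger than the cache: beyond this many tuples,
    # revisited pages start to need re-reading.
    lim = 2.0 * T * b / (2.0 * T - b)
    if Ns <= lim:
        return 2.0 * T * Ns / (2.0 * T + Ns)
    return b + (Ns - lim) * (T - b) / T


if __name__ == "__main__":
    # Same scan, two different cache-size assumptions.
    print(index_pages_fetched(10000, 1000, 100))
    print(index_pages_fetched(10000, 1000, 2000))
```

With a 1000-page table and 10000 tuples fetched, assuming only 100 pages of cache yields an estimate of several thousand fetches, while assuming 2000 pages caps the estimate at the table size. That is the sense in which the setting is a judgment about what will fit.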
Another point is that the 25% seqscan threshold actually controls some
specific caching decisions, which effective_cache_size does not. Raising
effective_cache_size "too high" is unlikely to result in cache thrashing;
in fact I'd guess the opposite. What that would do is cause the planner
to prefer indexscans over seqscans in more cases involving large tables.
But if you've got a table+index that's bigger than RAM, seqscans are
probably going to be worse for the OS cache than indexscans, because
they're going to require bringing in more data.
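To illustrate the contrast with the 25% seqscan threshold: as far as I recall, the heap scan setup code compares the relation's size in blocks against a quarter of shared_buffers and, if the relation is larger, switches the scan to a small ring of buffers instead of letting it spread through the whole pool. A minimal sketch of that decision follows. The names are hypothetical, and the check that exempts temporary tables using local buffers is omitted.

```python
def uses_bulk_read_ring(rel_blocks: int, shared_buffers_blocks: int) -> bool:
    """Return True if a sequential scan of this relation would be
    confined to a small buffer ring rather than the whole pool.

    rel_blocks:            size of the relation in blocks
    shared_buffers_blocks: shared_buffers expressed in blocks
    """
    # Integer division, matching a C-style "NBuffers / 4" comparison.
    return rel_blocks > shared_buffers_blocks // 4


if __name__ == "__main__":
    shared_buffers_blocks = 16384  # 128MB of 8kB blocks
    for rel_blocks in (4096, 4097, 100000):
        print(rel_blocks, uses_bulk_read_ring(rel_blocks, shared_buffers_blocks))
```

This is a yes/no decision made at execution time that directly changes which buffers get evicted. effective_cache_size never does anything of the sort; it only feeds the planner's cost estimates.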
So I still think this whole argument is founded on shaky hypotheses
with a complete lack of hard data showing that a smaller default for
effective_cache_size would be better. The evidence we have points
in the other direction.
regards, tom lane