Simon Riggs <simon@2ndquadrant.com> writes:
> On Mon, 2005-10-31 at 14:14 +0100, Martijn van Oosterhout wrote:
>> On Mon, Oct 31, 2005 at 12:16:59PM +0000, Simon Riggs wrote:
>>> I'm not sure we have any good tests of that either way, do we? I'm not
>>> certain why we would trust OS cache any more than we could trust the
>>> shared buffers. But setting it too high would probably overuse backend
>>> memory for most variable query workloads.
>>
>> Well, it comes down to a thought experiment. Any disk blocks you have in
>> the shared buffers will also be in the system cache.
> Each have different and independent cache replacement...

The real point is that RAM dedicated to shared buffers can't be used for
anything else [1], whereas letting the kernel manage it gives you some
flexibility (for instance, to deal with transient large memory demands
by individual backends, or from stuff unrelated to Postgres). A system
configured to give most of RAM to shared buffers might look good on
sufficiently narrow test cases, but its performance will be horribly
brittle: it will go into swap thrashing on any small provocation. The
extra 50usec or whatever to get stuff from a kernel disk buffer instead
of our own shared buffer is a good tradeoff to get flexibility in the
amount of stuff actually buffered at any one instant.
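
To put rough numbers on the brittleness argument, here is a minimal sketch in Python. It assumes shared buffers are pinned in RAM and that everything not pinned or otherwise in use is kernel cache the OS can drop on demand. The function name and all the figures are hypothetical, chosen only for illustration, and the model ignores kernel overhead entirely.

```python
def overflow_mb(ram_mb, shared_buffers_mb, baseline_mb, transient_mb):
    """MB that no longer fit in RAM when a transient demand arrives.

    Simplified model (an assumption, not a measurement):
      - shared_buffers_mb is locked in RAM and cannot be reclaimed
      - baseline_mb is steady-state non-cache usage (backends, other processes)
      - whatever is left over is kernel disk cache, which can be dropped
    """
    reclaimable = ram_mb - shared_buffers_mb - baseline_mb
    return max(0, transient_mb - reclaimable)


# Hypothetical 8 GB machine, 1 GB baseline usage, 2 GB transient sort/hash demand.
big_shared = overflow_mb(8192, 6144, 1024, 2048)    # most of RAM given to shared buffers
small_shared = overflow_mb(8192, 1024, 1024, 2048)  # kernel manages most of RAM

print("large shared buffers, MB forced to swap:", big_shared)
print("small shared buffers, MB forced to swap:", small_shared)
```

With the large setting the same transient demand overflows RAM, while the small setting absorbs it by shrinking the kernel cache.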

[1] unless you are on a platform where the kernel doesn't think SysV
shared memory should be locked in RAM. In that case, what you have is a
large arena that is subject to being swapped out ... and a disk buffer
that's been swapped to disk is demonstrably worse than no buffer at all.
(Hint: count the I/Os involved, especially when the page is dirty.)
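
One way to do the counting suggested by the hint, as a sketch in Python. This is one reading of it, not a measurement: it assumes a swapped-out shared buffer costs one write to swap and one read back from swap, and that a dirty page still needs its normal write to the data file in either case. The function names are made up for the example.

```python
def ios_swapped_buffer(dirty):
    """I/Os to get a page back when its shared buffer was swapped out."""
    swap_out = 1                   # kernel wrote the buffer page to swap
    swap_in = 1                    # page read back from swap on next access
    writeback = 1 if dirty else 0  # dirty page must still reach the data file
    return swap_out + swap_in + writeback


def ios_no_buffer(dirty):
    """I/Os to get the same page back when it was never kept in a buffer."""
    reread = 1                     # read the page from the data file again
    writeback = 1 if dirty else 0  # the modified page was written out once
    return reread + writeback


for dirty in (False, True):
    print("dirty" if dirty else "clean",
          "swapped buffer:", ios_swapped_buffer(dirty),
          "no buffer:", ios_no_buffer(dirty))
```

Under these assumptions the swapped buffer costs one more I/O than having no buffer in both cases, and the dirty case is the most expensive since the page has to come back from swap before it can be written to its real location.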

			regards, tom lane