> > True, it is a cost/benefit issue. My assumption was that once we have
> > free-behind in the PostgreSQL shared buffer cache, the kernel cache
> > issues would be minimal, but I am willing to be found wrong.
>
> If you are running on the small-shared-buffers-and-large-kernel-cache
> theory, then getting the
> kernel cache to behave right is much more important than making the
> PG cache behave right. If you favor the other theory then
> free-behind in the PG cache is the important thing. However, I've
> not yet seen any convincing evidence that large PG cache with small
> kernel cache is the way to go.

Nor could it ever be a win unless the cache was populated via
O_DIRECT, actually. Big PG cache == 2 extra copies of data, once in
the kernel and once in PG. Doing caching at the kernel level, however,
means only one copy of the data (for the most part). The only problem
is that reconfiguring a kernel to have a bigger FS cache isn't always
easy, or even an option. That said, triple copying a chunk of memory
is generally faster than even a single disk read. If PostgreSQL ever
wanted a platform-agnostic way of doing efficient caching, it would
likely have to be done in userland and would require the use of
O_DIRECT.
-sc
PS Triple copy == disk buffer into kernel (data is normally DMA'ed, so
not technically a copy), FS cache into userland, userland into PG
cache, PG cache into application. O_DIRECT eliminates one of these
copies, never mind the doubling up of data in RAM.
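
For illustration only, here is a minimal sketch of what a read that
skips the FS cache could look like from userland. It is written in
Python rather than C to keep it short, and it is not anything
PostgreSQL does. The names read_block and BLOCK are hypothetical, the
4096-byte alignment is an assumption (real requirements depend on the
filesystem and device), and os.O_DIRECT only exists on some platforms,
so the sketch falls back to an ordinary buffered read when O_DIRECT is
missing or refused.

```python
import mmap
import os

# Assumed alignment and transfer size; real requirements vary by filesystem.
BLOCK = 4096


def read_block(path, use_direct=True):
    """Read up to BLOCK bytes from the start of path.

    Returns (data, direct), where direct reports whether the read
    actually went through O_DIRECT (bypassing the kernel FS cache).
    """
    direct = use_direct and hasattr(os, "O_DIRECT")
    # Anonymous mmap memory is page-aligned, which O_DIRECT transfers need.
    buf = mmap.mmap(-1, BLOCK)
    try:
        if direct:
            try:
                fd = os.open(path, os.O_RDONLY | os.O_DIRECT)
                try:
                    n = os.readv(fd, [buf])
                finally:
                    os.close(fd)
                return buf[:n], True
            except OSError:
                # The filesystem refused O_DIRECT; use the buffered path below.
                pass
        fd = os.open(path, os.O_RDONLY)
        try:
            n = os.readv(fd, [buf])
        finally:
            os.close(fd)
        return buf[:n], False
    finally:
        buf.close()
```

A userland cache built on this would keep the returned block itself,
which is the point of the argument above: with O_DIRECT the block
lives only in the application's cache instead of also sitting in the
FS cache.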
--
Sean Chittenden