Christopher Browne wrote:
> Increasing the number of cache buffers _is_ likely to lead to some
> slowdowns:
>
> - Data that passes through the cache also passes through kernel
> cache, so it's recorded twice, and read twice...
Even worse, memory used for the PG buffer cache is memory that's not
available to the kernel's page cache.  Even if overall memory usage
in the system isn't enough to cause paging to disk, most modern
kernels size the page/disk cache dynamically to fit the memory
demands of the system; in this case that means the page cache shrinks
whenever running programs need more memory for their own use.
This is why I sometimes wonder whether it would be a win to use
mmap() to access the data and index files -- doing so under a truly
modern OS would at the very least save a buffer copy (from the
page/disk cache into program memory), because the OS could instead
map the buffer cache pages directly into the program's address
space.
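To make the saved copy concrete, here's a minimal C sketch of reading
a page through mmap() instead of read().  This is not PG's actual I/O
path; the "table.dat" file name is a made-up stand-in for a relation
file:

    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = open("table.dat", O_RDONLY);  /* hypothetical relation file */
        if (fd < 0) { perror("open"); return 1; }

        struct stat st;
        if (fstat(fd, &st) != 0) { perror("fstat"); return 1; }

        /* Map the whole file read-only; the pages we touch are served
           straight from the kernel's page cache, with no copy into a
           separate userspace buffer. */
        char *base = mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
        if (base == MAP_FAILED) { perror("mmap"); return 1; }

        /* Equivalent of read()ing the first byte, minus the copy. */
        printf("first byte: 0x%02x\n", (unsigned char)base[0]);

        munmap(base, st.st_size);
        close(fd);
        return 0;
    }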
Since PG often has to have multiple files open at the same time, and
in a production database many of those files will be rather large, PG
would have to limit the size of the mmap()ed region on 32-bit
platforms.  That means things like the order of mmap() operations
used to access various parts of a file become just as important in
the mmap()ed case as they are in the read()/write() case (if not more
so!).  I would imagine that the use of mmap() on a 64-bit platform
would be a much, much larger win, because PG would most likely be
able to mmap() entire files and let the OS work out how to order disk
reads and writes.
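To illustrate the 32-bit case, here's a sketch of windowed access:
map only a bounded region of a large file, and remap when you need a
different part.  The 64 MB window size, the file name, and the target
offset are all assumptions made for the example; it needs large-file
support for off_t on 32-bit systems:

    #define _FILE_OFFSET_BITS 64  /* 64-bit off_t on 32-bit systems */
    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <unistd.h>

    #define WINDOW_SZ (64L * 1024 * 1024)  /* 64 MB window (assumed) */

    int main(void)
    {
        int fd = open("table.dat", O_RDONLY);  /* hypothetical large file */
        if (fd < 0) { perror("open"); return 1; }

        off_t target = (off_t)3 * 1024 * 1024 * 1024;  /* byte 3 GB in */
        off_t win_start = (target / WINDOW_SZ) * WINDOW_SZ;

        /* Map only the window covering 'target'; the rest of the file
           stays unmapped, so address-space use stays bounded even when
           the file is bigger than a 32-bit address space. */
        char *win = mmap(NULL, WINDOW_SZ, PROT_READ, MAP_SHARED, fd,
                         win_start);
        if (win == MAP_FAILED) { perror("mmap"); return 1; }

        printf("byte at 3 GB: 0x%02x\n",
               (unsigned char)win[target - win_start]);

        /* Moving elsewhere means munmap() plus a fresh mmap() -- the
           ordering of those calls is exactly what matters above. */
        munmap(win, WINDOW_SZ);
        close(fd);
        return 0;
    }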
The biggest problem, as I see it, is that mmap() would have to be
made to cooperate with malloc() for virtual address space.  I suspect
issues like this have already been worked out by others, however...
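For what it's worth, the usual arbitration on modern Unixes is done
by the kernel itself: pass NULL as the address hint and mmap()
returns a region the kernel knows doesn't collide with the heap that
malloc() manages (or with anything else).  A tiny, purely
illustrative sketch:

    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/mman.h>

    int main(void)
    {
        char *heap = malloc(4096);          /* lives in the malloc heap */
        if (heap == NULL) return 1;

        /* addr == NULL: the kernel picks a free, non-conflicting spot. */
        char *map = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (map == MAP_FAILED) { perror("mmap"); free(heap); return 1; }

        printf("malloc block at %p, mmap region at %p\n",
               (void *)heap, (void *)map);  /* two distinct regions */

        munmap(map, 4096);
        free(heap);
        return 0;
    }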
--
Kevin Brown kevin@sysexperts.com