Re: First set of OSDL Shared Mem scalability results, some wierdness ... - Mailing list pgsql-performance

From Kevin Brown
Subject Re: First set of OSDL Shared Mem scalability results, some wierdness ...
Msg-id 20041009112048.GA665@filer
In response to Re: First set of OSDL Shared Mem scalability results, some wierdness ...  (Christopher Browne <cbbrowne@acm.org>)
List pgsql-performance
Christopher Browne wrote:
> Increasing the number of cache buffers _is_ likely to lead to some
> slowdowns:
>
>  - Data that passes through the cache also passes through kernel
>    cache, so it's recorded twice, and read twice...

Even worse, memory that's used for the PG cache is memory that's not
available to the kernel's page cache.  Even if the overall memory
usage in the system isn't enough to cause some paging to disk, most
modern kernels will adjust the page/disk cache size dynamically to fit
the memory demands of the system, which in this case means it'll be
smaller if running programs need more memory for their own use.

This is why I sometimes wonder whether or not it would be a win to use
mmap() to access the data and index files -- doing so under a truly
modern OS would surely at the very least save a buffer copy (from the
page/disk cache to program memory) because the OS could instead
map the buffer cache pages directly into the program's address
space.
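
To make the buffer-copy point concrete, here is a minimal sketch in
Python (not anything PG actually does). The function names are made up
for illustration, and PAGE_SIZE assumes the default 8 kB block size.
The read() version has the kernel copy the page into a buffer owned by
the process; the mmap() version looks at the mapped pages in place
through a memoryview.

```python
import mmap

PAGE_SIZE = 8192  # assumption: the default 8 kB block size


def checksum_page_read(path, page_no, page_size=PAGE_SIZE):
    """Sum the bytes of one page using seek()/read().

    read() copies the page from the kernel's page cache into a
    buffer owned by this process before we can look at it.
    """
    with open(path, "rb") as f:
        f.seek(page_no * page_size)
        return sum(f.read(page_size))


def checksum_page_mmap(path, page_no, page_size=PAGE_SIZE):
    """Sum the bytes of one page through a read-only mapping.

    The memoryview slice refers to the mapped pages directly, so
    no separate user-space copy of the page is made.
    """
    start = page_no * page_size
    with open(path, "rb") as f:
        # Length 0 maps the whole file (the file must not be empty).
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            with memoryview(m) as whole:
                with whole[start:start + page_size] as view:
                    return sum(view)
```

Both functions return the same value for the same page; the only
difference is whether the data is copied out of the page cache first.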

Since PG often has to have multiple files open at the same time, and
in a production database many of those files will be rather large, PG
would have to limit the size of the mmap()ed region on 32-bit
platforms, which means that things like the order of mmap() operations
to access various parts of the file can become just as important in
the mmap()ed case as they are in the read()/write() case (if not more
so!).  I would imagine that the use of mmap() on a 64-bit platform
would be a much, much larger win because PG would most likely be able
to mmap() entire files and let the OS work out how to order disk reads
and writes.
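
A rough sketch of what limiting the mapped region might look like,
again in Python and purely illustrative: map_window() and WINDOW_SIZE
are hypothetical names, and the 64 MB cap is an arbitrary assumption.
The one real constraint shown is that the mapping offset has to be a
multiple of the platform's allocation granularity, so the requested
range is reached through a delta inside the window.

```python
import mmap
import os

WINDOW_SIZE = 64 * 1024 * 1024  # hypothetical cap on one mapping


def map_window(f, offset, length, window_size=WINDOW_SIZE):
    """Map a bounded, read-only window of f covering [offset, offset+length).

    Returns (mapping, delta), where delta is the position of `offset`
    inside the mapping. The caller is responsible for closing it.
    """
    gran = mmap.ALLOCATIONGRANULARITY
    base = (offset // gran) * gran   # mapping offsets must be aligned
    delta = offset - base
    file_size = os.fstat(f.fileno()).st_size
    map_len = min(window_size, file_size - base)
    if map_len <= 0 or delta + length > map_len:
        raise ValueError("requested range does not fit in one window")
    m = mmap.mmap(f.fileno(), map_len, access=mmap.ACCESS_READ, offset=base)
    return m, delta
```

With a scheme like this, every access outside the current window costs
an unmap and a new mmap(), which is why the order of accesses matters
on a 32-bit platform. On a 64-bit platform the window could simply be
the whole file.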

The biggest problem as I see it is that (I think) mmap() would have to
be made to cooperate with malloc() for virtual address space.  I
suspect issues like this have already been worked out by others,
however...



--
Kevin Brown                          kevin@sysexperts.com
