On 2014-01-17 16:18:49 -0800, Greg Stark wrote:
> On Fri, Jan 17, 2014 at 9:14 AM, Andres Freund <andres@2ndquadrant.com> wrote:
> > The scheme that'd allow us is the following:
> > When postgres reads a data page, it will continue to first look up the
> > page in its shared buffers, if it's not there, it will perform a page
> > cache backed read, but instruct that read to immediately remove from the
> > page cache afterwards (new API or, posix_fadvise() or whatever). As long
> > as it's in shared_buffers, postgres will not need to issue new reads, so
> > there's no no benefit keeping it in the page cache.
> > If the page is dirtied, it will be written out normally telling the
> > kernel to forget about the caching the page (using 3) or possibly direct
> > io).
> > When a page in postgres's buffers (which wouldn't be set to very large
> > values) isn't needed anymore and *not* dirty, it will seed the kernel
> > page cache with the current data.
>
> This seems like mostly an exact reimplementation of DIRECT_IO
> semantics using posix_fadvise.
Not at all. The significant bits are a) the kernel uses the pagecache
for writeback of dirty data and more importantly b) uses the pagecache
to cache data not in postgres's s_b.
Greetings,
Andres Freund
-- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services