Re: posix_fadvsise in base backups - Mailing list pgsql-hackers

From Andres Freund
Subject Re: posix_fadvsise in base backups
Msg-id 201109241826.11155.andres@anarazel.de
In response to Re: posix_fadvsise in base backups  (Magnus Hagander <magnus@hagander.net>)
Responses Re: posix_fadvsise in base backups
List pgsql-hackers
Hi,

On Saturday, September 24, 2011 05:16:48 PM Magnus Hagander wrote:
> On Sat, Sep 24, 2011 at 17:14, Andres Freund <andres@anarazel.de> wrote:
> > On Saturday, September 24, 2011 05:08:17 PM Magnus Hagander wrote:
> >> Attached patch adds a simple call to posix_fadvise with
> >> POSIX_FADV_DONTNEED on all the files being read when doing a base
> >> backup, to help the kernel not to trash the filesystem cache.
> >> Seems like a simple enough fix - in fact, I don't remember why I took
> >> it out of the original patch :O
> >> Any reason not to put this in? Is it even safe enough to put into 9.1
> >> (probably not, but maybe?)
> > Won't that possibly throw a formerly fully cached database out of the
> > cache?
> I was assuming the kernel was smart enough to read this as "*this*
> process is not going to be using this file anymore", not "nobody in
> the whole machine is going to use this file anymore". And the process
> running the base backup is certainly not going to read it again.
> But that's a good point - do you know if that is the case, or does it
> mandate more testing?
I am fairly, though not totally, sure that the kernel does not track each
process that uses a page. For one, doing so would probably be prohibitively
expensive. For another, I am fairly (but again not totally) sure that I
restructured an application so that it does not fadvise(DONTNEED) memory that
is also used by other processes.

Currently I can only think of two workarounds, both OS-specific:
- Use O_DIRECT for reading the base backup. This will be slow in fully cached
situations, but should work well enough in all others. One needs to be careful
about the usual O_DIRECT pitfalls (page size, alignment, etc.).
- Use mmap()/mincore() to determine which data is in the cache and restore
that state afterwards.
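
A minimal sketch of the second workaround, assuming Linux and a regular file
opened by the backup process. The helper names snapshot_residency() and
restore_residency() are made up for illustration and are not part of any
patch in this thread. The idea is to record which pages are cached before
reading the file, then call posix_fadvise(POSIX_FADV_DONTNEED) only on pages
that were not cached beforehand, so a previously cached database is left alone.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: record which pages of fd are in the page cache.
 * Returns a malloc'd vector with one byte per page (bit 0 set = resident)
 * and stores the page count in *npages. Returns NULL on error or for an
 * empty file.
 */
static unsigned char *
snapshot_residency(int fd, size_t *npages)
{
    long pagesize = sysconf(_SC_PAGESIZE);
    struct stat st;
    void *map;
    unsigned char *vec;
    size_t n;

    assert(pagesize > 0);
    *npages = 0;
    if (fstat(fd, &st) < 0 || st.st_size == 0)
        return NULL;

    /* Mapping the file does not read it; pages are only faulted in on access. */
    map = mmap(NULL, (size_t) st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED)
        return NULL;

    n = ((size_t) st.st_size + (size_t) pagesize - 1) / (size_t) pagesize;
    vec = malloc(n);
    if (vec != NULL && mincore(map, (size_t) st.st_size, vec) != 0)
    {
        free(vec);
        vec = NULL;
    }
    munmap(map, (size_t) st.st_size);

    if (vec != NULL)
        *npages = n;
    return vec;
}

/*
 * Hypothetical helper: after the backup has read the file, drop only the
 * pages that were NOT resident in the snapshot. Returns 0 on success or
 * the error number reported by posix_fadvise().
 */
static int
restore_residency(int fd, const unsigned char *vec, size_t npages)
{
    long pagesize = sysconf(_SC_PAGESIZE);
    size_t i;

    for (i = 0; i < npages; i++)
    {
        if ((vec[i] & 1) == 0)
        {
            int rc = posix_fadvise(fd, (off_t) i * pagesize,
                                   (off_t) pagesize, POSIX_FADV_DONTNEED);

            if (rc != 0)
                return rc;
        }
    }
    return 0;
}
```

A caller would take the snapshot right after opening a file, send the file,
and then call restore_residency() before closing it. The sketch issues one
posix_fadvise() call per page for clarity; real code would probably coalesce
adjacent non-resident pages into a single range. The snapshot is also racy:
pages that another backend pulls into the cache during the backup would still
be dropped.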

Too bad that POSIX_FADV_NOREUSE is not really implemented.


Andres

