Re: Streaming base backups - Mailing list pgsql-hackers

From Garick Hamlin
Subject Re: Streaming base backups
Date
Msg-id 20110111155502.GA14939@isc.upenn.edu
In response to Re: Streaming base backups  (Magnus Hagander <magnus@hagander.net>)
Responses Re: Streaming base backups  (Cédric Villemain <cedric.villemain.debian@gmail.com>)
List pgsql-hackers
On Mon, Jan 10, 2011 at 09:09:28AM -0500, Magnus Hagander wrote:
> On Sun, Jan 9, 2011 at 23:33, Cédric Villemain
> <cedric.villemain.debian@gmail.com> wrote:
> > 2011/1/7 Magnus Hagander <magnus@hagander.net>:
> >> On Fri, Jan 7, 2011 at 01:47, Cédric Villemain
> >> <cedric.villemain.debian@gmail.com> wrote:
> >>> 2011/1/5 Magnus Hagander <magnus@hagander.net>:
> >>>> On Wed, Jan 5, 2011 at 22:58, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote:
> >>>>> Magnus Hagander <magnus@hagander.net> writes:
> >>>>>> * Stefan mentioned it might be useful to put some
> >>>>>> posix_fadvise(POSIX_FADV_DONTNEED)
> >>>>>>   in the process that streams all the files out. Seems useful, as long as that
> >>>>>>   doesn't kick them out of the cache *completely*, for other backends as well.
> >>>>>>   Do we know if that is the case?
> >>>>>
> >>>>> Maybe have a look at pgfincore to only tag DONTNEED for blocks that are
> >>>>> not already in SHM?
> >>>>
> >>>> I think that's way more complex than we want to go here.
> >>>>
> >>>
> >>> DONTNEED will remove the block from the OS buffer cache every time.
> >>
> >> Then we definitely don't want to use it - because some other backend
> >> might well want the file. Better leave it up to the standard logic in
> >> the kernel.
> >
> > Looking at the patch, it is (very) easy to add support for that in
> > basebackup.c.
> > That would mean using mincore(), hence mmap(), and so probably switching
> > the fopen() to an open() (or adding an open() just for the mmap
> > requirement...)
> >
> > Let's go ?
> 
> Per above, I still don't think we *should* do this. We don't want to
> kick things out of the cache underneath other backends, and we
> can't control that. Either way, it shouldn't happen in the beginning,
> and if it does happen, it should be backed by proper benchmarks.
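
For reference, the mincore()-based check discussed above could look
roughly like the sketch below.  It is not taken from the patch: the
function name count_resident_pages() is made up for illustration, and it
assumes Linux/glibc behaviour, where bit 0 of each mincore() vector entry
says whether that page is resident in the OS page cache.

```c
#define _DEFAULT_SOURCE         /* exposes mincore() on glibc */
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: count how many pages of the file at 'path' are
 * currently resident in the OS page cache.  Returns 0 on success and
 * -1 on error.  mmap() alone does not fault pages in, so the mapping
 * does not disturb the state being measured.
 */
static int
count_resident_pages(const char *path, size_t *resident, size_t *total)
{
    struct stat st;
    unsigned char *vec = NULL;
    void       *map = MAP_FAILED;
    size_t      pagesize = (size_t) sysconf(_SC_PAGESIZE);
    size_t      npages;
    size_t      i;
    int         rc = -1;
    int         fd;

    assert(path != NULL && resident != NULL && total != NULL);
    *resident = *total = 0;

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    if (fstat(fd, &st) < 0)
        goto done;
    if (st.st_size == 0)
    {
        rc = 0;                 /* empty file: nothing to inspect */
        goto done;
    }

    npages = ((size_t) st.st_size + pagesize - 1) / pagesize;
    map = mmap(NULL, (size_t) st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED)
        goto done;

    /* mincore() fills one byte per page of the mapping */
    vec = malloc(npages);
    if (vec == NULL || mincore(map, (size_t) st.st_size, vec) < 0)
        goto done;

    for (i = 0; i < npages; i++)
        if (vec[i] & 1)         /* bit 0 set: page is in the cache */
            (*resident)++;
    *total = npages;
    rc = 0;

done:
    free(vec);
    if (map != MAP_FAILED)
        munmap(map, (size_t) st.st_size);
    close(fd);
    return rc;
}
```

A caller could then restrict posix_fadvise(POSIX_FADV_DONTNEED) to the
ranges that were not resident before the backup read them.  Whether that
is worth the extra complexity is exactly what would need benchmarking.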

Another option that occurs to me is to use direct I/O (or some other
means, as needed) to bypass the cache.  Rather than kicking pages out of
the cache, we would try not to pollute it in the first place: bypass the
cache for cold pages, and for 'hot' pages either use normal I/O or issue
a read() afterward to "heat" the cache again.
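
To make the direct I/O idea concrete, here is a rough sketch, not code
from the patch.  The names read_file_bypassing_cache(), DIRECT_ALIGN and
DIRECT_CHUNK are invented for the example, the 4096-byte alignment is an
assumption (the real requirement depends on the filesystem and device),
and O_DIRECT itself is Linux-specific.  Filesystems that reject O_DIRECT
at open() time fall back to ordinary buffered reads.

```c
#define _GNU_SOURCE             /* exposes O_DIRECT on Linux/glibc */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

#define DIRECT_ALIGN 4096           /* assumed buffer/offset alignment */
#define DIRECT_CHUNK (64 * 1024)    /* read size, a multiple of the above */

/*
 * Hypothetical helper: read the whole file at 'path' while trying not
 * to populate the OS page cache.  Returns the number of bytes read, or
 * -1 on error.
 */
static off_t
read_file_bypassing_cache(const char *path)
{
    void       *buf = NULL;
    off_t       total = 0;
    ssize_t     n;
    int         fd;

    /* O_DIRECT needs aligned buffers and aligned request sizes */
    assert(DIRECT_CHUNK % DIRECT_ALIGN == 0);

    fd = open(path, O_RDONLY | O_DIRECT);
    if (fd < 0 && errno == EINVAL)
        fd = open(path, O_RDONLY);  /* no O_DIRECT here: buffered read */
    if (fd < 0)
        return -1;

    if (posix_memalign(&buf, DIRECT_ALIGN, DIRECT_CHUNK) != 0)
    {
        close(fd);
        return -1;
    }

    /* a real sender would stream each chunk to the client here */
    while ((n = read(fd, buf, DIRECT_CHUNK)) > 0)
        total += n;

    free(buf);
    close(fd);
    return (n < 0) ? -1 : total;
}
```

This only covers the cold-page half of the idea.  Deciding which pages
count as hot, and reading those through the normal path, would still
need something like a mincore() check first.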

Garick

> 
> I've committed the backend side of this, without that. Still working
> on the client, and on cleaning up Heikki's patch for grammar/parser
> support.
> 
> -- 
>  Magnus Hagander
>  Me: http://www.hagander.net/
>  Work: http://www.redpill-linpro.com/
> 

