Re: pgsql: Add URLs for : * Speed WAL recovery by allowing more than one - Mailing list pgsql-committers

From Simon Riggs
Subject Re: pgsql: Add URLs for : * Speed WAL recovery by allowing more than one
Date
Msg-id 1205878084.4285.354.camel@ebony.site
In response to Re: pgsql: Add URLs for : * Speed WAL recovery by allowing more than one  (Bruce Momjian <bruce@momjian.us>)
Responses Re: Re: pgsql: Add URLs for : * Speed WAL recovery by allowing more than one  (Alvaro Herrera <alvherre@commandprompt.com>)
Re: Re: pgsql: Add URLs for : * Speed WAL recovery by allowing more than one  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-committers
On Tue, 2008-03-18 at 16:56 -0400, Bruce Momjian wrote:
> Gregory Stark wrote:
> > "Bruce Momjian" <bruce@momjian.us> writes:
> >
> > >> > > On Tue, 2008-03-18 at 03:59 +0000, Bruce Momjian wrote:
> > >> > > > * Speed WAL recovery by allowing more than one page to be prefetched
> > >> > > >
> > >> > > >   This involves having a separate process that can be told which pages
> > >> > > >   the recovery process will need in the near future.
> > >
> > > Are you reading the same thread I am?  See:
> > >
> > >     http://archives.postgresql.org/pgsql-hackers/2008-02/msg01301.php
> >
> > I don't think there's any consensus for the approach you describe above. If
> > anything it seemed the least objectionable form was something involving
> > posix_fadvise or libaio.
> >
> > Tom did wave us off from Simon's approach on the basis of it being hard to
> > test and Heikki seemed to be agreeing on the basis that it would be better to
> > reuse infrastructure useful in other cases as well. So I guess that's some
> > kind of consensus... of two.
>
> Yep, that was my analysis too.

It may surprise you but I didn't read Tom's words as being against
"Simon's approach". Personally I read them as a generic warning, which I
agreed with. Maybe Tom can straighten that out.

If you know what "my approach" is, that's good 'cos I'm not sure I do
yet. I said at FOSDEM 2 weeks after this thread that "Multiple slave
processes handle database blocks, based upon hash distribution of
blocks".
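
To make the "hash distribution of blocks" idea concrete, here is a minimal sketch, not a design. All the names in it (BlockRef, worker_for, distribute) are hypothetical, and the use of CRC32 as the hash is just an assumption for illustration. The one property it shows is that every record touching a given block lands on the same worker, so per-block replay order is preserved even though blocks are spread across workers.

```python
import zlib
from collections import defaultdict
from typing import NamedTuple


class BlockRef(NamedTuple):
    """Hypothetical identifier for a database block touched by a WAL record."""
    relfilenode: int  # assumed relation file identifier
    block_no: int


def worker_for(block: BlockRef, n_workers: int) -> int:
    """Map a block to a worker index; the same block always maps to the same worker."""
    key = f"{block.relfilenode}:{block.block_no}".encode()
    return zlib.crc32(key) % n_workers


def distribute(records, n_workers):
    """Split (lsn, BlockRef) pairs into per-worker queues, keeping WAL order within each queue."""
    queues = defaultdict(list)
    for lsn, block in records:
        queues[worker_for(block, n_workers)].append((lsn, block))
    return queues
```

What this leaves out is the hard part: records that touch more than one block, and anything that needs ordering across blocks, would need coordination between workers.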

We're all agreed that we need to parallelise the work. Somehow. Is it
just the I/O we need to parallelise? Are we sure about that?

Nobody has shown any convincing evidence in favour of, or against,
various flavours of async I/O. In the absence of that I think the
simplest way is normal I/O, with many processes executing it. Maybe I
misread the Developer's FAQ describing why we don't already use async
I/O or other "whizz-bang" features? I'm optimistic about that actually,
but let's see the facts before we take that decision.
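
For comparison, the posix_fadvise flavour mentioned earlier in the thread could look roughly like the sketch below. It is only an illustration under stated assumptions: prefetch_blocks is a hypothetical helper, the 8192-byte block size is the PostgreSQL default, and the call is only a hint that the kernel is free to ignore. It is not evidence either way about performance.

```python
import os

BLOCK_SIZE = 8192  # PostgreSQL's default page size


def prefetch_blocks(fd, block_numbers, block_size=BLOCK_SIZE):
    """Hint the kernel that the given blocks will be read soon.

    Returns the number of hints issued (0 where posix_fadvise is unavailable).
    """
    if not hasattr(os, "posix_fadvise"):
        return 0  # e.g. platforms without posix_fadvise
    issued = 0
    for block_no in sorted(set(block_numbers)):
        # POSIX_FADV_WILLNEED is advisory: it may start readahead, or do nothing.
        os.posix_fadvise(fd, block_no * block_size, block_size,
                         os.POSIX_FADV_WILLNEED)
        issued += 1
    return issued
```

The recovery process would still do ordinary blocking reads afterwards; the hint only gives the kernel a chance to have the pages in cache by then.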

So AFAICS I have advocated the less bold approach.

Nobody has even mentioned yet the bgwriter and whether it should be
active during recovery and its possible role in smoothing
restartpointing.

In any case, all I've said here is that we shouldn't put a specific
approach into the TODO. Just state the problem.

--
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com

  PostgreSQL UK 2008 Conference: http://www.postgresql.org.uk

