Thread: PITR Phase 1 - Test results
I've now completed the coding of Phase 1 of PITR.

This allows a backup to be recovered and then rolled forward (all the way) on transaction logs. This proves that the code and the design work, and it also validates a lot of the earlier assumptions that were the subject of much earlier debate.

As noted in the previous designs, PostgreSQL talks to an external archiver using the XLogArchive API. I've now completed:
- changes to PostgreSQL
- a simple archiving utility, pg_arch

Using both of these together, I have successfully:
- started pg_arch
- started postgres
- taken a backup using tar
- run pgbench for an extended period, so that the transaction logs taken at the start have long since been recycled
- killed the postmaster
- waited for completion
- rm -R $PGDATA
- restored using tar
- restored xlogs from the archive directory
- started the postmaster and watched it recover to the end of the logs

This has been run through a number of times on non-trivial tests, and I've sat and watched the beast at work to make sure nothing weird was happening with timing.

At this stage:

Missing functions
- recovery does NOT yet stop at a specified point in time (that was always planned for Phase 2)
- a few more log messages are required to report progress
- a debug mode is required to allow most of them to be turned off

Wrinkles
- the code is system-testable, but not as cute as it could be
- input from committers is now sought to complete the work
- you are strongly advised not to treat any of the patches as usable in any real-world situation YET - that bit comes next

Bugs - two bugs currently occur during some tests:
1. The notification mechanism as originally designed causes ALL backends to report that a log file has closed. That works most of the time, though it does give rise to occasional timing errors - nothing too serious, but this inexactness could lead to later errors.
2. After restore, the notification system doesn't recover fully - this is a straightforward one.

I'm building a full patchset for this code and will upload it soon. As you might expect over the time it's taken me to develop this, some bitrot has set in, so I'm rebuilding it against the latest dev version now, and will complete fixes for the two bugs mentioned above.

I'm sure some will say "no words, show me the code"... I thought you all would appreciate some advance warning of this, to plan time to investigate and comment upon the coding.

Best Regards, Simon Riggs, 2ndQuadrant
http://www.2ndquadrant.com
Simon Riggs wrote:
> Well, I guess I was fairly happy too :-)

YES!

> I'd be more comfortable if I'd found more bugs though, but I'm sure the
> kind folk on this list will see that wish of mine comes true!
>
> The code is in a "needs more polishing" state - which is just the right
> time for some last discussions before everything sets too solid.

Once we see the patch, we will be able to eyeball all the code paths and the interfaces to existing code, and will be able to spot a lot of stuff, I am sure. It might take a few passes over it, but you will get all the support and ideas we have.

-- Bruce Momjian  http://candle.pha.pa.us
Well, I guess I was fairly happy too :-)

I'd be more comfortable if I'd found more bugs though, but I'm sure the kind folk on this list will see that wish of mine comes true!

The code is in a "needs more polishing" state - which is just the right time for some last discussions before everything sets too solid.

Regards, Simon

On Mon, 2004-04-26 at 17:48, Bruce Momjian wrote:
> I want to come hug you --- where do you live? !!!
>
> :-)
>
> Simon Riggs wrote:
> > I've now completed the coding of Phase 1 of PITR.
> > [...]
I want to come hug you --- where do you live? !!!

:-)

---------------------------------------------------------------------------

Simon Riggs wrote:
> I've now completed the coding of Phase 1 of PITR.
>
> This allows a backup to be recovered and then rolled forward (all the
> way) on transaction logs. This proves the code and the design works, but
> also validates a lot of the earlier assumptions that were the subject of
> much earlier debate.
> [...]
> Best Regards, Simon Riggs, 2ndQuadrant
> http://www.2ndquadrant.com

-- Bruce Momjian  http://candle.pha.pa.us
On Mon, 2004-04-26 at 16:37, Simon Riggs wrote:
> I've now completed the coding of Phase 1 of PITR.
>
> This allows a backup to be recovered and then rolled forward (all the
> way) on transaction logs. This proves the code and the design works, but
> also validates a lot of the earlier assumptions that were the subject of
> much earlier debate.
>
> As noted in the previous designs, PostgreSQL talks to an external
> archiver using the XLogArchive API.
> I've now completed:
> - changes to PostgreSQL
> - written a simple archiving utility, pg_arch

This will be on HACKERS not PATCHES for a while...

OVERVIEW:

Various code changes. Not all are included here... but I want to prove this is real, rather than have you waiting for my patch release skills to improve.

PostgreSQL changes include:
============================

- guc.c
New GUC called wal_archive to control archival logging or not.

- xlog.h
GUC added here.

- xlog.c
The most critical parts of the code live here. The way things currently work can be thought of as a circular set of logs, with the current log position sweeping around the circle like a clock. In order to archive an xlog, you must start just AFTER the file has been closed and BEFORE the pointer sweeps round again. The code here tries to spot the right moment to notify the archiver that it's time to archive. That point is critical: too early and the archive may yet be incomplete, too late and a window of failure creeps into the system.

Finding that point is more complicated than it seems because every backend has the same file open and decides to close it at different times - nearly the same time if you're running pgbench, but it could vary considerably otherwise. That timing difference is the source of Bug #1. My solution is to use the piece of code that first updates pg_control, since there is a similar need to only-do-it-once. My understanding is that the other backends eventually discover they are supposed to be looking at a different file now and reset themselves - so the xlog gets fsynced only once. It's taken me a week to consider the alternatives... this point is critical, so please suggest if you know/think differently.

When the pointer sweeps round again, if we are still archiving, we simply increase the number of logs in the cycle to defer when we can recycle the xlog. The code doesn't yet handle a failure condition we discussed previously: running out of disk space and how we handle that (there was detailed debate, noted for future implementation).

New utility aimed at being located in src/bin/pg_arch
=======================================================

- pg_arch.c
The idea of pg_arch is that it is a functioning archival tool and at the same time the reference implementation of the XLogArchive API. The API is all wrapped up in the same file currently, to make it easier to implement, but I envisage separating these out into two parts after it passes initial inspection - that shouldn't take too much work given that was its design goal. This will then allow the API to be used for wider applications that want to back up PostgreSQL.

- src/bin/Makefile
Has been updated to include pg_arch, so that this then gets made as part of the full system rather than as an add-on. I'm sure somebody has feelings on this... my thinking was that it ought to be available without too much effort.

What's NOT included (YET!)
==========================
- changes to initdb
- changes to postgresql.conf
- changes to wal_debug
- related changes
- user documentation

- changes to initdb

The XLogArchive API implementation relies on the existence of $PGDATA/pg_rlog. That would be relatively simple to add to initdb, but it's also a no-brainer to add the directory without it, so I thought I'd leave it for discussion in case anybody has good reasons to put it elsewhere, rename it, etc.

More importantly, this affects the security model used by XLogArchive. The way I had originally envisaged this, the directory permissions would be opened up for group-level read/write, thus:
    pg_xlog  rwxr-x---
    pg_rlog  rwxrwx---
though this of course relies on $PGDATA being opened up also. That would then allow the archiving tool to be in its own account, yet with a shared group. (Thinking that a standard Legato install (for instance) is unlikely to recommend sharing a UNIX userid with PostgreSQL.)

I was unaware that PostgreSQL checks the permissions of PGDATA before it starts and does not allow you to proceed if group permissions exist. We have two options:
i) alter all things that rely on security being user-level only - initdb, startup, most other security features?
ii) encourage (i.e. force) people using the XLogArchive API to run as the PostgreSQL owning user (postgres).
I've avoided this issue in the general implementation, thinking that there'll be some strong feelings either way, or an alternative that I haven't thought of yet (please...).

- changes to postgresql.conf

The parameter setting wal_archive=true needs to be added to make XLogArchive work or not. I've not added this to the install template (yet), in case we had some further suggestions for what this might be called.

- changes to wal_debug

The XLOG_DEBUG flag is set as a value between 1 and 16, though the code only ever treats this as a boolean. For my development, I partially implemented an earlier suggestion of mine: set the flag to 1 in the config file, then set the more verbose portions of debug output to trigger when it's set to 16. That affected a couple of places in xlog.c. That may not be needed, so that's not included either.

- user documentation

Not yet... but it will be.

> Bugs
> - two bugs currently occur during some tests:
> 1. the notification mechanism as originally designed causes ALL backends
> to report that a log file has closed. That works most of the time,
> though does give rise to occasional timing errors - nothing too
> serious, but this inexactness could lead to later errors.
> 2. After restore, the notification system doesn't recover fully - this
> is a straightforward one
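To make the notification mechanism described in the xlog.c section above concrete, here is a minimal sketch of what the backend side might look like. It is not the actual patch: the function name XLogArchiveNotify, the exact file-name format, and the 0660 mode are illustrative assumptions; only the $PGDATA/pg_rlog directory and the zero-length per-segment ".full" notification file are ideas taken from this thread.

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

static int
XLogArchiveNotify(const char *pgdata, const char *seg)   /* seg: e.g. "00000001000000C6" */
{
    char    path[1024];
    int     fd;

    snprintf(path, sizeof(path), "%s/pg_rlog/%s.full", pgdata, seg);

    /* The file's existence is the message; its content does not matter. */
    fd = open(path, O_CREAT | O_WRONLY | O_EXCL, 0660);
    if (fd < 0)
        return -1;      /* e.g. already notified, or pg_rlog missing */
    close(fd);
    return 0;
}

The zero-length file doubles as the asynchronous "signal": the archiver can pick it up whenever it is ready, even if it was down at the moment the segment closed.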
> I want to come hug you --- where do you live? !!!

You're not the only one. But we don't want to smother the poor guy, at least not before he completes his work :-)
On Mon, 2004-04-26 at 18:08, Bruce Momjian wrote:
> Simon Riggs wrote:
> > Well, I guess I was fairly happy too :-)
>
> YES!
>
> > I'd be more comfortable if I'd found more bugs though, but I'm sure the
> > kind folk on this list will see that wish of mine comes true!
> >
> > The code is in a "needs more polishing" state - which is just the right
> > time for some last discussions before everything sets too solid.
>
> Once we see the patch, we will be able to eyeball all the code paths and
> the interfaces to existing code, and will be able to spot a lot of stuff,
> I am sure.
>
> It might take a few passes over it but you will get all the support and
> ideas we have.

Thanks very much. The code will be there in full tomorrow now (oh, it is tomorrow...). I've fixed the bugs that I spoke of earlier, though. They all make sense when you try to tell someone else about them...

Best Regards, Simon
Simon Riggs wrote:
> New utility aimed at being located in src/bin/pg_arch

Why isn't the archiver process integrated into the server?
Peter Eisentraut wrote:
> Simon Riggs wrote:
> > New utility aimed at being located in src/bin/pg_arch
>
> Why isn't the archiver process integrated into the server?

I think it is because the archiver process has to be started/stopped independently of the server.

-- Bruce Momjian  http://candle.pha.pa.us
On Tue, 2004-04-27 at 18:10, Peter Eisentraut wrote:
> Simon Riggs wrote:
> > New utility aimed at being located in src/bin/pg_arch
>
> Why isn't the archiver process integrated into the server?

A number of reasons....

Overall, I initially favoured the archiver as another special backend, like the checkpoint process. That is exactly the same architecture as Oracle uses, so it is a good starting place for thought. We discussed the design in detail on the list and the suggestion was made to implement PITR using an API to send notification to an archiver.

In Oracle7, it was considered OK to just dump the files in some directory and call them archived. Since then, most DBMSs have gone to some trouble to integrate with generic or at least market-leading backup and recovery (BAR) software products. Informix and DB2 provide open interfaces to BARs; Oracle does not, but then it figures it already (had) market share, so we'll just do it our way.

The XLogArchive design allows ANY external archiver to work with PostgreSQL. The pg_arch program supplied is really to show how that might be implemented. This leaves the door open for any BAR product to interface with PostgreSQL, whether this be your favourite open source BAR or the leading proprietary vendors. Wide adoption is an important design feature and the design presented offers this.

The other reason is to do with how and when archival takes place. An asynchronous communication mechanism is required between PostgreSQL and the archiver, to allow for such situations as tape mounts or simple failure of the archiver. The method chosen for implementing this asynchronous comms mechanism lends itself to being an external API - there were other designs, but these were limited to internal use only.

You ask a reasonable question, however. If pg_autovacuum exists, why should pg_autoarch not work also? My own thinking about external connectivity may have overshadowed my thinking there. It would not require too much additional work to add another GUC which gives the name of an external archiver whose execution should be confirmed, or which should be started/restarted if it fails. At this point, such a feature is a nice-to-have in comparison with the goal of being able to recover to a PIT, so I will defer this issue to Phase 3....

Best regards, Simon Riggs
On Tuesday 27 April 2004 22:21, Simon Riggs wrote:
> > Why isn't the archiver process integrated into the server?
>
> You ask a reasonable question, however. If pg_autovacuum exists, why
> should pg_autoarch not work also?

pg_autovacuum is going away to be integrated as a backend process.
On Tuesday 27 April 2004 19:59, Bruce Momjian wrote:
> Peter Eisentraut wrote:
> > Simon Riggs wrote:
> > > New utility aimed at being located in src/bin/pg_arch
> >
> > Why isn't the archiver process integrated into the server?
>
> I think it is because the archiver process has to be started/stopped
> independently of the server.

When the server is not running there is nothing to archive, so I don't follow this argument.
On Monday 26 April 2004 23:11, Simon Riggs wrote:
> ii) encourage (i.e. force) people using the XLogArchive API to run as the
> PostgreSQL owning user (postgres).

I think this is perfectly reasonable.
On Wed, 2004-04-28 at 16:14, Peter Eisentraut wrote:
> On Tuesday 27 April 2004 19:59, Bruce Momjian wrote:
> > I think it is because the archiver process has to be started/stopped
> > independently of the server.
>
> When the server is not running there is nothing to archive, so I don't
> follow this argument.

The running server creates xlogs, which are still available for archive even when the server is not running...

Overall, your point is taken, with many additional comments in my other posts in reply to you.

I accept that this may be desirable in the future, for some simple implementations. The pg_autovacuum evolution path is a good model - if it works and the code is stable, bring it under the postmaster at a later time.

Best Regards, Simon Riggs
Simon Riggs wrote:
> > When the server is not running there is nothing to archive, so I don't
> > follow this argument.
>
> The running server creates xlogs, which are still available for archive
> even when the server is not running...
>
> Overall, your point is taken, with many additional comments in my other
> posts in reply to you.
>
> I accept that this may be desirable in the future, for some simple
> implementations. The pg_autovacuum evolution path is a good model - if
> it works and the code is stable, bring it under the postmaster at a
> later time.

[This email isn't focused because I haven't resolved all my ideas yet.]

OK, I looked over the code. Basically it appears pg_arch is a client-side program that copies files from pg_xlog to a specified directory, and marks completion in a new pg_rlog directory.

The driving part of the program seems to be:

    while ( (n = read( xlogfd, buf, BLCKSZ)) > 0)
        if ( write( archfd, buf, n) != n)
            return false;

The program basically sleeps, and when it awakes it checks to see if new WAL files have been created. There is an additional GUC variable to prevent WAL from being recycled until it has been archived, but the posted patch only had pg_arch.c, its Makefile, and a patch to update bin/Makefile.

Simon (the submitter) specified he was providing an API to archive, but it is really just a set of C routines to call that do copies. It is not a wire protocol or anything like that. The program has a mode where it archives all available WAL files and exits, but by default it has to remain running to continue archiving.

I am wondering if this is the way to approach the situation. I apologize for not considering this earlier. Archives of PITR postings of interest are at:

    http://momjian.postgresql.org/cgi-bin/pgtodo?pitr

It seems the backend is the one who knows right away when a new WAL file has been created and needs to be archived. Also, are folks happy with archiving only full WAL files? This will not restore all transactions up to the point of failure, but might lose perhaps 2-5 minutes of transactions before the failure.

Also, a client application is a separate process that must remain running. With Informix, there is a separate utility to do PITR logging. It is a pain to have to make sure a separate process is always running.

Here is an idea. What if we add two GUC settings:

    pitr = true/false
    pitr_path = 'filename or |program'

In this way, you would basically specify a path to dump all WAL logs into (just keep appending 16MB chunks) or a program that you pipe all the WAL logs into. You can't change pitr_path while pitr is on. Each backend opens the filename in append mode before writing. One problem is that this slows down the backend because it has to do the write, and the write might be slow.

We also need the ability to write to a tape drive, and you can't open/close those like a file. Different backends will be doing the WAL file additions; there isn't a central process to keep a tape drive file descriptor open.

It seems pg_arch should at least use libpq to connect to a database and do a LISTEN, and have the backends NOTIFY when they create a new WAL file, or something. Polling for new WAL files seems non-optimal, but maybe a database connection is overkill.

Then, you start the backend, specify the path, turn on pitr, do the tar, and you are on your way.

Also, pg_arch should only be run by the install user. No need to allow other users to run this.
Another idea is to have a client program like pg_ctl that controls PITR logging (start, stop, location), but does its job and exits, rather than remaining running.

I apologize for not bringing up these issues earlier. I didn't realize the direction it was going. I wasn't focused on it. Sorry.

-- Bruce Momjian  http://candle.pha.pa.us
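For illustration only, here is a rough sketch of the pitr_path idea from Bruce's message above. Nothing like this exists in the posted patch; the function name and the "leading '|' means pipe into a program" convention are assumptions taken from the proposed GUC description, and error handling is simplified.

#include <stdio.h>

static int
archive_completed_segment(const char *seg_path, const char *pitr_path)
{
    FILE   *src, *dst;
    char    buf[8192];
    size_t  n;
    int     piped = (pitr_path[0] == '|');

    src = fopen(seg_path, "rb");
    if (src == NULL)
        return -1;

    /* '|program' pipes the segment into a program; otherwise append to a file */
    dst = piped ? popen(pitr_path + 1, "w") : fopen(pitr_path, "ab");
    if (dst == NULL)
    {
        fclose(src);
        return -1;
    }

    while ((n = fread(buf, 1, sizeof(buf), src)) > 0)
        if (fwrite(buf, 1, n, dst) != n)
            break;

    fclose(src);
    return (piped ? pclose(dst) : fclose(dst)) == 0 ? 0 : -1;
}

This also makes Bruce's concern visible: whichever backend does the copy blocks for as long as the destination (file, pipe, or tape) takes to accept the data.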
On Thu, Apr 29, 2004 at 12:18:38AM -0400, Bruce Momjian wrote:
> OK, I looked over the code. Basically it appears pg_arch is a
> client-side program that copies files from pg_xlog to a specified
> directory, and marks completion in a new pg_rlog directory.
>
> The driving part of the program seems to be:
>
>     while ( (n = read( xlogfd, buf, BLCKSZ)) > 0)
>         if ( write( archfd, buf, n) != n)
>             return false;
>
> The program basically sleeps and when it awakes checks to see if new WAL
> files have been created.

Is the API able to indicate a written but not-yet-filled WAL segment? So an archiver could copy the filled part, and refill it later. This may be needed because a segment could take a while to be filled.

-- Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
On Thu, Apr 29, 2004 at 10:07:01AM -0400, Bruce Momjian wrote:
> Alvaro Herrera wrote:
> > Is the API able to indicate a written but not-yet-filled WAL segment?
> > So an archiver could copy the filled part, and refill it later. This
> > may be needed because a segment could take a while to be filled.
>
> I couldn't figure that out, but I don't think it does. It would have to
> lock the WAL writes so it could get a good copy, I think, and I didn't
> see that.

I'm not sure, but I don't think so. You don't have to lock the WAL for writing, because it will always write later in the file than you are allowed to read. (If you read more than you were told to, it's your fault as an archiver.)

-- Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
Alvaro Herrera wrote:
> On Thu, Apr 29, 2004 at 12:18:38AM -0400, Bruce Momjian wrote:
> > OK, I looked over the code. Basically it appears pg_arch is a
> > client-side program that copies files from pg_xlog to a specified
> > directory, and marks completion in a new pg_rlog directory.
> >
> > The driving part of the program seems to be:
> >
> >     while ( (n = read( xlogfd, buf, BLCKSZ)) > 0)
> >         if ( write( archfd, buf, n) != n)
> >             return false;
> >
> > The program basically sleeps and when it awakes checks to see if new WAL
> > files have been created.
>
> Is the API able to indicate a written but not-yet-filled WAL segment?
> So an archiver could copy the filled part, and refill it later. This
> may be needed because a segment could take a while to be filled.

I couldn't figure that out, but I don't think it does. It would have to lock the WAL writes so it could get a good copy, I think, and I didn't see that.

-- Bruce Momjian  http://candle.pha.pa.us
Alvaro Herrera wrote:
> On Thu, Apr 29, 2004 at 10:07:01AM -0400, Bruce Momjian wrote:
> > Alvaro Herrera wrote:
> > > Is the API able to indicate a written but not-yet-filled WAL segment?
> > > So an archiver could copy the filled part, and refill it later. This
> > > may be needed because a segment could take a while to be filled.
> >
> > I couldn't figure that out, but I don't think it does. It would have to
> > lock the WAL writes so it could get a good copy, I think, and I didn't
> > see that.
>
> I'm not sure but I don't think so. You don't have to lock the WAL for
> writing, because it will always write later in the file than you are
> allowed to read. (If you read more than you were told to, it's your
> fault as an archiver.)

My point was that without locking the WAL, we might get part of a WAL write in our file, but I now realize that during a crash the same thing might happen, so it would be OK to just copy it even if it is being written to.

Simon posted the rest of his patch that shows changes to the backend, and a comment reads:

+ * The name of the notification file is the message that will be picked up
+ * by the archiver, e.g. we write RLogDir/00000001000000C6.full
+ * and the archiver then knows to archive XLogDir/00000001000000C6,
+ * while it is doing so it will rename RLogDir/00000001000000C6.full
+ * to RLogDir/00000001000000C6.busy, then when complete, rename it again
+ * to RLogDir/00000001000000C6.done

so it is only archiving full logs.

Also, I think this archiver should be able to log to a local drive, network drive (trivial), tape drive, ftp, or use an external script to transfer the logs somewhere. (ftp would probably be an external script with 'expect'.)

-- Bruce Momjian  http://candle.pha.pa.us
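As a sketch of the consumer side of the rename protocol in the quoted comment (this is not the actual pg_arch code; paths, buffer sizes, and error handling are simplified, and the copy loop just mirrors the read/write loop quoted earlier in the thread):

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

static int
copy_file(const char *src, const char *dst)
{
    char    buf[8192];
    ssize_t n = 0;
    int     in = open(src, O_RDONLY);
    int     out = open(dst, O_CREAT | O_WRONLY | O_TRUNC, 0660);

    if (in < 0 || out < 0)
        return -1;
    while ((n = read(in, buf, sizeof(buf))) > 0)
        if (write(out, buf, n) != n)
            return -1;
    close(in);
    close(out);
    return (n < 0) ? -1 : 0;
}

static int
archive_one_segment(const char *pgdata, const char *archdir, const char *seg)
{
    char full[1024], busy[1024], done[1024], src[1024], dst[1024];

    snprintf(full, sizeof(full), "%s/pg_rlog/%s.full", pgdata, seg);
    snprintf(busy, sizeof(busy), "%s/pg_rlog/%s.busy", pgdata, seg);
    snprintf(done, sizeof(done), "%s/pg_rlog/%s.done", pgdata, seg);
    snprintf(src,  sizeof(src),  "%s/pg_xlog/%s", pgdata, seg);
    snprintf(dst,  sizeof(dst),  "%s/%s", archdir, seg);

    if (rename(full, busy) != 0)        /* claim the notification */
        return -1;
    if (copy_file(src, dst) != 0)       /* copy the closed segment */
        return -1;
    return rename(busy, done);          /* signal that archiving completed */
}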
On Thu, 2004-04-29 at 15:22, Bruce Momjian wrote:
> [...]
> My point was that without locking the WAL, we might get part of a WAL
> write in our file, but I now realize that during a crash the same thing
> might happen, so it would be OK to just copy it even if it is being
> written to.
> [...]
> so it is only archiving full logs.
>
> Also, I think this archiver should be able to log to a local drive,
> network drive (trivial), tape drive, ftp, or use an external script to
> transfer the logs somewhere. (ftp would probably be an external script
> with 'expect'.)

Bruce is correct, the API waits for the xlog file to be full before archiving.

I had thought about the case for partial archiving: basically, if you want to archive in smaller chunks, make your log files smaller... this is currently a compile-time option. Possibly there is an argument to make the xlog file size configurable, as a way of doing what you suggest.

Taking multiple copies of the same file, yet trying to work out which one to apply, sounds complex and error-prone to me. It also increases the cost of the archival process and thus drains other resources.

The archiver should be able to do a whole range of things. Basically, that point was discussed and the agreed approach was to provide an API that would allow anybody and everybody to write whatever they wanted. The design included pg_arch since it was clear that there would be a requirement in the basic product to have those facilities - and in any case any practically focused API needs a reference implementation as a way of showing how to use it and exposing any bugs in the server-side implementation.

The point is... everybody is now empowered to write tape drive code, whatever you fancy... go do.

Best regards, Simon Riggs
Simon Riggs wrote:
> > Also, I think this archiver should be able to log to a local drive,
> > network drive (trivial), tape drive, ftp, or use an external script to
> > transfer the logs somewhere. (ftp would probably be an external script
> > with 'expect'.)
>
> Bruce is correct, the API waits for the xlog file to be full before
> archiving.
> [...]
> The point is... everybody is now empowered to write tape drive code,
> whatever you fancy... go do.

Agreed we want to allow the superuser control over the writing of the archive logs. The question is how they get access to that. Is it by running a client program continuously, or by calling an interface script from the backend?

My point was that having the backend call the program gives improved reliability, control over when to write, and easier administration.

How are people going to run pg_arch? Via nohup? In virtual screens? If I am at the console and I want to start it, do I use "&"? If I want to stop it, do I do a 'ps' and issue a 'kill'? This doesn't seem like a good user interface to me.

To me the problem isn't pg_arch itself but the idea that a client program is going to be independently finding (polling) and copying the archive logs.

I am thinking the client program is called with two arguments, the xlog file name and the arch location defined in GUC. Then the client program does the write. The problem there, though, is who gets the write error, since the backend will not wait around for completion?

Another case is server start/stop. You want to start/stop the archive logger to match the database server, particularly if you reboot the server. I know Informix used a client program for logging, and it was a pain to administer.

I would be happy with an external program if it was started/stopped by the postmaster (or via a GUC change) and received a signal when a WAL file was written. But if we do that, it isn't really an external program anymore but another child process like our stats collector.

I am willing to work on this if folks think this is a better approach.

-- Bruce Momjian  http://candle.pha.pa.us
On Thu, Apr 29, 2004 at 07:34:47PM +0100, Simon Riggs wrote:
> Bruce is correct, the API waits for the xlog file to be full before
> archiving.
>
> I had thought about the case for partial archiving: basically, if you
> want to archive in smaller chunks, make your log files smaller... this is
> currently a compile-time option. Possibly there is an argument to make the
> xlog file size configurable, as a way of doing what you suggest.
>
> Taking multiple copies of the same file, yet trying to work out which
> one to apply, sounds complex and error-prone to me. It also increases the
> cost of the archival process and thus drains other resources.

My idea was basically that the archiver could be told "I've finished writing XLog segment 1 up to byte 9000", so the archiver would

    dd if=xlog-1 seek=0 skip=0 bs=1c count=9000c of=archive-1

And later, it would get a notification "segment 1 up to byte 18000" and do

    dd if=xlog-1 seek=0 skip=0 bs=1c count=18000c of=archive-1

Or, if it's smart enough,

    dd if=xlog-1 seek=9000c skip=9000c bs=1c count=9000c of=archive-1

Basically it is updating the logs as soon as it receives the notifications. Writing 16 MB of xlogs could take some time.

When a full xlog segment has been written, a different kind of notification can be issued. A dumb archiver could just ignore the incremental ones and copy the files only upon receiving this other kind.

I think that if log files are too small, maybe it will be a waste of resources (which ones?). Anyway, it's just an idea.

-- Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
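To illustrate Alvaro's incremental idea (nothing like this exists in the patch - the function and its interface are invented here), a sketch that keeps a per-segment "bytes already archived" offset and copies only the newly valid bytes on each notification, i.e. the C equivalent of the dd commands above:

#include <sys/types.h>
#include <fcntl.h>
#include <unistd.h>

static int
copy_increment(const char *xlog_path, const char *arch_path,
               off_t *copied, off_t valid_upto)
{
    char    buf[8192];
    int     in = open(xlog_path, O_RDONLY);
    int     out = open(arch_path, O_CREAT | O_WRONLY, 0660);

    if (in < 0 || out < 0)
        return -1;

    /* resume where the previous notification left off */
    lseek(in, *copied, SEEK_SET);
    lseek(out, *copied, SEEK_SET);

    while (*copied < valid_upto)
    {
        size_t  want = sizeof(buf);
        ssize_t n;

        if ((off_t) want > valid_upto - *copied)
            want = (size_t) (valid_upto - *copied);   /* never read past the told offset */
        n = read(in, buf, want);
        if (n <= 0)
            break;
        if (write(out, buf, n) != n)
            break;
        *copied += n;
    }

    close(in);
    close(out);
    return (*copied == valid_upto) ? 0 : -1;
}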
On Thu, 2004-04-29 at 20:24, Bruce Momjian wrote:
> I am willing to work on this...

There is much work still to be done to make PITR work, accepting all of the many comments made. If anybody wants this by 1 June, I think we'd better look sharp. My aim has been to knock one of the URGENT items on the TODO list into touch, however that was to be achieved.

The following work remains... from all that has been said...
- halt restore at a particular condition (point in time, txnid etc)
- archive policy to control whether to halt the database should archiving fail and space run out (as Oracle, DB2 do), or not (as discussed)
- cope with restoring a stream of logs larger than the disk space on the restoration target system
- integrate restore with the tablespace code, to allow tablespace backups
- build an XLogSpy mechanism to allow the DBA to better know when to recover to
- extend the logging mechanism to allow recovery time prediction
- publicise the API with BAR open source teams, to get feedback and to encourage them to use the API to allow PostgreSQL support in their BAR
- use the API to build interfaces to the 100+ BAR products on the market
- performance tuning of xlogs, to ensure minimum xlog volume written
- performance tuning of recovery, to ensure wasted effort is avoided
- allow the archiver utility to be managed by the postmaster
- write some good documentation
- comprehensive crash testing
- really comprehensive crash testing
- very comprehensive crash testing

It seems worth working on things in some kind of priority order. I claim these, by the way, but many others look important and interesting to me:
- halt restore at a particular condition (point in time, txnid etc)
- cope with restoring a stream of logs larger than the disk space on the restoration target system
- write some good documentation

Best Regards, Simon Riggs
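The first item in Simon's list (halting restore at a particular condition) is still Phase 2 work; purely as a hedged illustration of the general idea, with a record layout and function name invented for this sketch, replay would compare each commit record's timestamp against a target and stop once it passes it:

#include <stdbool.h>
#include <time.h>

typedef struct ReplayRecord
{
    bool    is_commit;      /* is this a transaction-commit record? */
    time_t  commit_time;    /* commit timestamp carried in the record */
} ReplayRecord;

/* Returns true if replay should stop BEFORE applying this record. */
static bool
recovery_target_reached(const ReplayRecord *rec, time_t target_time)
{
    if (!rec->is_commit)
        return false;       /* only commits move the visible state forward */
    return rec->commit_time > target_time;
}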
On Thu, 2004-04-29 at 20:24, Bruce Momjian wrote:
> Simon Riggs wrote:
> > The archiver should be able to do a whole range of things. Basically,
> > that point was discussed and the agreed approach was to provide an API
> > that would allow anybody and everybody to write whatever they wanted.
> > [...]
> > The point is... everybody is now empowered to write tape drive code,
> > whatever you fancy... go do.
>
> Agreed we want to allow the superuser control over the writing of the
> archive logs. The question is how they get access to that. Is it by
> running a client program continuously, or by calling an interface script
> from the backend?
>
> My point was that having the backend call the program gives improved
> reliability, control over when to write, and easier administration.

Agreed. We've both suggested ways that can occur, though I suggest this is much less of a priority, for now. Not "no", just not "now".

> How are people going to run pg_arch? Via nohup? In virtual screens? If
> I am at the console and I want to start it, do I use "&"? If I want to
> stop it, do I do a 'ps' and issue a 'kill'? This doesn't seem like a
> good user interface to me.
>
> To me the problem isn't pg_arch itself but the idea that a client
> program is going to be independently finding (polling) and copying the
> archive logs.
>
> I am thinking the client program is called with two arguments, the xlog
> file name and the arch location defined in GUC. Then the client
> program does the write. The problem there, though, is who gets the write
> error, since the backend will not wait around for completion?
>
> Another case is server start/stop. You want to start/stop the archive
> logger to match the database server, particularly if you reboot the
> server. I know Informix used a client program for logging, and it was a
> pain to administer.

pg_arch is just icing on top of the API. The API is the real deal here. I'm not bothered if pg_arch is not accepted, as long as we can adopt the API. As noted previously, my original mind was to split the API away from the pg_arch application to make it clearer what was what. Once that has been done, I encourage others to improve pg_arch - but also to use the API to interface with other BAR products.

If you're using PostgreSQL for serious business then you will be using a serious BAR product as well. There are many FOSS alternatives...

The API's purpose is to allow larger, pre-existing BAR products to know when and how to retrieve data from PostgreSQL. Those products don't and won't run underneath the postmaster, so although I agree with Peter's original train of thought, I also agree with Tom's suggestion that we need an API more than we need an archiver process.

> I would be happy with an external program if it was started/stopped by the
> postmaster (or via a GUC change) and received a signal when a WAL file was
> written.

That is exactly what has been written.

The PostgreSQL side of the API is written directly into the backend, in xlog.c, and is therefore activated by postmaster-controlled code. That then sends "a signal" to the process that will do the archiving - the archiver side of the XLogArchive API is an in-process library.
(The "signal" is, in fact, a zero-length file written to disk because there are many reasons why an external archiver may not be ready to archive or even up and running to receive a signal). The only difference is that there is some confusion as to the role and importance of pg_arch. Best Regards, Simon Riggs
Simon Riggs wrote:
> pg_arch is just icing on top of the API. The API is the real deal here.
> I'm not bothered if pg_arch is not accepted, as long as we can adopt the
> API.
> [...]
> The only difference is that there is some confusion as to the role and
> importance of pg_arch.

OK, I have finalized my thinking on this.

We both agree that a pg_arch client-side program certainly works for PITR logging. The big question in my mind is whether a client-side program is what we want to use long-term, and whether we want to release a 7.5 that uses it and then change it in 7.6 to something more integrated into the backend.

Let me add that this is a little different from pg_autovacuum. With that, you could put it in cron and be done with it. With pg_arch, there is a routine that has to be used to do PITR, and if we change the process in 7.6, I am afraid there will be confusion.

Let me also add that I am not terribly worried about having the feature to restore to an arbitrary point in time for 7.5. I would much rather have a good PITR solution that works cleanly in 7.5 and add that in 7.6, than to have restore to an arbitrary point but have a strained implementation that we have to revisit for 7.6.

Here are my ideas. (I talked to Tom about this and am including his ideas too.)
Basically, the archiver that scans the xlog directory to identify files to be archived should be a subprocess of the postmaster. You already have that code and it can be moved into the backend.

Here is my implementation idea. First, your pg_arch code runs in the backend and is started just like the statistics process. It has to be started whether PITR is being used or not, but will be inactive if PITR isn't enabled. This must be done because we can't have a backend start this process later in case they turn on PITR after server start.

The process id of the archive process is stored in shared memory. When PITR is turned on, each backend that completes a WAL file sends a signal to the archiver process. The archiver wakes up on the signal and scans the directory, finds files that need archiving, and either does a 'cp' or runs a user-defined program (like scp) to transfer the file to the archive location.

In GUC we add:

    pitr = true/false
    pitr_location = 'directory, user@host:/dir, etc'
    pitr_transfer = 'cp, scp, etc'

The archiver program updates its config values when someone changes these values via postgresql.conf (and uses pg_ctl reload). These can only be modified from postgresql.conf. Changing them via SET has to be disabled because they are cluster-level settings, not per-session ones, like the port number or checkpoint_segments.

Basically, I think that we need to push user-level control of this process down beyond the directory scanning code (that is pretty standard), and allow them to call an arbitrary program to transfer the logs. My idea is that the pitr_transfer program will get $1=WAL file name and $2=pitr_location, and the program can use those arguments to do the transfer. We can even put a pitr_transfer.sample program in share and document $1 and $2.

-- Bruce Momjian  http://candle.pha.pa.us
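A sketch of how the archiver subprocess might invoke the proposed pitr_transfer program with $1 = WAL file name and $2 = pitr_location. The GUC names come from Bruce's proposal above and nothing here is implemented; a real version would report failures properly rather than just returning -1.

#include <stdio.h>
#include <stdlib.h>

static int
run_pitr_transfer(const char *pitr_transfer,   /* e.g. "cp" or "scp" */
                  const char *wal_path,        /* $1: full path of the WAL file */
                  const char *pitr_location)   /* $2: destination dir or host:/dir */
{
    char cmd[2048];

    /* e.g. "scp .../pg_xlog/00000001000000C6 backup@host:/wal" */
    snprintf(cmd, sizeof(cmd), "%s %s %s", pitr_transfer, wal_path, pitr_location);

    return system(cmd) == 0 ? 0 : -1;
}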
On Fri, 2004-04-30 at 04:02, Bruce Momjian wrote:
> Let me also add that I am not terribly worried about having the feature
> to restore to an arbitrary point in time for 7.5. I would much rather
> have a good PITR solution that works cleanly in 7.5 and add that in 7.6,
> than to have restore to an arbitrary point but have a strained
> implementation that we have to revisit for 7.6.

Interesting thought - I see now your priorities. Will read and digest over the next few days.

Thanks for your help and attention,

Best regards, Simon Riggs
On Fri, 2004-04-30 at 04:02, Bruce Momjian wrote:
> OK, I have finalized my thinking on this.
>
> We both agree that a pg_arch client-side program certainly works for
> PITR logging. The big question in my mind is whether a client-side
> program is what we want to use long-term...
> [...]
> Basically, I think that we need to push user-level control of this
> process down beyond the directory scanning code (that is pretty
> standard), and allow them to call an arbitrary program to transfer the
> logs.

...Bruce and I have just discussed this in some detail and reached a good understanding of the design proposals as a whole. It looks like all of this can happen in the next few weeks, with a worst-case time estimate of mid-June. TGFT!

I'll write this up and post it shortly, with a rough roadmap for further development of recovery-related features.

Best Regards, Simon Riggs
2nd Quadrant