Thread: Re: [HACKERS] Point in Time Recovery

Re: [HACKERS] Point in Time Recovery

From
Simon Riggs
Date:
PITR Patch v5_1 just posted has Point in Time Recovery working....

Still some rough edges....but we really need some testers now to give
this a try and let me know what you think.

Klaus Naumann and Mark Wong are the only [non-committers] to have tried
to run the code (and let me know about it), so please have a look at
[PATCHES] and try it out.

Many thanks,

Simon Riggs


Re: [HACKERS] Point in Time Recovery

From
Christopher Kings-Lynne
Date:
Can you give us some suggestions of what kind of stuff to test?  Is
there a way we can artificially kill the backend in all sorts of nasty
spots to see if recovery works?  Does kill -9 simulate a 'power off'?

Chris


Re: [HACKERS] Point in Time Recovery

From
Simon Riggs
Date:
On Wed, 2004-07-14 at 03:31, Christopher Kings-Lynne wrote:
> Can you give us some suggestions of what kind of stuff to test?  Is
> there a way we can artificially kill the backend in all sorts of nasty
> spots to see if recovery works?  Does kill -9 simulate a 'power off'?
>

I was hoping some fiendish plans would be presented to me...

But please start with "this feels like typical usage" and we'll go from
there...the important thing is to try the first one.

I've not done power off tests, yet. They need to be done just to
check...actually you don't need to do this to test PITR...

We need exhaustive tests of...
- power off
- scp and cross network copies
- all the permuted recovery options
- archive_mode = off (i.e. current behaviour)
- deliberately incorrectly set options (idiot-proof testing)

I'd love some help assembling a test document with numbered tests...

Best regards, Simon Riggs


Re: [HACKERS] Point in Time Recovery

From
Tom Lane
Date:
Simon Riggs <simon@2ndquadrant.com> writes:
> I've not done power off tests, yet. They need to be done just to
> check...actually you don't need to do this to test PITR...

I agree, power off is not really the point here.  What we need to check
into is (a) the mechanics of archiving WAL segments and (b) the
process of restoring given a backup and a bunch of WAL segments.

            regards, tom lane

Re: [HACKERS] Point in Time Recovery

From
markw@osdl.org
Date:
On 14 Jul, Simon Riggs wrote:
> PITR Patch v5_1 just posted has Point in Time Recovery working....
>
> Still some rough edges....but we really need some testers now to give
> this a try and let me know what you think.
>
> Klaus Naumann and Mark Wong are the only [non-committers] to have tried
> to run the code (and let me know about it), so please have a look at
> [PATCHES] and try it out.
>
> Many thanks,
>
> Simon Riggs

Simon,

I just tried applying the v5_1 patch against the cvs tip today and got a
couple of rejections.  I'll copy the patch output here.  Let me know if
you want to see the reject files or anything else:

$ patch -p0 < ../../../pitr-v5_1.diff
patching file backend/access/nbtree/nbtsort.c
Hunk #2 FAILED at 221.
1 out of 2 hunks FAILED -- saving rejects to file backend/access/nbtree/nbtsort.c.rej
patching file backend/access/transam/xlog.c
Hunk #11 FAILED at 1802.
Hunk #15 FAILED at 2152.
Hunk #16 FAILED at 2202.
Hunk #21 FAILED at 3450.
Hunk #23 FAILED at 3539.
Hunk #25 FAILED at 3582.
Hunk #26 FAILED at 3833.
Hunk #27 succeeded at 3883 with fuzz 2.
Hunk #28 FAILED at 4446.
Hunk #29 succeeded at 4470 with fuzz 2.
8 out of 29 hunks FAILED -- saving rejects to file backend/access/transam/xlog.c.rej
patching file backend/postmaster/Makefile
patching file backend/postmaster/postmaster.c
Hunk #3 succeeded at 1218 with fuzz 2 (offset 70 lines).
Hunk #4 succeeded at 1827 (offset 70 lines).
Hunk #5 succeeded at 1874 (offset 70 lines).
Hunk #6 succeeded at 1894 (offset 70 lines).
Hunk #7 FAILED at 1985.
Hunk #8 succeeded at 2039 (offset 70 lines).
Hunk #9 succeeded at 2236 (offset 70 lines).
Hunk #10 succeeded at 2996 with fuzz 2 (offset 70 lines).
1 out of 10 hunks FAILED -- saving rejects to file backend/postmaster/postmaster.c.rej
patching file backend/storage/smgr/md.c
Hunk #1 succeeded at 162 with fuzz 2.
patching file backend/utils/misc/guc.c
Hunk #1 succeeded at 342 (offset 9 lines).
Hunk #2 succeeded at 1387 (offset 9 lines).
patching file backend/utils/misc/postgresql.conf.sample
Hunk #1 succeeded at 113 (offset 10 lines).
patching file bin/initdb/initdb.c
patching file include/access/xlog.h
patching file include/storage/pmsignal.h


Re: [HACKERS] Point in Time Recovery

From
Simon Riggs
Date:
On Wed, 2004-07-14 at 16:55, markw@osdl.org wrote:
> On 14 Jul, Simon Riggs wrote:
> > PITR Patch v5_1 just posted has Point in Time Recovery working....
> >
> > Still some rough edges....but we really need some testers now to give
> > this a try and let me know what you think.
> >
> > Klaus Naumann and Mark Wong are the only [non-committers] to have tried
> > to run the code (and let me know about it), so please have a look at
> > [PATCHES] and try it out.
> >

> I just tried applying the v5_1 patch against the cvs tip today and got a
> couple of rejections.  I'll copy the patch output here.  Let me know if
> you want to see the reject files or anything else:
>

I'm on it. Sorry 'bout that all - midnight fingers.


Re: [HACKERS] Point in Time Recovery

From
Mark Kirkwood
Date:
I noticed that compiling with the 5_1 patch applied fails because
XLOG_archive_dir was removed from xlog.c, but
src/backend/commands/tablecmds.c still uses it.

I did the following to tablecmds.c :

5408c5408
<               extern char XLOG_archive_dir[];
---
>               extern char *XLogArchiveDest;
5410c5410
<               use_wal = XLOG_archive_dir[0] && !rel->rd_istemp;
---
>               use_wal = XLogArchiveDest[0] && !rel->rd_istemp;


Now I have to see if I have broken it with this change :-)

regards

Mark


Re: [HACKERS] Point in Time Recovery

From
Simon Riggs
Date:
On Thu, 2004-07-15 at 02:43, Mark Kirkwood wrote:
> I noticed that compiling with 5_1 patch applied fails due to
> XLOG_archive_dir being removed from xlog.c , but
> src/backend/commands/tablecmds.c still uses it.
>
> I did the following to tablecmds.c :
>
> 5408c5408
> <               extern char XLOG_archive_dir[];
> ---
>  >               extern char *XLogArchiveDest;
> 5410c5410
> <               use_wal = XLOG_archive_dir[0] && !rel->rd_istemp;
> ---
>  >               use_wal = XLogArchiveDest[0] && !rel->rd_istemp;
>
>

Yes, I discovered that myself.

The fix is included in pitr_v5_2.patch...

Your patch follows the right thinking and looks like it would have
worked...
- XLogArchiveMode carries the main bool value for mode on/off
- XLogArchiveDest might also be used, though best to use the mode

Thanks for looking through the code...

Best Regards, Simon Riggs


Re: [HACKERS] Point in Time Recovery

From
Gaetano Mendola
Date:
Simon Riggs wrote:

> On Wed, 2004-07-14 at 03:31, Christopher Kings-Lynne wrote:
>
>>Can you give us some suggestions of what kind of stuff to test?  Is
>>there a way we can artificially kill the backend in all sorts of nasty
>>spots to see if recovery works?  Does kill -9 simulate a 'power off'?
>>
>
>
> I was hoping some fiendish plans would be presented to me...
>
> But please start with "this feels like typical usage" and we'll go from
> there...the important thing is to try the first one.
>
> I've not done power off tests, yet. They need to be done just to
> check...actually you don't need to do this to test PITR...
>
> We need exhaustive tests of...
> - power off
> - scp and cross network copies
> - all the permuted recovery options
> - archive_mode = off (i.e. current behaviour)
> - deliberately incorrectly set options (idiot-proof testing)

If you also write up how to perform these tests, that would help show
which problems PITR addresses. I know it addresses a power off, but how
would I recover from one?


Regards
Gaetano Mendola






Re: [HACKERS] Point in Time Recovery

From
Mark Kirkwood
Date:
Here is one for the 'idiot proof' category:

1) initdb and set archive_command
2) shutdown
3) do a backup
4) startup and run some transactions
5) shutdown and remove PGDATA
6) restore backup
7) startup

Obviously this does not work as the backup is performed with the
database shutdown.

This got me wondering for 2 reasons:

1) Some alternative database servers *require* a procedure like this to
enable their version of PITR - so the potential foot-gun thing is there.

2) Is it possible to make the recovery kick in even though pg_control
says the database state is shutdown?
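
Steps 2 to 7 above can be sketched as a shell plan. By default the helper
only prints the commands (DRY_RUN=1), so it can be reviewed without touching
a cluster. The names run and cold_backup_test are hypothetical, step 1
(initdb plus setting archive_command) is assumed to be done already, and the
psql statement merely stands in for "some transactions":

```shell
# Hypothetical helper: print each command (default) or execute it (DRY_RUN=0).
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Sketch of the cold-backup test; assumes step 1 has already been done.
cold_backup_test() {
    pgdata=$1    # data directory of the test cluster
    backup=$2    # tar file that will hold the cold backup

    run pg_ctl -D "$pgdata" stop                     # step 2: shutdown
    run tar -cf "$backup" -C "$pgdata" .             # step 3: do a backup
    run pg_ctl -D "$pgdata" -w start                 # step 4: startup ...
    run psql -d template1 -c "CREATE TABLE pitr_test (i int)"   # ... and some work
    run pg_ctl -D "$pgdata" stop                     # step 5: shutdown ...
    run rm -rf "$pgdata"                             # ... and remove PGDATA
    run mkdir -p "$pgdata"                           # step 6: restore backup
    run chmod 700 "$pgdata"                          # data directory must not be group/world accessible
    run tar -xf "$backup" -C "$pgdata"
    run pg_ctl -D "$pgdata" -w start                 # step 7: startup
}
```

Calling cold_backup_test /path/to/pgdata /path/to/cold.tar prints the plan;
setting DRY_RUN=0 would actually run it, including the rm -rf, so that is
only for a throwaway test cluster.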


Re: [HACKERS] Point in Time Recovery

From
Tom Lane
Date:
Mark Kirkwood <markir@coretech.co.nz> writes:
> Here is one for the 'idiot proof' category:
> 1) initdb and set archive_command
> 2) shutdown
> 3) do a backup
> 4) startup and run some transactions
> 5) shutdown and remove PGDATA
> 6) restore backup
> 7) startup

> Obviously this does not work as the backup is performed with the
> database shutdown.

Huh?  It works fine.

The bit you may be missing is that if you blow away $PGDATA including
pg_xlog/, you won't be able to recover past whatever you have in your WAL
archive area.  The archive is certainly not going to include the current
partially-filled WAL segment, and it might be missing a few earlier
segments if the archival process isn't speedy.  So you need to keep
those recent segments in pg_xlog/ if you want to recover to current time
or near-current time.

I'm becoming more and more convinced that we should bite the bullet and
move pg_xlog/ to someplace that is not under $PGDATA.  It would just
make things a whole lot more reliable, both for backup and to deal with
scenarios like yours above.  I tried to talk Bruce into this on the
phone the other day, but he wouldn't bite.  I still think it's a good
idea though.  It would
(1) eliminate the problem that a tar backup of $PGDATA would restore
    stale copies of xlog segments, because the tar wouldn't include
    pg_xlog in the first place.
(2) eliminate the problem that a naive "rm -rf $PGDATA" would blow away
    xlog segments that you still need.

A possible compromise is that we should strongly suggest that pg_xlog
be pushed out to another place and symlinked if you are going to use
WAL archiving.  That's already considered good practice for performance
if you have a separate disk spindle to put WAL on.  It'll just have
to be good practice for WAL archiving too.
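
A minimal sketch of that compromise, assuming the server is stopped and
that the target (a hypothetical location on another disk) is given as an
absolute path that does not exist yet. The function name relocate_xlog is
an illustration only, not part of PostgreSQL:

```shell
# Move pg_xlog out of the data directory and leave a symlink behind.
# Assumptions: server is stopped; $2 is an absolute path whose parent exists.
relocate_xlog() {
    pgdata=$1      # data directory ($PGDATA)
    new_xlog=$2    # new home for the WAL segments, outside $PGDATA

    # Only act on a real directory, not on an existing symlink.
    [ -d "$pgdata/pg_xlog" ] && [ ! -L "$pgdata/pg_xlog" ] || return 1
    # Refuse to clobber anything already at the target.
    [ ! -e "$new_xlog" ] || return 1

    mv "$pgdata/pg_xlog" "$new_xlog" || return 1
    ln -s "$new_xlog" "$pgdata/pg_xlog"
}
```

With pg_xlog relocated this way, tar stores the symlink itself unless told
to follow links (-h), and rm -rf $PGDATA removes only the link, not the
segments behind it.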

            regards, tom lane

Re: [HACKERS] Point in Time Recovery

From
Bruce Momjian
Date:
I think we should push the partially complete WAL file to the archive
location before shutdown.  I talked to you or Jan about it and you (or
Jan) wouldn't bite either, but I think when someone shuts down, they
assume they have things fully archived and can recover fully with a
previous backup and the archive files.

When you are running and finally fill up the WAL file it would then
overwrite the one in the archive but I think that is OK.  Maybe we would
need to give it a special file extension so we only use it when we don't
have a full version.


--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Re: [HACKERS] Point in Time Recovery

From
Mark Kirkwood
Date:
Well that is interesting :-)


Here is what I am doing on the removal front (I am keeping pg_xlog *now*):

$ cd $PGDATA
$ pg_ctl stop
$ ls|grep -v pg_xlog|xargs rm -rf

The contents of the archive directory just before recovery starts:

$ ls -l $PGDATA/../7.5-archive
total 49212
-rw-------    1 postgres postgres 16777216 Jul 22 14:59 000000010000000000000000
-rw-------    1 postgres postgres 16777216 Jul 22 14:59 000000010000000000000001
-rw-------    1 postgres postgres 16777216 Jul 22 14:59 000000010000000000000002

But here is recovery startup log:

LOG:  database system was shut down at 2004-07-22 14:58:57 NZST
LOG:  starting archive recovery
LOG:  restore_command = "cp /data1/pgdata/7.5-archive/%f %p"
cp: cannot stat `/data1/pgdata/7.5-archive/00000001.history': No such file or directory
LOG:  restored log file "000000010000000000000000" from archive
LOG:  checkpoint record is at 0/A4D3E8
LOG:  redo record is at 0/A4D3E8; undo record is at 0/0; shutdown TRUE
LOG:  next transaction ID: 496; next OID: 17229
LOG:  archive recovery complete
LOG:  database system is ready

regards

Mark


Re: [HACKERS] Point in Time Recovery

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> I think we should push the partially complete WAL file to the archive
> location before shutdown. ...
> When you are running and finally fill up the WAL file it would then
> overwrite the one in the archive but I think that is OK.

I don't think this can fly at all.  Here are some off-the-top-of-the-head
objections:

1. We don't have the luxury of spending indefinite amounts of time to
do a database shutdown.  Commonly we are under a twenty-second sentence
of death from init.  I don't want to spend the 20 seconds waiting to see
if the archiver will manage to push 16MB onto a slow tape drive.  Also,
if the archiver does fail to push the data in time, it'll likely leave a
broken (partial) xlog file in the archive, which would be really bad
news if the user then relies on that.

2. What if the archiver process entirely fails to push the file?  (Maybe
there's not enough disk space, for instance.)  In normal operation we'll
just retry every so often.  We definitely can't do that during shutdown.

3. You're blithely assuming that the archival process can easily provide
overwrite semantics for multiple pushes of the same xlog filename.  Stop
thinking about "cp to some directory" and start thinking "dump to tape"
or "burn onto CD" or something like that.  We'll be raising the ante
considerably if we require the archive_command to deal with this.

I think the last one is really the most significant issue.  We have to
keep the archiver API as simple as possible.
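
To illustrate objection 3, here is a sketch of what an archive helper has
to do merely to reject a second push of the same file name on a plain file
system. The archive_wal function and the invocation shown in its comment
are assumptions for illustration; tape or CD targets have no cheap
equivalent of the existence check:

```shell
# Hypothetical archive helper, imagined as being called with the WAL path
# (%p) and a destination built from the file name (%f), for example:
#   archive_wal %p /mnt/archive/%f
archive_wal() {
    src=$1     # path of the completed WAL segment
    dest=$2    # final path inside the (assumed) archive directory

    # No overwrite semantics: a second push of the same name is an error.
    if [ -e "$dest" ]; then
        echo "archive_wal: $dest already exists, refusing to overwrite" >&2
        return 1
    fi

    # Copy under a temporary name first, then rename, so that an
    # interrupted copy never shows up under the final name.
    cp "$src" "$dest.tmp" || return 1
    mv "$dest.tmp" "$dest"
}
```

A non-zero return tells the caller the push failed, which fits the "retry
every so often" behaviour described in objection 2.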

            regards, tom lane

Re: [HACKERS] Point in Time Recovery

From
Bruce Momjian
Date:
Agreed, it might not be possible, but your report does point out a
limitation in our implementation --- that a shutdown database contains
more information than a backup and the archive logs.  That is not
intuitive.

In fact, if you shutdown your database and want to reproduce it on
another machine, how do you do it?  Seems you have to copy pg_xlog
directory over to the new machine.

In fact, moving pg_xlog to a new location doesn't make that clear
either.  Seems documentation might be the only way to make this clear.

One idea would be to just push the partial WAL file to the archive on
server shutdown and not reuse it and start with a new WAL file on
startup.  At least for a normal system shutdown this will give us an
archive that contains all the information that is in pg_xlog.



Re: [HACKERS] Point in Time Recovery

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Agreed, it might not be possible, but your report does point out a
> limitation in our implementation --- that a shutdown database contains
> more information than a backup and the archive logs.  That is not
> intuitive.

That's only because you are clinging to the broken assumption that
pg_xlog/ is part of the database, rather than part of the logs.
Separate that out as a distinct entity, and all gets better.

            regards, tom lane

Re: [HACKERS] Point in Time Recovery

From
Bruce Momjian
Date:
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > Agreed, it might not be possible, but your report does point out a
> > limitation in our implementation --- that a shutdown database contains
> > more information than a backup and the archive logs.  That is not
> > intuitive.
>
> That's only because you are clinging to the broken assumption that
> pg_xlog/ is part of the database, rather than part of the logs.
> Separate that out as a distinct entity, and all gets better.

Imagine this.   I stop the server.  I have a tar backup and a copy of
the archive.  I should be able to take them to another machine and
recover the system to the point I stopped.

You are saying I need a copy of pg_xlog directory too, and I need to
remove pg_xlog after I untar the data directory and put the saved
pg_xlog into there before I recover.

Should we create a server-side function that forces all WAL files to the
archive, including partially written ones?  Maybe that fixes the problem
with people deleting pg_xlog before they untar.  You tell them to run
the function before recovery.  If the system can't be started, then it is
possible the WAL files are no good too, not sure.


Re: [HACKERS] Point in Time Recovery

From
Simon Riggs
Date:
On Thu, 2004-07-22 at 04:29, Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > I think we should push the partially complete WAL file to the archive
> > location before shutdown. ...
> > When you are running and finally fill up the WAL file it would then
> > overwrite the one in the archive but I think that is OK.
>
> I don't think this can fly at all.  Here are some off-the-top-of-the-head
> objections:
>
> 1. We don't have the luxury of spending indefinite amounts of time to
> do a database shutdown.  Commonly we are under a twenty-second sentence
> of death from init.  I don't want to spend the 20 seconds waiting to see
> if the archiver will manage to push 16MB onto a slow tape drive.  Also,
> if the archiver does fail to push the data in time, it'll likely leave a
> broken (partial) xlog file in the archive, which would be really bad
> news if the user then relies on that.
>
> 2. What if the archiver process entirely fails to push the file?  (Maybe
> there's not enough disk space, for instance.)  In normal operation we'll
> just retry every so often.  We definitely can't do that during shutdown.
>
> 3. You're blithely assuming that the archival process can easily provide
> overwrite semantics for multiple pushes of the same xlog filename.  Stop
> thinking about "cp to some directory" and start thinking "dump to tape"
> or "burn onto CD" or something like that.  We'll be raising the ante
> considerably if we require the archive_command to deal with this.
>
> I think the last one is really the most significant issue.  We have to
> keep the archiver API as simple as possible.
>

Not read whole chain of conversation...but this idea came up before and
was rejected then. I agree with the 3 objections to that thought above.

There's already enough copies of full xlogs around to worry about.

If you need more granularity, reduce size of xlog files....

(Tom, SUID would be the correct timeline id in that situation? )

More later, Simon Riggs


Re: [HACKERS] Point in Time Recovery

From
Tom Lane
Date:
Mark Kirkwood <markir@coretech.co.nz> writes:
> 2) Is it possible to make the recovery kick in even though pg_control
> says the database state is shutdown?

Yeah, I think you are right: presence of recovery.conf should force a
WAL scan even if pg_control claims it's shut down.  Fix committed.
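
For reference, a sketch of creating the recovery.conf whose presence
triggers this behaviour. The restore_command form is taken from the
recovery log earlier in the thread; write_recovery_conf is a hypothetical
helper name, and the archive directory is whatever your archive_command
writes to:

```shell
# Write a one-line recovery.conf into a restored data directory.
write_recovery_conf() {
    pgdata=$1         # restored data directory
    archive_dir=$2    # assumed: directory holding the archived WAL segments

    # %%f and %%p are printf escapes that emit the literal %f and %p
    # placeholders (WAL file name and target path) into the file.
    printf "restore_command = 'cp %s/%%f %%p'\n" "$archive_dir" \
        > "$pgdata/recovery.conf"
}
```

For example, write_recovery_conf "$PGDATA" /data1/pgdata/7.5-archive yields
the same restore_command that appears in the log above.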

            regards, tom lane

Re: [HACKERS] Point in Time Recovery

From
Mark Kirkwood
Date:
Excellent - Just updated and it is all good!

This change makes the whole "how do I do my backup" business nice and
basic - which is the right way IMHO.

regards

Mark



Re: [HACKERS] Point in Time Recovery

From
Simon Riggs
Date:
On Thu, 2004-07-22 at 21:19, Tom Lane wrote:
> Mark Kirkwood <markir@coretech.co.nz> writes:
> > 2) Is it possible to make the recovery kick in even though pg_control
> > says the database state is shutdown?
>
> Yeah, I think you are right: presence of recovery.conf should force a
> WAL scan even if pg_control claims it's shut down.  Fix committed.
>

This *should* be possible but I haven't tested it.

There is a code path on secondary checkpoints that indicates that crash
recovery can occur even when the database was shutdown, since the code
forces recovery whether it was or not. On that basis, this may work, but
is yet untested. I didn't mention this because it might interfere with
getting hot backup to work...

Best Regards, Simon Riggs


Re: [HACKERS] Point in Time Recovery

From
Mark Kirkwood
Date:
I have tested the "cold" backup - and retested my previous scenarios
using "hot" backup (just to be sure). They all work AFAICS!

cheers

Mark


Re: [HACKERS] Point in Time Recovery

From
Tom Lane
Date:
Simon Riggs <simon@2ndquadrant.com> writes:
> On Thu, 2004-07-22 at 21:19, Tom Lane wrote:
>> Yeah, I think you are right: presence of recovery.conf should force a
>> WAL scan even if pg_control claims it's shut down.  Fix committed.

> This *should* be possible but I haven't tested it.

I did.

It's really not risky.  The fact that the code doesn't look beyond the
checkpoint record when things seem to be kosher is just a speed
optimization (and probably a rather pointless one...)  We have got to be
able to detect the end of WAL in any case, so we'd just find there are
no more records and stop.

            regards, tom lane

Re: [HACKERS] Point in Time Recovery

From
Simon Riggs
Date:
On Fri, 2004-07-23 at 01:05, Mark Kirkwood wrote:
> I have tested the "cold" backup - and retested my previous scenarios
> using "hot" backup (just to be sure) . They all work AFAICS!

> cheers

Yes, I'll drink to that! Thanks for your help.

Best Regards, Simon Riggs





Re: [HACKERS] Point in Time Recovery

From
Bruce Momjian
Date:
Here is another open PITR issue that I think will have to be addressed
in 7.6.  If you do a critical transaction, but do nothing else for eight
hours, that critical transaction hasn't been archived yet.  It is still
sitting in pg_xlog until the WAL file fills.

I think we will need to document this behavior and address it in some
way in 7.6.  We can't assume that we can send multiple copies of pg_xlog
to the archive (partial and full ones) because we might be going to a
tape drive.  However, this is a non-intuitive behavior of our archiver.
We might need to tell people to archive the most recent WAL file every
minute to some other location or something.
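
One way to sketch that stopgap, assuming a plain directory as the side
location. copy_latest_wal and the .partial suffix are hypothetical names.
The copy is taken while the server may still be writing the segment, so it
is only a best-effort safety net; how (or whether) such a copy can be fed
back into recovery is not covered in this thread:

```shell
# Copy the most recently modified WAL segment to a side location.
# Assumption: the newest 24-hex-digit file in pg_xlog is the segment
# currently being written.
copy_latest_wal() {
    xlog_dir=$1    # e.g. $PGDATA/pg_xlog
    dest_dir=$2    # side location, separate from the real archive

    # Segment names are 24 uppercase hex digits, so filtering on that
    # pattern skips subdirectories and any other files.
    latest=$(ls -t "$xlog_dir" | grep -E '^[0-9A-F]{24}$' | head -n 1)
    [ -n "$latest" ] || return 1

    mkdir -p "$dest_dir" || return 1
    # Copy to a temporary name, then rename over the previous snapshot.
    cp "$xlog_dir/$latest" "$dest_dir/$latest.partial.tmp" &&
        mv "$dest_dir/$latest.partial.tmp" "$dest_dir/$latest.partial"
}
```

Run once a minute (from cron, say), this keeps at most a minute of
otherwise unarchived WAL at risk, at the price of repeatedly copying a
16MB file.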



Re: [HACKERS] Point in Time Recovery

From
JEDIDIAH
Date:
On 2004-07-28, Bruce Momjian <pgman@candle.pha.pa.us> wrote:
>
> Here is another open PITR issue that I think will have to be addressed
> in 7.6.  If you do a critical transaction, but do nothing else for eight
> hours, that critical transaction hasn't been archived yet.  It is still
> sitting in pg_xlog until the WAL file fills.
>
> I think we will need to document this behavior and address it in some
> way in 7.6.  We can't assume that we can send multiple copies of pg_xlog
> to the archive (partial and full ones) because we might be going to a

    If a particular transaction is so important that it absolutely
positively needs to be archived offline for PITR, then why not just mark
it that way or allow for the application to trigger archival of this
critical REDO?

> tape drive.  However, this is a non-intuitive behavior of our archiver.
> We might need to tell people to archive the most recent WAL file every
> minute to some other location or something.

[deletia]

--
        Negligence will never equal intent, no matter how you
attempt to distort reality to do so. This is what separates         |||
the real butchers from average Joes (or Fritzes) caught up in      / | \
events not in their control.