Re: Problem with PITR recovery - Mailing list pgsql-hackers

From Bruce Momjian
Subject Re: Problem with PITR recovery
Msg-id 200504170306.j3H36Hr01998@candle.pha.pa.us
In response to Re: Problem with PITR recovery  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers

Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > The problem is that we don't archive the partially written xlog file,
> > and in this case that xlog file contains the information needed to make
> > the tar file consistent.
>
> > Is this a known problem?  Do we document this?  If so, I can't find it.
>
> Yes, and yes.  You did not follow the procedure:
>
> http://www.postgresql.org/docs/8.0/static/backup-online.html#BACKUP-PITR-RECOVERY
>
> In particular, step 2 says:
>
> : ... you need at the least to copy the contents of the pg_xlog
> : subdirectory of the cluster data directory, as it may contain logs which
> : were not archived before the system went down.
>
> Possibly this needs to be highlighted a little better.

I figured that part of the goal of PITR was that you could recover from
just the tar backup and archived WAL files --- using the pg_xlog
contents is nice, but not something we can require.

I understood that losing the last, unarchived WAL file would cost us
the most recent transactions, but not that it would make the tar backup
unusable.

It would be nice if we could force a new WAL file on pg_stop_backup()
and archive the WAL file needed to match the tar file.  How hard would
that be?

I see in the docs:

    To make use of this backup, you will need to keep around all the WAL
    segment files generated at or after the starting time of the backup. To
    aid you in doing this, the pg_stop_backup function creates a backup
    history file that is immediately stored into the WAL archive area. This
    file is named after the first WAL segment file that you need to have to
    make use of the backup. For example, if the starting WAL file is
    0000000100001234000055CD the backup history file will be named something
    like 0000000100001234000055CD.007C9330.backup. (The second part of this
    file name stands for an exact position within the WAL file, and can
    ordinarily be ignored.) Once you have safely archived the backup dump
    file, you can delete all archived WAL segments with names numerically
    preceding this one.

I am not clear on what the "backup dump file" is.  I assume it means
0000000100001234000055CD, which is called a "WAL segment file" above, so
I have renamed the phrase to match that terminology.  Patch attached and
applied.
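
To illustrate the naming rule in the quoted paragraph, here is a minimal
Python sketch; it is not code from the PostgreSQL tree.  The function names
starting_segment and removable_segments are made up for illustration.  It
assumes archived segment names are 24 uppercase hex digits, and that
"numerically preceding" means comparing the whole name as one hex number,
timeline ID included.

```python
import re

# Assumed naming: a WAL segment is 24 hex digits; a backup history file is
# "<starting segment>.<8 hex digit offset>.backup".
HISTORY_RE = re.compile(r"^([0-9A-F]{24})\.([0-9A-F]{8})\.backup$")
SEGMENT_RE = re.compile(r"^[0-9A-F]{24}$")


def starting_segment(history_name):
    """Return the WAL segment name embedded in a backup history file name."""
    match = HISTORY_RE.match(history_name)
    if match is None:
        raise ValueError("not a backup history file name: %r" % history_name)
    return match.group(1)


def removable_segments(archived_names, history_name):
    """List archived WAL segments whose names numerically precede the
    starting segment of the backup; anything else must be kept."""
    start = int(starting_segment(history_name), 16)
    return sorted(
        name
        for name in archived_names
        if SEGMENT_RE.match(name) and int(name, 16) < start
    )


if __name__ == "__main__":
    archive = [
        "0000000100001234000055CC",
        "0000000100001234000055CD",
        "0000000100001234000055CE",
        "0000000100001234000055CD.007C9330.backup",
    ]
    history = "0000000100001234000055CD.007C9330.backup"
    print(starting_segment(history))
    print(removable_segments(archive, history))
```

With the example names from the docs, only 0000000100001234000055CC would be
reported as removable; the starting segment itself and everything after it
are kept.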

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Index: doc/src/sgml/backup.sgml
===================================================================
RCS file: /cvsroot/pgsql/doc/src/sgml/backup.sgml,v
retrieving revision 2.60
diff -c -c -r2.60 backup.sgml
*** doc/src/sgml/backup.sgml    23 Mar 2005 19:38:53 -0000    2.60
--- doc/src/sgml/backup.sgml    17 Apr 2005 03:04:35 -0000
***************
*** 733,740 ****
      the backup history file will be named something like
      <literal>0000000100001234000055CD.007C9330.backup</>.  (The second part of
      this file name stands for an exact position within the WAL file, and can
!     ordinarily be ignored.)  Once you have safely archived the backup dump
!     file, you can delete all archived WAL segments with names numerically
      preceding this one.  The backup history file is just a small text file.
      It contains the label string you gave to <function>pg_start_backup</>, as
      well as the starting and ending times of the backup.  If you used the
--- 733,740 ----
      the backup history file will be named something like
      <literal>0000000100001234000055CD.007C9330.backup</>.  (The second part of
      this file name stands for an exact position within the WAL file, and can
!     ordinarily be ignored.)  Once you have safely archived this WAL
!     segment file, you can delete all archived WAL segments with names numerically
      preceding this one.  The backup history file is just a small text file.
      It contains the label string you gave to <function>pg_start_backup</>, as
      well as the starting and ending times of the backup.  If you used the
