Re: incremental backups - Mailing list pgsql-general

From Bruce Momjian
Subject Re: incremental backups
Date
Msg-id 200602241412.k1OECC121000@candle.pha.pa.us
Whole thread Raw
In response to Re: incremental backups  (Rick Gigger <rick@alpinenetworking.com>)
List pgsql-general
I have applied the following patch adds to the paragraph after the one
you quoted below.  I just added mention that the start/stop time _and_
wal file names are in the history file.

---------------------------------------------------------------------------

Rick Gigger wrote:
> I've started writing some scripts to set up incremental backup to my
> taste. I just discovered something and thought I would revisit this
> thread briefly.
>
> When you go to restore from a give base file system backup you need
> to know  the start WAL file that you need and the end WAL file that
> you need.  (You will most likely have many files beyond the "stop"
> file but you must have at least up to the "stop" file for the restore
> to work.
>
> Now if you try to restore but you don't have the "stop" WAL file
> postges will die on recovery and tell you that it can't recover
> forward far enough to make the backup consistent.  But I wanted to
> know the easiest way to verify if you indeed had the necessary files
> without having to actually do a restore and have postgres tell you if
> it succeeded or not.
>
> Perhaps no one understood me because the answer I was looking for was
> too obvious.   But what I really wanted to know was how do you know
> what the "stop" file is.  It informs you of the start file all over
> the place when doing the base backups but I thought I would have to
> do something clever to figure out the stop file on my own.  But
> luckily I don't.  The backup history file has too lines like this:
>
> START WAL LOCATION: 0/88F21D0C (file 000000010000000000000088)
> STOP WAL LOCATION: 0/88F21D50 (file 000000010000000000000088)
>
> It was clear to me from the docs how to figure out what the start
> file is but the end file was a mystery until I actually created a
> backup history file and looked in it.  The only place I can find in
> the Online Backup instructions where this is indicated is this
> paragraph:
>
> "To make use of this backup, you will need to keep around all the WAL
> segment files generated during and after the file system backup. To
> aid you in doing this, the pg_stop_backup function creates a backup
> history file that is immediately stored into the WAL archive area.
> This file is named after the first WAL segment file that you need to
> have to make use of the backup. For example, if the starting WAL file
> is 0000000100001234000055CD the backup history file will be named
> something like 0000000100001234000055CD.007C9330.backup. (The second
> number in the file name stands for an exact position within the WAL
> file, and can ordinarily be ignored.) Once you have safely archived
> the file system backup and the WAL segment files used during the
> backup (as specified in the backup history file), all archived WAL
> segments with names numerically less are no longer needed to recover
> the file system backup and may be deleted. However, you should
> consider keeping several backup sets to be absolutely certain that
> you can recover your data. Keep in mind that only completed WAL
> segment files are archived, so there will be delay between running
> pg_stop_backup and the archiving of all WAL segment files needed to
> make the file system backup consistent."
>
> Reading it now it seems obvious that the file would contain not only
> the start WAL file but also the Stop WAL file but when going over the
> directions the first time it did not pick up on it.  And it left me
> thinking I would have to use some hack to figure it out if I ever
> wanted to test a base backup.  It would have been less confusing to
> me if it just said right in the docs: "The backup history file
> contains both the start WAL file name and the Stop WAL file name" or
> something like that just to make it perfectly clear.
>
> Now that I know this I can extract that filename from the backup
> history file, check to see if it has been archived and copy it
> somewhere if it hasn't been archived yet.  I'm pretty sure that I can
> assume that all files before the stop file have already been
> archived.  So once I backup the stop file I can be positive that the
> base backup I just made will be valid when I try to restore from it.
>
> This lessens my need for the "get current WAL file" functionality in
> this context.  It will still be nice to have in the context of
> backing it up every five minutes or so in case a WAL file takes a
> long time to fill up.
>
> Anyway I would have been less confused if the docs had made it more
> clear that the name of the stop wal file was in the backup history file.
>
> Rick
>
>
> On Jan 30, 2006, at 10:20 PM, Bruce Momjian wrote:
>
> >
> > Yes, I think copying it while it is being written is safe.
> >
> > ----------------------------------------------------------------------
> > -----
> >
> > Rick Gigger wrote:
> >> Yes!  Thanks you!  That is exactly what I was looking for.
> >>
> >> So I take it that this means that it is save to copy the current in
> >> use WAL file even as it is being written to?
> >> And it also means that if I copy it with my physical file system
> >> backup then I should have the last file that I need to restore from
> >> that physical backup?
> >>
> >> So if I write my own backup_latest_WAL_file.sh script (I think I
> >> found one on the list from Simon Riggs) then I can do what I need to
> >> do before those todo items get done?  Or will I need to wait till
> >> postgres gives me the ability to safely copy the file?
> >>
> >>
> >>
> >> On Jan 30, 2006, at 11:13 AM, Bruce Momjian wrote:
> >>
> >>>
> >>> Unfortunately, I think I understand your question.  :-)
> >>>
> >>> These TODO items are what you need:
> >>>
> >>>     * Point-In-Time Recovery (PITR)
> >>>
> >>>           o Allow point-in-time recovery to archive partially filled
> >>>             write-ahead logs [pitr]
> >>>
> >>>             Currently only full WAL files are archived. This means
> >>> that the
> >>>             most recent transactions aren't available for recovery
> >>> in case
> >>>             of a disk failure. This could be triggered by a user
> >>> command or
> >>>             a timer.
> >>>
> >>>           o Automatically force archiving of partially-filled WAL
> >>> files when
> >>>             pg_stop_backup() is called or the server is stopped
> >>>
> >>>             Doing this will allow administrators to know more
> >>> easily when
> >>>             the archive contains all the files needed for point-in-
> >>> time
> >>>             recovery.
> >>>
> >>> I will try to push to have them done for 8.2.
> >>>
> >>> --------------------------------------------------------------------
> >>> --
> >>> -----
> >>>
> >>> Rick Gigger wrote:
> >>>> I guess my email wasn't all that clear.  I will try to rephrase.  I
> >>>> am moving from using the old style pg_dump for backups to using
> >>>> incrementals and want to make sure I understand the process
> >>>> before I
> >>>> go about writing a bunch of scritps.
> >>>>
> >>>> To me setting up incremental backup consists of the following
> >>>> components:
> >>>>
> >>>> 1) Setting up the WAL archiving.  This one is trivial.
> >>>> 2) Doing physical dumps of the $PGDATA directory.  This one is once
> >>>> again trivial.
> >>>> 3) Knowing which physical dumps are Good and Not Good.  For a given
> >>>> physical dump D there is are WAL archive files Dstart and Dend for
> >>>> which you much have Dstart and Dend and all files in between.
> >>>> If you
> >>>> have all those files then the physical dump is Good.  If you don't
> >>>> have them then the dump is worthless to you.
> >>>> 4) Knowing which dumps and which archive files can be deleted.
> >>>> This
> >>>> depends on a number of factors.
> >>>>     a) How far back do you want to be able to do PITR
> >>>>     b) How much space do you have / want to use for PITR
> >>>>     c) Which physical dumps are Good and which are Not Good. (see #3)
> >>>>
> >>>> Now I think I have a pretty good plan here except for #3 (and so #4
> >>>> then also suffers).
> >>>>
> >>>> Just as an example lets say I'm not concerned so much with PITR
> >>>> as I
> >>>> am recovering from a db crash. I've got all the backups files saved
> >>>> to my backup db server so I can failover to it if my primary db
> >>>> server dies.  I just want to make sure I've got one physical dump
> >>>> that is good.  (This is not my actual situation but it
> >>>> illustrated my
> >>>> point better.)
> >>>>
> >>>> Now when I do a physical dump it is not a Good dump.  That is I
> >>>> don't
> >>>> have the end archive file necessary to recover from that physical
> >>>> dump.  That is to say that  when I call pg_backup_start() then copy
> >>>> $PGDATA then call pg_backup_stop() postgres might be on say WAL
> >>>> archive file #5.  Once the physical dump is completed WAL archive
> >>>> file #5 hasn't been archived yet.  I only have up to #4.  So if I
> >>>> delete my old physical dumps and all I've got is this most
> >>>> recent one
> >>>> and my database crashes before #5 gets archived then I am hosed.  I
> >>>> have no good physical backups to start from.
> >>>>
> >>>> My main question is about the best way to figure out when a
> >>>> physical
> >>>> dump is Good.
> >>>>
> >>>> One strategy is to always keep around lots of physical dumps.
> >>>> If you
> >>>> keep around 100 dumps you can be pretty sure that in the space of
> >>>> time that those physical dumps take place that at least one WAL
> >>>> file
> >>>> was archived.  In fact if you keep 2 physical dumps you can be
> >>>> fairly
> >>>> certain of this.  If not then you really need to space our your
> >>>> dumps
> >>>> more.
> >>>>
> >>>> Is this making sense at this point?
> >>>>
> >>>> The problem is that the WAL archiving is triggered by postgres and
> >>>> the rate at which the db is updated.  The physical dumps are
> >>>> triggered by cron and on a purely time based schedule.  So in
> >>>> theory
> >>>> if you had the physical dumps happening once a day but for some odd
> >>>> reason no one updated the database for 4 days then all of a sudden
> >>>> you'd have 2 physical backups and neither of them are good.  If
> >>>> you're db crashes during that time you are hosed.
> >>>>
> >>>> Maybe I am arguing a point that is just stupid because this will
> >>>> never happen in real life.  But in that it is my backups system
> >>>> that
> >>>> I will be using to recover from complete and total disaster I just
> >>>> want to have all my bases covered.
> >>>>
> >>>> So my ideas on how to determine if a physical dump is Good are as
> >>>> follows.
> >>>>
> >>>> 1) When you do the physical backup (after dumping the $PGDATA
> >>>> dir but
> >>>> before calling pg_stop_backup() ) determine the current WAL archive
> >>>> file.  Mark somewhere in the backed up physical dump the last file
> >>>> needed for the dump to be considered good.  Then your deletion
> >>>> scripts can look at the WAL archive files you have and the last one
> >>>> required for the dump to be Good and determine if the dump is
> >>>> Good or
> >>>> not.
> >>>>
> >>>> 2) After doing the physical dump but before calling
> >>>> pg_stop_backup()
> >>>> just copy the current WAL file to the physical dump.  If that file
> >>>> later gets archived then the restore commands overwrites your
> >>>> partially completed one so it doesn't hurt but you know that
> >>>> when you
> >>>> call pg_stop_backup() that that physical dump is good.  (Is it
> >>>> ok to
> >>>> copy the current WAL file while it is still in use?)
> >>>>
> >>>> Is anyone taking one of these or any other precautions to make sure
> >>>> they've got a good physical dump or does everyone just keep a whole
> >>>> bunch of dumps around, and then actually restore the dump to see if
> >>>> it is good and if not go back to a previous dump?
> >>>>
> >>>> I hope that makes more sense.
> >>>>
> >>>> Thanks,
> >>>>
> >>>> Rick
> >>>>
> >>>> On Jan 27, 2006, at 3:33 AM, Richard Huxton wrote:
> >>>>
> >>>>> Rick Gigger wrote:
> >>>>>> Um, no you didn't read my email at all.  I am aware of all of
> >>>>>> that
> >>>>>> and it is clearly outlined in the docs.  My email was about a
> >>>>>> specific detail in the process.  Please read it if you want to
> >>>>>> know what my actual question was.
> >>>>>
> >>>>> I'm not sure your email is quite right as regards the process. You
> >>>>> need:
> >>>>>   1. the filesystem backup
> >>>>>   2. the WAL file indicated in the history-file
> >>>>>   3. all the WAL files later than that
> >>>>> to get up to "now".
> >>>>>
> >>>>> If you don't want to replay up to "now" then you will not need
> >>>>> some
> >>>>> of the more recent WAL files. You can't afford to throw them away
> >>>>> though since you've got a rolling backup system running and the
> >>>>> whole point is so you can recover to any point you like.
> >>>>>
> >>>>> You can however throw away any WAL files older than that indicated
> >>>>> in the history file for your current filesystem-backup. You can
> >>>>> then only restore from that point in time forward.
> >>>>>
> >>>>> There is no "last one" in the WAL set unless you know the time you
> >>>>> want to restore to. Indeed, the "last one" might not be "full" yet
> >>>>> and therefore archived if you want to restore to 10 seconds ago.
> >>>>>
> >>>>> Or am I mis-understanding your email too?
> >>>>>
> >>>>> --
> >>>>>   Richard Huxton
> >>>>>   Archonet Ltd
> >>>>>
> >>>>
> >>>>
> >>>> ---------------------------(end of
> >>>> broadcast)---------------------------
> >>>> TIP 4: Have you searched our list archives?
> >>>>
> >>>>                http://archives.postgresql.org
> >>>>
> >>>
> >>> --
> >>>   Bruce Momjian                        |  http://candle.pha.pa.us
> >>>   pgman@candle.pha.pa.us               |  (610) 359-1001
> >>>   +  If your life is a hard drive,     |  13 Roberts Road
> >>>   +  Christ can be your backup.        |  Newtown Square,
> >>> Pennsylvania 19073
> >>>
> >>
> >>
> >> ---------------------------(end of
> >> broadcast)---------------------------
> >> TIP 4: Have you searched our list archives?
> >>
> >>                http://archives.postgresql.org
> >>
> >
> > --
> >   Bruce Momjian                        |  http://candle.pha.pa.us
> >   pgman@candle.pha.pa.us               |  (610) 359-1001
> >   +  If your life is a hard drive,     |  13 Roberts Road
> >   +  Christ can be your backup.        |  Newtown Square,
> > Pennsylvania 19073
> >
> > ---------------------------(end of
> > broadcast)---------------------------
> > TIP 4: Have you searched our list archives?
> >
> >                http://archives.postgresql.org
> >
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 2: Don't 'kill -9' the postmaster
>

--
  Bruce Momjian   http://candle.pha.pa.us
  SRA OSS, Inc.   http://www.sraoss.com

  + If your life is a hard drive, Christ can be your backup. +
Index: doc/src/sgml/backup.sgml
===================================================================
RCS file: /cvsroot/pgsql/doc/src/sgml/backup.sgml,v
retrieving revision 2.76
diff -c -c -r2.76 backup.sgml
*** doc/src/sgml/backup.sgml    7 Nov 2005 17:36:44 -0000    2.76
--- doc/src/sgml/backup.sgml    24 Feb 2006 14:01:51 -0000
***************
*** 744,755 ****
      <function>pg_stop_backup</> and the archiving of all WAL segment
      files needed to make the file system backup consistent.
     </para>
     <para>
      The backup history file is just a small text file. It contains the
      label string you gave to <function>pg_start_backup</>, as well as
!     the starting and ending times of the backup. If you used the label
!     to identify where the associated dump file is kept, then the
!     archived history file is enough to tell you which dump file to
      restore, should you need to do so.
     </para>

--- 744,756 ----
      <function>pg_stop_backup</> and the archiving of all WAL segment
      files needed to make the file system backup consistent.
     </para>
+
     <para>
      The backup history file is just a small text file. It contains the
      label string you gave to <function>pg_start_backup</>, as well as
!     the starting and ending times and WAL segments of the backup.
!     If you used the label to identify where the associated dump file is kept,
!     then the archived history file is enough to tell you which dump file to
      restore, should you need to do so.
     </para>


pgsql-general by date:

Previous
From: Bruce Momjian
Date:
Subject: Re: Does PG really lack a time zone for India?
Next
From: "K.Deepa"
Date:
Subject: Operator for int8 array