Thread: Incrementally Updated Backup
Way past feature freeze, but this small change allows a powerful new
feature utilising the Restartable Recovery capability. Very useful for
very large database backups...

Includes full documentation.

Perhaps a bit rushed, but inclusion in 8.2 would be great. (Ouch, don't
shout back, read the patch first....)

-----------------------------
Docs copied here as a better explanation:

<title>Incrementally Updated Backups</title>

<para>
Restartable Recovery can also be utilised to avoid the need to take
regular complete base backups, thus improving backup performance in
situations where the server is heavily loaded or the database is very
large. This concept is known as incrementally updated backups.
</para>

<para>
If we take a backup of the server files after a recovery is partially
completed, we will be able to restart the recovery from the last
restartpoint. This backup is now further forward along the timeline
than the original base backup, so we can refer to it as an incrementally
updated backup. If we need to recover, it will be faster to recover from
the incrementally updated backup than from the base backup.
</para>

<para>
The <xref linkend="startup-after-recovery"> option in the recovery.conf
file is provided to allow the recovery to complete up to the current
last WAL segment, yet without starting the database. This option allows
us to stop the server and take a backup of the partially recovered
server files: this is the incrementally updated backup.
</para>

<para>
We can use the incrementally updated backup concept to come up with a
streamlined backup schedule. For example:
<orderedlist>
<listitem>
<para>
Set up continuous archiving
</para>
</listitem>
<listitem>
<para>
Take a weekly base backup
</para>
</listitem>
<listitem>
<para>
After 24 hours, restore the base backup to another server, then run a
partial recovery and take a backup of the latest database state to
produce an incrementally updated backup.
</para>
</listitem>
<listitem>
<para>
After the next 24 hours, restore the incrementally updated backup to the
second server, then run a partial recovery; at the end, take a backup
of the partially recovered files.
</para>
</listitem>
<listitem>
<para>
Repeat the previous step each day, until the end of the week.
</para>
</listitem>
</orderedlist>
</para>

<para>
A base backup need only be taken once per week, yet the same level of
protection is offered as if base backups were taken nightly.
</para>

</sect2>

--
Simon Riggs
EnterpriseDB  http://www.enterprisedb.com
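In recovery.conf terms, step 3 of the schedule above would amount to restoring the base backup on the second server and recovering with the proposed option disabled. A sketch, assuming the patch were applied (`restore_command` is a standard recovery.conf setting; `startup_after_recovery` is the option proposed by this patch; the archive path is illustrative only):

```ini
# recovery.conf on the second (recovery) server -- illustrative paths
restore_command = 'cp /mnt/server/archivedir/%f "%p"'

# Proposed by this patch: replay all currently available WAL, then stop
# without opening the database for connections, so that the partially
# recovered files can be copied off as the incrementally updated backup.
startup_after_recovery = false
```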
Simon Riggs wrote:
> +
> +	if (startupAfterRecovery)
> +		ereport(ERROR,
> +			(errmsg("recovery ends normally with startup_after_recovery=false")));
> +

I find this part of the patch a bit ugly. Isn't there a better way to
exit than throwing an error that's not really an error?

--
Heikki Linnakangas
EnterpriseDB  http://www.enterprisedb.com
Heikki Linnakangas <heikki@enterprisedb.com> writes:
> Simon Riggs wrote:
>> +
>> +	if (startupAfterRecovery)
>> +		ereport(ERROR,
>> +			(errmsg("recovery ends normally with startup_after_recovery=false")));
>> +

> I find this part of the patch a bit ugly. Isn't there a better way to
> exit than throwing an error that's not really an error?

This patch has obviously been thrown together with no thought and even
less testing. It breaks the normal case (I think the above if-test is
backwards), and I don't believe that it works for the advertised purpose
either (because nothing gets done to force a checkpoint before aborting,
thus the files on disk are not up to date with the end of WAL).

Also, I'm not sold that the concept is even useful. Apparently the idea
is to offload the expense of taking periodic base backups from a master
server, by instead backing up a PITR slave's fileset --- which is fine.
But why in the world would you want to stop the slave to do it? ISTM
we would want to arrange things so that you can copy the slave's files
while it continues replicating, just as with a standard base backup.

			regards, tom lane
No, too late.

---------------------------------------------------------------------------

Simon Riggs wrote:
> Way past feature freeze, but this small change allows a powerful new
> feature utilising the Restartable Recovery capability. Very useful for
> very large database backups...
>
> Includes full documentation.
>
> Perhaps a bit rushed, but inclusion in 8.2 would be great. (Ouch, don't
> shout back, read the patch first....)
>
> [...]

[ Attachment, skipping... ]

--
Bruce Momjian   bruce@momjian.us
EnterpriseDB    http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +
On Tue, 2006-09-19 at 12:13 -0400, Tom Lane wrote:
> Also, I'm not sold that the concept is even useful. Apparently the idea
> is to offload the expense of taking periodic base backups from a master
> server, by instead backing up a PITR slave's fileset --- which is fine.

Good. That's the key part of the idea and it's a useful one, so I was
looking to document it for 8.2.

I thought of this idea separately, then, as usual, realised that this
idea has a long heritage: Change Accumulation has been in production use
with IMS for at least 20 years.

> But why in the world would you want to stop the slave to do it? ISTM
> we would want to arrange things so that you can copy the slave's files
> while it continues replicating, just as with a standard base backup.

You can do that, of course, but my thinking was that people would regard
the technique as "unsupported", so I added a quick flag as a prototype.

On Tue, 2006-09-19 at 12:13 -0400, Tom Lane wrote:
> This patch has obviously been thrown together with no thought and even
> less testing. It breaks the normal case (I think the above if-test is
> backwards), and I don't believe that it works for the advertised purpose
> either (because nothing gets done to force a checkpoint before aborting,
> thus the files on disk are not up to date with the end of WAL).

Yes, it was done very quickly and submitted to ensure it could be
considered yesterday for inclusion. It was described by me as rushed,
which it certainly was because of personal time pressure yesterday: I
thought that made it clear that discussion was needed. Heikki mentions
to me it wasn't clear, so those criticisms are accepted.

On Tue, 2006-09-19 at 16:05 +0100, Heikki Linnakangas wrote:
> Simon Riggs wrote:
> > +
> > +	if (startupAfterRecovery)
> > +		ereport(ERROR,
> > +			(errmsg("recovery ends normally with startup_after_recovery=false")));
> > +
>
> I find this part of the patch a bit ugly.

Me too.

Overall, my own thoughts and Tom's and Heikki's comments indicate I
should withdraw the patch rather than fix it. Patch withdrawn.

Enclosed is a new doc patch to describe the capability, without s/w
change.

--
Simon Riggs
EnterpriseDB  http://www.enterprisedb.com
On Wed, Sep 20, 2006 at 02:09:43PM +0100, Simon Riggs wrote:
> > But why in the world would you want to stop the slave to do it? ISTM
> > we would want to arrange things so that you can copy the slave's files
> > while it continues replicating, just as with a standard base backup.
>
> You can do that, of course, but my thinking was that people would regard
> the technique as "unsupported", so I added a quick flag as a prototype.

An advantage to being able to stop the server is that you could have one
server processing backups for multiple PostgreSQL clusters by going
through them 1 (or more likely, 2, 4, etc) at a time, essentially
providing N+1 capability.

--
Jim Nasby                       jim@nasby.net
EnterpriseDB    http://enterprisedb.com     512.569.9461 (cell)
"Jim C. Nasby" <jim@nasby.net> writes:
> An advantage to being able to stop the server is that you could have one
> server processing backups for multiple PostgreSQL clusters by going
> through them 1 (or more likely, 2, 4, etc) at a time, essentially
> providing N+1 capability.

Why wouldn't you implement that by putting N postmasters onto the backup
server? It'd be far more efficient than the proposed patch, which by
aborting at random points is essentially guaranteeing a whole lot of
useless re-replay of WAL whenever you restart it.

			regards, tom lane
On Wed, Sep 20, 2006 at 04:26:30PM -0400, Tom Lane wrote:
> "Jim C. Nasby" <jim@nasby.net> writes:
> > An advantage to being able to stop the server is that you could have one
> > server processing backups for multiple PostgreSQL clusters by going
> > through them 1 (or more likely, 2, 4, etc) at a time, essentially
> > providing N+1 capability.
>
> Why wouldn't you implement that by putting N postmasters onto the backup
> server? It'd be far more efficient than the proposed patch, which by
> aborting at random points is essentially guaranteeing a whole lot of
> useless re-replay of WAL whenever you restart it.

My thought is that in many environments it would take much beefier
hardware to support N postmasters running simultaneously than to cycle
through them periodically bringing the backups up-to-date.

--
Jim Nasby                       jim@nasby.net
EnterpriseDB    http://enterprisedb.com     512.569.9461 (cell)
"Jim C. Nasby" <jim@nasby.net> writes:
> My thought is that in many environments it would take much beefier
> hardware to support N postmasters running simultaneously than to cycle
> through them periodically bringing the backups up-to-date.

How do you figure that? The cycling approach will require more total I/O
due to extra page re-reads ... particularly if it's built on a patch
like this one that abandons work-in-progress at arbitrary points.

A postmaster running WAL replay does not require all that much in the
way of CPU resources. It is going to need I/O comparable to the gross
I/O load of its master, but cycling isn't going to reduce that at all.

			regards, tom lane
On Wed, Sep 20, 2006 at 05:50:48PM -0400, Tom Lane wrote:
> "Jim C. Nasby" <jim@nasby.net> writes:
> > My thought is that in many environments it would take much beefier
> > hardware to support N postmasters running simultaneously than to cycle
> > through them periodically bringing the backups up-to-date.
>
> How do you figure that? The cycling approach will require more total I/O
> due to extra page re-reads ... particularly if it's built on a patch
> like this one that abandons work-in-progress at arbitrary points.
>
> A postmaster running WAL replay does not require all that much in the
> way of CPU resources. It is going to need I/O comparable to the gross
> I/O load of its master, but cycling isn't going to reduce that at all.

True, but running several dozen instances on a single machine will
require a lot more memory (or, conversely, each individual database gets
a lot less memory to use).

Of course, this is all hand-waving right now... it'd be interesting to
see which approach was actually better.

--
Jim Nasby                       jim@nasby.net
EnterpriseDB    http://enterprisedb.com     512.569.9461 (cell)
> True, but running several dozen instances on a single machine will
> require a lot more memory (or, conversely, each individual database gets
> a lot less memory to use).
>
> Of course, this is all hand-waving right now... it'd be interesting to
> see which approach was actually better.

I'm running 4 WAL logging standby clusters on a single machine. While
the load on the master servers occasionally goes up to >60, the load on
the standby machine has never climbed above 5.

Of course when the master servers are all loaded, the standby gets
behind with the recovery... but eventually it gets up to date again. I
would be very surprised if it would get less behind in the 1-by-1
scenario.

Cheers,
Csaba.
Your patch has been added to the PostgreSQL unapplied patches list at:

	http://momjian.postgresql.org/cgi-bin/pgpatches

It will be applied as soon as one of the PostgreSQL committers reviews
and approves it.

---------------------------------------------------------------------------

Simon Riggs wrote:
> [...]
>
> Overall, my own thoughts and Tom's and Heikki's comments indicate I
> should withdraw the patch rather than fix it. Patch withdrawn.
>
> Enclosed is a new doc patch to describe the capability, without s/w
> change.

[ Attachment, skipping... ]

--
Bruce Momjian   bruce@momjian.us
EnterpriseDB    http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +
Documentation patch applied. Thanks.

Your documentation changes can be viewed in five minutes using links on
the developer's page, http://www.postgresql.org/developer/testing.

---------------------------------------------------------------------------

Simon Riggs wrote:
> [...]
>
> Overall, my own thoughts and Tom's and Heikki's comments indicate I
> should withdraw the patch rather than fix it. Patch withdrawn.
>
> Enclosed is a new doc patch to describe the capability, without s/w
> change.

[ Attachment, skipping... ]

--
Bruce Momjian   bruce@momjian.us
EnterpriseDB    http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +