Thread: Incrementally Updated Backups and restartpoints

Incrementally Updated Backups and restartpoints

From

Heikki Linnakangas

Date:

13 January 2010, 07:36:20

Our documentation suggests that you can take a base backup of a warm
standby server while it's running:

> If we take a backup of the standby server's data directory while it is processing logs shipped from the primary, we
> will be able to reload that data and restart the standby's recovery process from the last restart point. We no longer
> need to keep WAL files from before the restart point. If we need to recover, it will be faster to recover from the
> incrementally updated backup than from the original base backup.

That doesn't seem safe. If the server makes a new restartpoint while the
backup is running, and pg_control is backed up after the new
restartpoint is made, recovery will restart from the new restartpoint.
That is wrong; recovery needs to restart at the restartpoint that was
most recent when the backup started. This is basically the same issue we
have solved in master with the backup_label file.
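
To make the race concrete, here is a toy model in Python. It is only an
illustration, not server code: the function names, the integer timestamps,
and the integer WAL locations are all hypothetical.

```python
def restart_location(copied_at, restartpoints):
    """Return the restartpoint location that pg_control holds at time copied_at.

    restartpoints is a list of (time, wal_location) pairs sorted by time.
    Both values are plain integers in this toy model.
    """
    latest = None
    for made_at, location in restartpoints:
        if made_at <= copied_at:
            latest = location
    return latest


def backup_is_safe(backup_start, control_copied_at, restartpoints):
    """True if the copied pg_control points at or before the restartpoint
    that was the most recent one when the backup started."""
    needed = restart_location(backup_start, restartpoints)
    recorded = restart_location(control_copied_at, restartpoints)
    if needed is None or recorded is None:
        return False
    return recorded <= needed


# A restartpoint at time 0 (location 100) and another at time 10 (location 200).
restartpoints = [(0, 100), (10, 200)]

# Backup starts at time 5. Copying pg_control right away is fine;
# copying it at time 12, after the second restartpoint, is not.
print(backup_is_safe(5, 5, restartpoints))
print(backup_is_safe(5, 12, restartpoints))
```

In the second call the copied pg_control records location 200, while data
files copied earlier may still need WAL starting from location 100.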

I wonder if it would be enough to document that pg_control must be
backed up first?
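
As a rough sketch of what "pg_control first" could look like in a backup
script, assuming the usual global/pg_control path inside the data directory.
The function name backup_data_directory is made up for this example, and the
sketch ignores files that appear or vanish while the standby keeps running.

```python
import os
import shutil


def backup_data_directory(data_dir, backup_dir):
    """Copy data_dir to backup_dir, copying global/pg_control before anything else.

    Returns the relative paths in the order they were copied.
    backup_dir must not be located inside data_dir.
    """
    control_rel = os.path.join("global", "pg_control")

    # Step 1: copy pg_control first, so it reflects the restartpoint
    # that was current when the backup started.
    os.makedirs(os.path.join(backup_dir, "global"), exist_ok=True)
    shutil.copy2(os.path.join(data_dir, control_rel),
                 os.path.join(backup_dir, control_rel))
    copied = [control_rel]

    # Step 2: copy everything else, skipping pg_control.
    for root, _dirs, files in os.walk(data_dir):
        rel_root = os.path.relpath(root, data_dir)
        dest_root = os.path.normpath(os.path.join(backup_dir, rel_root))
        os.makedirs(dest_root, exist_ok=True)
        for name in sorted(files):
            rel = os.path.normpath(os.path.join(rel_root, name))
            if rel == control_rel:
                continue
            shutil.copy2(os.path.join(root, name), os.path.join(backup_dir, rel))
            copied.append(rel)
    return copied
```

The ordering is the only point here; a real backup tool would also need to
deal with tablespaces, symlinks, and files changing underneath it.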

--  Heikki Linnakangas EnterpriseDB   http://www.enterprisedb.com

Re: Incrementally Updated Backups and restartpoints

From

Fujii Masao

Date:

13 January 2010, 07:57:51

On Wed, Jan 13, 2010 at 8:36 PM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
> Our documentation suggests that you can take a base backup of a warm
> standby server while it's running:
>
>> If we take a backup of the standby server's data directory while it is processing logs shipped from the primary, we
>> will be able to reload that data and restart the standby's recovery process from the last restart point. We no longer
>> need to keep WAL files from before the restart point. If we need to recover, it will be faster to recover from the
>> incrementally updated backup than from the original base backup.
>
> That doesn't seem safe. If the server makes a new restartpoint while the
> backup is running, and pg_control is backed up after the new
> restartpoint is made, recovery will restart from the new restartpoint.
> That is wrong; recovery needs to restart at the restartpoint that was
> most recent when the backup started. This is basically the same issue we
> have solved in master with the backup_label file.

Right.

> I wonder if it would be enough to document that pg_control must be
> backed up first?

Probably not. Archive recovery from such a base backup would always
fail at the end of recovery because there is no backup-end record;
that is, pg_stop_backup() is not executed in that case.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Re: Incrementally Updated Backups and restartpoints

From

Heikki Linnakangas

Date:

13 January 2010, 08:34:23

Fujii Masao wrote:
> On Wed, Jan 13, 2010 at 8:36 PM, Heikki Linnakangas
> <heikki.linnakangas@enterprisedb.com> wrote:
>> I wonder if it would be enough to document that pg_control must be
>> backed up first?
> 
> Probably not. Archive recovery from such a base backup would always
> fail at the end of recovery because there is no backup-end record;
> that is, pg_stop_backup() is not executed in that case.

No, that's not an issue. We only wait for the backup-end record if we
haven't seen it yet since we started recovery from the base backup.
Assuming the standby had already reached that point before the new
backup from the standby started, backupStartLoc is zero in the control file.
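
A toy model of that rule, in Python rather than the server's C. Everything
here is an assumption made for illustration: the ControlFile class, the
string record names, and the helper functions are hypothetical, and only the
"zero means nothing pending" convention comes from the description above.

```python
from dataclasses import dataclass


@dataclass
class ControlFile:
    # Zero means no backup-end record is pending.
    backup_start_loc: int = 0


def replay(control, records):
    """Replay toy WAL records; seeing a backup-end record clears the field."""
    for record in records:
        if record == "backup_end":
            control.backup_start_loc = 0
    return control


def recovery_may_end(control):
    """Recovery only has to wait while a backup-end record is still pending."""
    return control.backup_start_loc == 0


# Recovery from an ordinary base backup: the backup-end record is pending.
control = ControlFile(backup_start_loc=42)
print(recovery_may_end(control))

# After the standby replays the backup-end record, the field is zero.
# A pg_control copied from the standby at this stage carries that zero.
replay(control, ["insert", "backup_end", "insert"])
print(recovery_may_end(control))
```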

--  Heikki Linnakangas EnterpriseDB   http://www.enterprisedb.com

Re: Incrementally Updated Backups and restartpoints

From

Fujii Masao

Date:

13 January 2010, 19:05:02

On Wed, Jan 13, 2010 at 9:34 PM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
> No, that's not an issue. We only wait for the backup-end record if we
> haven't seen it yet since we started recovery from the base backup.
> Assuming the standby had reached that point already before the new
> backup from the standby started, backupStartLoc is zero in the control file.

OK. Should that assumption be documented?

And when we start an archive recovery from a backup taken from the standby,
we seem to reach a safe starting point before the database has actually
become consistent, because backupStartLoc is zero. Isn't this an issue?

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Re: Incrementally Updated Backups and restartpoints

From

Fujii Masao

Date:

04 March 2010, 08:01:34

Hi,

I thought about this issue again because a related question came up:
http://archives.postgresql.org/pgsql-admin/2010-03/msg00036.php

On Thu, Jan 14, 2010 at 7:13 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
> On Wed, Jan 13, 2010 at 9:34 PM, Heikki Linnakangas
> <heikki.linnakangas@enterprisedb.com> wrote:
>> No, that's not an issue. We only wait for the backup-end record if we
>> haven't seen it yet since we started recovery from the base backup.
>> Assuming the standby had reached that point already before the new
>> backup from the standby started, backupStartLoc is zero in the control file.
>
> OK. Should that assumption be documented?

That comment was meaningless. Sorry for the noise.

> And when we start an archive recovery from a backup taken from the standby,
> we seem to reach a safe starting point before the database has actually
> become consistent, because backupStartLoc is zero. Isn't this an issue?

This issue still seems to happen. Should it be fixed for 9.0, or is
a note in the documentation enough for 9.0? I'm leaning towards the
latter.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Re: Incrementally Updated Backups and restartpoints

From

Fujii Masao

Date:

26 March 2010, 10:22:02

On Thu, Mar 4, 2010 at 9:00 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
>> And when we start an archive recovery from a backup taken from the standby,
>> we seem to reach a safe starting point before the database has actually
>> become consistent, because backupStartLoc is zero. Isn't this an issue?
>
> This issue still seems to happen. Should it be fixed for 9.0, or is
> a note in the documentation enough for 9.0? I'm leaning towards the
> latter.

I'm thinking of adding something like the following to the section
"25.6. Incrementally Updated Backups". Thoughts?


    The pg_control file must be backed up first. Otherwise, recovery
    might start from a restart point later than the start of the backup,
    and we might fail to restore a consistent database state.

    When recovering from an incrementally updated backup, the server
    can begin accepting connections, and can complete recovery
    successfully, before the database has become consistent. To avoid
    these problems, you must check whether the database has become
    consistent by comparing the progress of the recovery with the backup
    ending WAL location, both before your users try to connect to the
    server and when archive recovery ends. So, in advance, the backup
    ending WAL location must be obtained by calling the
    pg_last_xlog_replay_location function at the end of the backup. The
    progress of the recovery is also obtained from the
    pg_last_xlog_replay_location function.
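
For the comparison itself, a minimal Python sketch could look like the
following. It assumes the locations are the text values returned by
pg_last_xlog_replay_location, in the usual hex "X/Y" form. The helper names
parse_wal_location and is_consistent are invented for this example, and how
the two values are collected and stored is left out.

```python
def parse_wal_location(text):
    """Parse a WAL location such as '0/3000020' into a pair of integers.

    Both halves are hexadecimal, so the strings must not be compared as text.
    """
    high, low = text.strip().split("/")
    return (int(high, 16), int(low, 16))


def is_consistent(replay_location, backup_end_location):
    """True once recovery has replayed up to the backup ending WAL location.

    replay_location:     current recovery progress on the recovering server
    backup_end_location: value recorded on the standby at the end of the backup
    """
    return parse_wal_location(replay_location) >= parse_wal_location(backup_end_location)


# Recorded at the end of the backup (hypothetical value).
backup_end = "0/10"

# "0/9" sorts after "0/10" as text, but hex 9 is below hex 10.
print(is_consistent("0/9", backup_end))
print(is_consistent("0/3000020", backup_end))
```

Parsing both halves as hex and comparing the pairs avoids the string-ordering
trap shown in the first call.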

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center