Thread: BUG #4796: Recovery followed by backup creates unrecoverable WAL-file

BUG #4796: Recovery followed by backup creates unrecoverable WAL-file

From: "Mikael Krantz"

The following bug has been logged online:

Bug reference:      4796
Logged by:          Mikael Krantz
Email address:      mk@zigamorph.se
PostgreSQL version: 8.3.7-0lenny1
Operating system:   Linux (debian lenny)
Description:        Recovery followed by backup creates unrecoverable WAL-file
Details:

If you perform a recovery from a file-system-level backup, postgres will
switch to a new timeline, but the first WAL file of the new timeline will
still contain the previous timeline's ID.

If you start a backup immediately after recovery has completed, the start of
the backup will be in this bad WAL file. This makes the backup unrecoverable,
as restoring it will fail with an error similar to:

  LOG:  unexpected timeline ID 54 in log file 4, segment 236, offset 0
  LOG:  invalid checkpoint record
  PANIC:  could not locate required checkpoint record
  HINT:  If you are not restoring from a backup, try removing the file
"/var/lib/postgresql/8.3/main/backup_label".


How to reproduce:

 * restore from backup
 * SELECT pg_start_backup('label');
 * take a new backup
 * SELECT pg_stop_backup();
 * copy the relevant WAL-files
 * try to restore the backup


It is also visible in the first WAL-file of a new timeline:
# od -t x4 /var/lib/postgresql/8.3/main/pg_xlog/0000003D0000000500000001 | head -1
0000000 0002d062 0000003c 00000005 01000000

The timeline tag 0000003c is in a file named 0000003D which causes it to be
unrecoverable.
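
The same comparison can be scripted. The following is a minimal Python sketch, not part of the original report. It assumes a little-endian machine and the page header layout implied by the od output above (16-bit magic, 16-bit info flags, 32-bit timeline ID, then log ID and offset); the helper names are made up for illustration.

```python
import struct

def timeline_from_filename(wal_name):
    """Timeline ID encoded in the first 8 hex digits of a WAL segment file name."""
    return int(wal_name[:8], 16)

def timeline_from_page_header(first_bytes):
    """Timeline ID stored in the first page header of a WAL segment.

    Layout assumed from the od output above (little-endian):
    uint16 magic, uint16 info, uint32 timeline, uint32 log id, uint32 offset.
    """
    magic, info, tli, log_id, offset = struct.unpack("<HHIII", first_bytes[:16])
    return tli

# Bytes rebuilt from the od line above: 0002d062 0000003c 00000005 01000000
header = struct.pack("<IIII", 0x0002D062, 0x0000003C, 0x00000005, 0x01000000)
name = "0000003D0000000500000001"

file_tli = timeline_from_filename(name)
page_tli = timeline_from_page_header(header)
print(f"file name timeline: {file_tli:08X}, page header timeline: {page_tli:08X}")
```

For a real segment, the `header` bytes would come from reading the first 16 bytes of the file instead of being packed by hand.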

Workaround:

Wait for or force an xlog switch before pg_start_backup. A simple fix
might be to make pg_start_backup force this switch automatically.
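
The workaround could be scripted roughly as follows. This is a sketch under assumptions: `cursor` is a DB-API 2.0 cursor from a driver that uses the `%s` parameter style (psycopg2, for example), connected as a superuser, and the server is an 8.2 or later release where pg_switch_xlog() is available. The function name is hypothetical.

```python
def start_backup_after_switch(cursor, label):
    """Force a WAL segment switch, then start the base backup.

    `cursor` is assumed to be a DB-API 2.0 cursor using the "%s"
    parameter style. Returns the value reported by pg_start_backup().
    """
    # Close the current WAL segment first. The intent is that the backup
    # then starts in a segment that was begun after the timeline switch.
    cursor.execute("SELECT pg_switch_xlog()")
    cursor.execute("SELECT pg_start_backup(%s)", (label,))
    return cursor.fetchone()[0]
```

The remaining steps (copying the data directory, calling pg_stop_backup(), and collecting the WAL files) are unchanged from the reproduction recipe above.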

Re: BUG #4796: Recovery followed by backup creates unrecoverable WAL-file

From: Heikki Linnakangas

Mikael Krantz wrote:
> If you perform a recovery from a file-system-level backup, postgres will
> switch to a new timeline, but the first WAL file of the new timeline will
> still contain the previous timeline's ID.
>
> If you start a backup immediately after recovery has completed, the start of
> the backup will be in this bad WAL file. This makes the backup unrecoverable,
> as restoring it will fail with an error similar to:
>
>   LOG:  unexpected timeline ID 54 in log file 4, segment 236, offset 0
>   LOG:  invalid checkpoint record
>   PANIC:  could not locate required checkpoint record
>   HINT:  If you are not restoring from a backup, try removing the file
> "/var/lib/postgresql/8.3/main/backup_label".
>
>
> How to reproduce:
>
>  * restore from backup
>  * SELECT pg_start_backup('label');
>  * take a new backup
>  * SELECT pg_stop_backup();
>  * copy the relevant WAL-files
>  * try to restore the backup

I failed to reproduce this. Is it possible that the history file went
missing in the process? That's needed to recover WAL files from
timelines other than the latest one. You should only get that
"unexpected timeline ID" message if the history file doesn't contain a
line for that timeline ID.
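
To make the role of the history file concrete, here is a rough Python model of that check. It is a sketch, not the server's code. The file format is an assumption: one line per ancestor timeline whose first whitespace-separated field is the decimal parent timeline ID, with blank lines and lines starting with '#' ignored. The sample contents are hypothetical.

```python
def expected_timelines(target_tli, history_text):
    """Timeline IDs that recovery would accept: the target itself plus
    every parent listed in the target's history file (format assumed)."""
    expected = [target_tli]
    for line in history_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        expected.append(int(line.split()[0]))
    return expected

# Hypothetical history file for timeline 0x3D (61) naming timeline 60 as parent.
history_3d = "60\t0000003C0000000500000000\tno recovery target specified\n"

# With the history file, WAL pages tagged 0x3C are acceptable.
print(0x3C in expected_timelines(0x3D, history_3d))  # True
# Without it, only 0x3D itself is known, so 0x3C is "unexpected".
print(0x3C in expected_timelines(0x3D, ""))          # False
```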

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com

Re: BUG #4796: Recovery followed by backup creates unrecoverable WAL-file

From: Mikael Krantz

On Wed, May 6, 2009 at 6:26 PM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
>> How to reproduce:
>>
>>  * restore from backup
>>  * SELECT pg_start_backup('label');
>>  * take a new backup
>>  * SELECT pg_stop_backup();
>>  * copy the relevant WAL-files
>>  * try to restore the backup
>
> I failed to reproduce this. Is it possible that the history file went
> missing in the process? That's needed to recover WAL files from timelines
> other than the latest one. You should only get that "unexpected timeline ID"
> message if the history file doesn't contain a line for that timeline ID.

Yes, that's true. The history file is not included in the backup. It is
archived before the backup starts and is not included in the
range specified in the backup file (e.g.
0000003B00000004000000FC.00000020.backup).

Doesn't this mean that the range of log files in the backup file is
incorrect? If the first WAL file in the range contains records
referring to earlier timelines, I will have to back up the .history file
of that timeline in addition to the WAL files explicitly required for
the backup. Or force a switch of log files before starting the backup,
as I'm currently doing.

The reason I stumbled onto this is that I've set up an automatic test
that sets up a warm standby, fails over, sets up a new warm server, and
so on. This causes me to take new base backups very soon after a
finished recovery process.

/M

Re: BUG #4796: Recovery followed by backup creates unrecoverable WAL-file

From: Heikki Linnakangas

Mikael Krantz wrote:
> On Wed, May 6, 2009 at 6:26 PM, Heikki Linnakangas
> <heikki.linnakangas@enterprisedb.com> wrote:
>>> How to reproduce:
>>>
>>>  * restore from backup
>>>  * SELECT pg_start_backup('label');
>>>  * take a new backup
>>>  * SELECT pg_stop_backup();
>>>  * copy the relevant WAL-files
>>>  * try to restore the backup
>> I failed to reproduce this. Is it possible that the history file went
>> missing in the process? That's needed to recover WAL files from timelines
>> other than the latest one. You should only get that "unexpected timeline ID"
>> message if the history file doesn't contain a line for that timeline ID.
>
> Yes that's true. The history file is not included in the backup. It is
> archived before the backup starts and is not included in the
> range specified in the backup file (e.g:
> 0000003B00000004000000FC.00000020.backup).
>
> Doesn't this mean that the range of log-files in the backup file is
> incorrect? If the first WAL-file in the range contain records
> referring to earlier timelines I will have to backup the .history-file
> of that timeline in addition to the WAL-files explicitly required for
> the backup. Or force a switch of log-files before starting the backup
> as I'm currently doing.

Yeah, I think you're right. If you omit pg_xlog from the base backup, as
we recommend in the manual, and clear the old files from the archive
too, then you won't have the old history file around.

I'll make pg_start_backup() request an xlog switch before the checkpoint,
as you suggested. That's an easy fix that can be easily back-patched.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com

Re: BUG #4796: Recovery followed by backup creates unrecoverable WAL-file

From: Heikki Linnakangas

I wrote:
> I'll make pg_start_backup() request an xlog switch before the checkpoint,
> as you suggested. That's an easy fix that can be easily back-patched.

Done. I only back-patched it down to 8.2, because earlier versions
didn't have pg_switch_xlog(). They would've required more invasive
changes, which don't seem worth the effort.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com


On Thu, 2009-05-07 at 12:15 +0300, Heikki Linnakangas wrote:

> Yeah, I think you're right. If you omit pg_xlog from the base backup,
> as we recommend in the manual, and clear the old files from the
> archive too, then you won't have the old history file around.

Sorry about this, but I don't agree with that fix and think it needs
more discussion, at the very least. (I'm also not sure why this fix needs to
be applied with such haste, even taking priority over other unapplied
patches.)

The error seems to come from deleting the history file from the archive,
rather than from the sequence of actions.

A more useful thing might be to do an xlog switch before we do the
shutdown checkpoint at end of recovery. That gives the same sequence of
actions without modifying the existing sequence of activities for
backups, which is delicate enough for me to not want to touch it.

--
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Training, Services and Support



Re: BUG #4796: Recovery followed by backup creates unrecoverable WAL-file

From
Heikki Linnakangas
Date:
Simon Riggs wrote:
> On Thu, 2009-05-07 at 12:15 +0300, Heikki Linnakangas wrote:
> 
>> Yeah, I think you're right. If you omit pg_xlog from the base backup,
>> as we recommend in the manual, and clear the old files from the
>> archive too, then you won't have the old history file around.
> 
> ...
> A more useful thing might be to do an xlog switch before we do the
> shutdown checkpoint at end of recovery. That gives the same sequence of
> actions without modifying the existing sequence of activities for
> backups, which is delicate enough for me to not want to touch it.

Hmm, yeah should work as well. I find the recovery sequence to be even 
more delicate, though, than pg_start_backup(). I think you'd need to 
write the XLOG switch record using the old timeline ID, as we currently 
require that the timeline changes only at a shutdown checkpoint record. 
That's not hard, but does make me a bit nervous.

The advantage of that over switching xlog segment in pg_start_backup() 
would be that you would go through fewer XLOG segments if you took 
backups often.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com


On Thu, 2009-05-07 at 17:54 +0300, Heikki Linnakangas wrote:
> Simon Riggs wrote:
> > On Thu, 2009-05-07 at 12:15 +0300, Heikki Linnakangas wrote:
> > 
> >> Yeah, I think you're right. If you omit pg_xlog from the base backup,
> >> as we recommend in the manual, and clear the old files from the
> >> archive too, then you won't have the old history file around.
> > 
> > ...
> > A more useful thing might be to do an xlog switch before we do the
> > shutdown checkpoint at end of recovery. That gives the same sequence of
> > actions without modifying the existing sequence of activities for
> > backups, which is delicate enough for me to not want to touch it.
> 
> Hmm, yeah should work as well. I find the recovery sequence to be even 
> more delicate, though, than pg_start_backup(). I think you'd need to 
> write the XLOG switch record using the old timeline ID, as we currently 
> require that the timeline changes only at a shutdown checkpoint record. 
> That's not hard, but does make me a bit nervous.
> 
> The advantage of that over switching xlog segment in pg_start_backup() 
> would be that you would go through fewer XLOG segments if you took 
> backups often.

Yes, you're right about the delicacy of all of this so both suggestions
sound kludgey - the problem is to do with timelines not with sequencing
of checkpoints and log switches. The problem is Mikael deleted the
history file and he shouldn't have done that. We need some explicit
protection for when that occurs, I feel, to avoid it breaking again in
the future with various changes we have planned.

If the history file is so important, we shouldn't only store it in the
archive. We should keep a copy locally as well and refer to it if the
archived copy is missing.

--
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Training, Services and Support



Simon Riggs wrote:
> On Thu, 2009-05-07 at 17:54 +0300, Heikki Linnakangas wrote:
>> Simon Riggs wrote:
>>> A more useful thing might be to do an xlog switch before we do the
>>> shutdown checkpoint at end of recovery. That gives the same sequence of
>>> actions without modifying the existing sequence of activities for
>>> backups, which is delicate enough for me to not want to touch it.
>>
>> Hmm, yeah should work as well. I find the recovery sequence to be even 
>> more delicate, though, than pg_start_backup(). I think you'd need to 
>> write the XLOG switch record using the old timeline ID, as we currently 
>> require that the timeline changes only at a shutdown checkpoint record. 
>> That's not hard, but does make me a bit nervous.
>
> Yes, you're right about the delicacy of all of this so both suggestions
> sound kludgey - the problem is to do with timelines not with sequencing
> of checkpoints and log switches. The problem is Mikael deleted the
> history file and he shouldn't have done that. 

I don't see any user error here. What he did was:

1. Restore from backup A
2. Clear old WAL archive
3. pg_start_backup() + tar all but pg_xlog + pg_stop_backup();
4. Restore new backup B

There's no history file in the archive because it was cleared in step 2. 
There's nothing wrong with that; you only need to retain WAL files from 
the point that you call pg_start_backup(). There's no history file 
either in the tar, because pg_xlog was not tarred as we recommend in the 
manual.
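
Step 3 can be sketched in Python to show why nothing stored under pg_xlog reaches the tar. This is an illustration only: the function names and the "data" archive prefix are hypothetical, and a real backup would still be bracketed by pg_start_backup() and pg_stop_backup().

```python
import tarfile

def skip_pg_xlog_contents(tarinfo):
    """tarfile filter: keep the pg_xlog directory entry, drop everything in it."""
    parts = tarinfo.name.split("/")
    if "pg_xlog" in parts[:-1]:
        return None  # returning None excludes this member (and its children)
    return tarinfo

def tar_base_backup(data_dir, tar_path):
    """Archive data_dir into tar_path, leaving out the contents of pg_xlog."""
    with tarfile.open(tar_path, "w") as tar:
        tar.add(data_dir, arcname="data", filter=skip_pg_xlog_contents)
```

Any timeline history file that exists only in pg_xlog is therefore absent from the resulting backup, which matches the situation described above.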

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com


On Thu, 2009-05-07 at 18:57 +0300, Heikki Linnakangas wrote:
> I don't see any user error here.

Just observing that the error occurs because we rely on a file being
there when we haven't even documented that it needs to be there for it
to work. File deletion with %r from the archive would not have removed
that file at that point. We should have an explicit statement about
which files can be deleted from the archive and which should not be, but
in general it is dangerous to remove files that have not been explicitly
described as removable.

Playing with the order of events seems fragile and I would prefer a more
explicit solution. Recording the timeline history permanently with each
server would be a sensible and useful thing (IIRC DB2 does this).

--
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Training, Services and Support



Simon Riggs wrote:
> On Thu, 2009-05-07 at 18:57 +0300, Heikki Linnakangas wrote:
>> I don't see any user error here.
> 
> Just observing that the error occurs because we rely on a file being
> there when we haven't even documented that it needs to be there for it
> to work. File deletion with %r from the archive would not have removed
> that file at that point. We should have an explicit statement about
> which files can be deleted from the archive and which should not be, but
> in general it is dangerous to remove files that have not been explicitly
> described as removable.

When you create a new base backup, you shouldn't need any files archived
before starting the backup. You might not even have had archiving
enabled before that, or you might change archive_command to archive into
a new location before starting the backup.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com


Hi,

On Fri, May 8, 2009 at 2:42 AM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
> When you create a new base backup, you shouldn't need any files archived
> before starting the backup.

If so, this fix is not enough, since findNewestTimeLine() is
still based on the premise that *all* the history files exist.
So, as Simon says, we should clearly say that a history file
must not be deleted from the archive. Or, we should create
a new solution.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


Fujii Masao wrote:
> Hi,
> 
> On Fri, May 8, 2009 at 2:42 AM, Heikki Linnakangas
> <heikki.linnakangas@enterprisedb.com> wrote:
>> When you create a new base backup, you shouldn't need any files archived
>> before starting the backup.
> 
> If so, this fix is not enough, since findNewestTimeLine() is
> still based on the premise that *all* the history files exist.
> So, as Simon says, we should clearly say that a history file
> must not be deleted from the archive. Or, we should create
> a new solution.

The probe in findNewestTimeLine() is initialized to recovery target
timeline + 1. It doesn't require history files for any old timelines to
be present. The purpose of findNewestTimeLine() is to ensure that if you
e.g. recover to a point in time in timeline 5, and there are already WAL
files for timelines 6 and 7 in the archive, we pick a unique timeline ID.
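
The probing described here can be modelled in a few lines of Python. This is a rough sketch rather than the server's C code; `history_exists` is a stand-in for whatever check determines that a history file for a given timeline can be found, and the sample archive is hypothetical.

```python
def find_newest_timeline(start_tli, history_exists):
    """Probe upward from start_tli + 1 until a history file is missing."""
    newest = start_tli
    probe = start_tli + 1
    while history_exists(probe):
        newest = probe
        probe += 1
    return newest

# Hypothetical archive holding history files for timelines 6 and 7 only.
archive = {6, 7}
newest = find_newest_timeline(5, archive.__contains__)
new_tli = newest + 1  # timeline ID assigned at the end of recovery
print(newest, new_tli)  # 7 8
```

Note that the example archive holds no history files for timelines 1 to 5 and the result is unaffected. In this model the loop stops at the first missing file, so a gap in the sequence ends the probe early.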

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com


On Fri, 2009-05-15 at 20:11 +0900, Fujii Masao wrote:
> Hi,
> 
> On Fri, May 8, 2009 at 2:42 AM, Heikki Linnakangas
> <heikki.linnakangas@enterprisedb.com> wrote:
> > When you create a new base backup, you shouldn't need any files archived
> > before starting the backup.
> 
> If so, this fix is not enough, since findNewestTimeLine() is
> still based on the premise that *all* the history files exist.
> So, as Simon says, we should clearly say that a history file
> must not be deleted from the archive. Or, we should create
> a new solution.

I will feel safer if we keep history files in the main data directory
(somehow), not just send them to the archive.

The history files together describe the provenance of the current
database and I think it takes almost no space to record that, so it
seems like a good idea to keep them.

--
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Training, Services and Support



Hi,

On Fri, May 15, 2009 at 8:20 PM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
> The probe in findNewestTimeLine() initialized to recovery target timeline +
> 1. It doesn't require history files for any old timelines to be present.

What if recovery_target_timeline = 'latest'? The unexpected (not latest)
recovery target timeline might be chosen when some timeline history
files don't exist.

> The
> purpose of findNewestTimeLine() is to ensure that if you e.g recover to a
> point in time in timeline 5, and there's already WAL files for timelines 6
> and 7 in the archive, we pick a unique timeline id.

When only the history file for timeline 6 is deleted, timeline 6 would be
assigned as the newest one *again* at the end of archive recovery.
Is this safe?

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


On Fri, 2009-05-15 at 20:38 +0900, Fujii Masao wrote:

> On Fri, May 15, 2009 at 8:20 PM, Heikki Linnakangas
> <heikki.linnakangas@enterprisedb.com> wrote:
> > The probe in findNewestTimeLine() initialized to recovery target timeline +
> > 1. It doesn't require history files for any old timelines to be present.
> 
> What if recovery_target_timeline = 'latest'? The unexpected (not latest)
> recovery target timeline might be chosen when some timeline history
> files don't exist.
> 
> > The
> > purpose of findNewestTimeLine() is to ensure that if you e.g recover to a
> > point in time in timeline 5, and there's already WAL files for timelines 6
> > and 7 in the archive, we pick a unique timeline id.
> 
> When only the history file for timeline 6 is deleted, timeline 6 would be
> assigned as the newest one *again* at the end of archive recovery.
> Is this safe?

Yeh, those cases screw us up. I'm sure we can think of others, I had
time to analyse things in more detail. I'd be happier with the general
assessment that "it's unsafe to keep history files in the archive".

My suggestion is that we keep history files in a new directory under the
data directory. That way they get copied as part of the base backup,
rather than sent off to the archive where DBAs can have mad moments and
delete all, or worse, just some of them. Implementation for this
proposal is really easy and safe for where we are now: we just access
the appropriate local directory. Call it pg_history, pg_timeline, etc.
Not under pg_xlog!

There is no particular reason to send history files to the archive,
since new ones are only ever generated at the end of an archive
recovery.

Now that we increment the timeline more often this is a more visible
problem than previously.

--
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Training, Services and Support



On Fri, 2009-05-15 at 12:56 +0100, Simon Riggs wrote:

> There is no particular reason to send history files to the archive,
> since new ones are only ever generated at the end of an archive
> recovery.

It also clears up a long standing confusion between backup history files
and timeline history files. The backup history file(s) do need to go to
the archive, whereas the timeline file(s) do not.

--
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Training, Services and Support



Fujii Masao wrote:
> On Fri, May 15, 2009 at 8:20 PM, Heikki Linnakangas
> <heikki.linnakangas@enterprisedb.com> wrote:
>> The probe in findNewestTimeLine() initialized to recovery target timeline +
>> 1. It doesn't require history files for any old timelines to be present.
> 
> What if recovery_target_timeline = 'latest'? The unexpected (not latest)
> recovery target timeline might be chosen when some timeline history
> files don't exist.
> 
>> The
>> purpose of findNewestTimeLine() is to ensure that if you e.g recover to a
>> point in time in timeline 5, and there's already WAL files for timelines 6
>> and 7 in the archive, we pick a unique timeline id.
> 
> When only the history file for timeline 6 is deleted, timeline 6 would be
> assigned as the newest one *again* at the end of archive recovery.
> Is this safe?

If you delete the history file and all the WAL for timeline 6, yeah, nothing
stops it from being reused. It will work just fine, as if it never 
existed. If you still have the history file and WAL for the old timeline 
6 lying around somewhere else like an older offsite backup, it's easy 
for the administrator to get confused, but there isn't much we can do 
about that.

Simon's idea of keeping a copy of all the history files in the data 
directory wouldn't help here. In fact, I think we already never delete 
history files in the server, it's just that if you omit the pg_xlog 
directory in the base backup they won't be included. But even if they 
are included in the base backup, that wouldn't help in this scenario 
because the base backup still wouldn't contain the history files for the 
later timelines.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com


On Fri, 2009-05-15 at 14:56 +0300, Heikki Linnakangas wrote:

> Simon's idea of keeping a copy of all the history files in the data 
> directory wouldn't help here. In fact, I think we already never delete 
> history files in the server, it's just that if you omit the pg_xlog 
> directory in the base backup they won't be included. But even if they 
> are included in the base backup, that wouldn't help in this scenario 
> because the base backup still wouldn't contain the history files for the 
> later timelines.

You're right there. 

That still leaves the problem that we need to know the later history,
even if we don't use it.

> If you delete history file and all the WAL for timeline 6, yeah, nothing 
> stops it from being reused. It will work just fine, as if it never 
> existed. If you still have the history file and WAL for the old timeline 
> 6 lying around somewhere else like an older offsite backup, it's easy 
> for the administrator to get confused, but there isn't much we can do 
> about that.

ehem, "It will work fine" isn't correct, as Fujii-san observes.

Let's document that timeline files should not be deleted from the
archive iff there exists a base backup made during a lower numbered
timeline.

--
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Training, Services and Support



Simon Riggs wrote:
> On Fri, 2009-05-15 at 20:38 +0900, Fujii Masao wrote:
> 
>> On Fri, May 15, 2009 at 8:20 PM, Heikki Linnakangas
>> <heikki.linnakangas@enterprisedb.com> wrote:
>>> The probe in findNewestTimeLine() initialized to recovery target timeline +
>>> 1. It doesn't require history files for any old timelines to be present.
>> What if recovery_target_timeline = 'latest'? The unexpected (not latest)
>> recovery target timeline might be chosen when some timeline history
>> files don't exist.
>>
>>> The
>>> purpose of findNewestTimeLine() is to ensure that if you e.g recover to a
>>> point in time in timeline 5, and there's already WAL files for timelines 6
>>> and 7 in the archive, we pick a unique timeline id.
>> When only the history file for timeline 6 is deleted, timeline 6 would be
>> assigned as the newest one *again* at the end of archive recovery.
>> Is this safe?
> 
> Yeh, those cases screw us up. I'm sure we can think of others, I had
> time to analyse things in more detail. I'd be happier with the general
> assessment that "it's unsafe to keep history files in the archive".
> 
> My suggestion is that we keep history files in a new directory under the
> data directory. That way they get copied as part of the base backup,
> rather than sent off to the archive where DBAs can have mad moments and
> delete all, or worse, just some of them. Implementation for this
> proposal is really easy and safe for where we are now: we just access
> the appropriate local directory. Call it pg_history or pg_timeline etc..
> Not under pg_xlog!
> 
> There is no particular reason to send history files to the archive,
> since new ones are only ever generated at the end of an archive
> recovery.

Consider this:

1. Take base backup, on timeline 1. Archive to directory X
2. Disaster.
3. restore from base backup and the archive. Timeline ID is incremented 
to 2. Keep archiving to directory X.
4. Another disaster.
5. Restore again from the base backup and archive. Timeline ID is 
incremented to 3.

If the history files are not in the archive, where is the restore at 
step 5 going to get the history file for timeline 2? You certainly need 
the history files in the archive.

The history files should be considered as part of the WAL data. They 
need to be archived together with the WAL segments. When you take a new 
base backup, you no longer need the history files for old timelines, 
just like you don't need old WAL.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com


Mikael Krantz wrote:
> On Fri, May 15, 2009 at 2:22 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> Let's document that timeline files should not be deleted from the
>> archive iff there exists a base backup made during a lower numbered
>> timeline.
> 
> Or made during a higher numbered timeline which happens to start in a
> WAL-file containing records from a lower numbered timeline...

That was the original issue you ran into. That has now been fixed by 
forcing an xlog switch at pg_start_backup(), so that you can't start a 
backup in a WAL file that contains records from a lower numbered timeline.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com


On Fri, May 15, 2009 at 2:22 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> Let's document that timeline files should not be deleted from the
> archive iff there exists a base backup made during a lower numbered
> timeline.

Or made during a higher numbered timeline which happens to start in a
WAL-file containing records from a lower numbered timeline...

/M


Simon Riggs wrote:
> ehem, "It will work fine" isn't correct, as Fujii-san observes.

What exactly are the steps required to run into that problem? I fail to 
see what the problem is.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com


Simon Riggs wrote:
> On Fri, 2009-05-15 at 12:56 +0100, Simon Riggs wrote:
> 
>> There is no particular reason to send history files to the archive,
>> since new ones are only ever generated at the end of an archive
>> recovery.
> 
> It also clears up a long standing confusion between backup history files
> and timeline history files. The backup history file(s) do need to go to
> the archive, whereas the timeline file(s) do not.

(blush). Umm, and what is the distinction again? I thought they're the 
same thing..

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com


Hi,

On Fri, May 15, 2009 at 9:22 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> If you delete history file and all the WAL for timeline 6, yeah, nothing
>> stops it from being reused. It will work just fine, as if it never
>> existed. If you still have the history file and WAL for the old timeline
>> 6 lying around somewhere else like an older offsite backup, it's easy
>> for the administrator to get confused, but there isn't much we can do
>> about that.
>
> ehem, "It will work fine" isn't correct, as Fujii-san observes.

Yes. In the case which I described, 6 is treated as timeline newer than 7.
At least, this is against the current premise that timeline IDs must be in
increasing sequence.

> Let's document that timeline files should not be deleted from the
> archive iff there exists a base backup made during a lower numbered
> timeline.

Agreed.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


On Fri, 2009-05-15 at 15:41 +0300, Heikki Linnakangas wrote:
> Simon Riggs wrote:
> > On Fri, 2009-05-15 at 12:56 +0100, Simon Riggs wrote:
> > 
> >> There is no particular reason to send history files to the archive,
> >> since new ones are only ever generated at the end of an archive
> >> recovery.
> > 
> > It also clears up a long standing confusion between backup history files
> > and timeline history files. The backup history file(s) do need to go to
> > the archive, whereas the timeline file(s) do not.
> 
> (blush). Umm, and what is the distinction again? I thought they're the 
> same thing..

Some additional code refers to "backup history" files when it means
backup label files, which are then easily confused with the timeline
history files.

--
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Training, Services and Support



On Fri, May 15, 2009 at 2:26 PM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
> That was the original issue you ran into. That has now been fixed by forcing
> an xlog switch at pg_start_backup(), so that you can't start a backup in a
> WAL file that contains records from a lower numbered timeline.

Ah, sorry.

/M


On Fri, 2009-05-15 at 15:34 +0200, Mikael Krantz wrote:
> On Fri, May 15, 2009 at 2:26 PM, Heikki Linnakangas
> <heikki.linnakangas@enterprisedb.com> wrote:
> > That was the original issue you ran into. That has now been fixed by forcing
> > an xlog switch at pg_start_backup(), so that you can't start a backup in a
> > WAL file that contains records from a lower numbered timeline.
> 
> Ah, sorry.

No worries. Thanks for reporting the original bug and for staying
involved while we think of how to handle the problems it highlights.

--
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Training, Services and Support



Hi,

On Fri, May 15, 2009 at 8:56 PM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
> Fujii Masao wrote:
>>
>> On Fri, May 15, 2009 at 8:20 PM, Heikki Linnakangas
>> <heikki.linnakangas@enterprisedb.com> wrote:
>>>
>>> The probe in findNewestTimeLine() initialized to recovery target timeline + 1.
>>> It doesn't require history files for any old timelines to be present.
>>
>> What if recovery_target_timeline = 'latest'? The unexpected (not latest)
>> recovery target timeline might be chosen when some timeline history
>> files don't exist.
>>
>>> The
>>> purpose of findNewestTimeLine() is to ensure that if you e.g recover to a
>>> point in time in timeline 5, and there's already WAL files for timelines
>>> 6 and 7 in the archive, we pick a unique timeline id.
>>
>> When only the history file for timeline 6 is deleted, timeline 6 would be
>> assigned as the newest one *again* at the end of archive recovery.
>> Is this safe?
>
> If you delete history file and all the WAL for timeline 6, yeah, nothing
> stops it from being reused. It will work just fine, as if it never existed.
> If you still have the history file and WAL for the old timeline 6 lying
> around somewhere else like an older offsite backup, it's easy for the
> administrator to get confused, but there isn't much we can do about that.

OK, I think I understand your point. The timeline history files whose
timeline ID is larger than that of the oldest backup must not be deleted
from the archive. On the other hand, those with a smaller or equal ID can be
deleted. Not all history files are necessary. So, if we don't keep older
backups, we can probably delete all files in the archive before
pg_start_backup().
Is my understanding right?
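
The rule proposed in this message can be written down as a small Python sketch. It encodes only the rule as phrased here, which is posed as a question and is not a confirmed policy; the function name and the sample timeline numbers are hypothetical.

```python
def split_history_files(history_tlis, oldest_backup_tli):
    """Partition history-file timeline IDs by the rule stated above.

    Files for timelines newer than the oldest retained backup are kept;
    files for the same or older timelines are candidates for deletion.
    """
    keep = sorted(t for t in history_tlis if t > oldest_backup_tli)
    deletable = sorted(t for t in history_tlis if t <= oldest_backup_tli)
    return keep, deletable

# Hypothetical archive: history files for timelines 2..7,
# oldest retained backup taken on timeline 5.
keep, deletable = split_history_files(range(2, 8), oldest_backup_tli=5)
print("keep:", keep)            # keep: [6, 7]
print("deletable:", deletable)  # deletable: [2, 3, 4, 5]
```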

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
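
The probing behaviour discussed in this exchange can be modelled in a few lines. What follows is a simplified Python sketch, not the server's C code: `find_newest_timeline`, `next_timeline_after_recovery` and the `history_exists` callback are hypothetical stand-ins for findNewestTimeLine() and its history-file check, and the sketch assumes the probe stops at the first timeline ID for which no history file is found.

```python
from typing import Callable


def find_newest_timeline(start_tli: int, history_exists: Callable[[int], bool]) -> int:
    """Return the highest timeline ID reachable by probing upward from start_tli.

    Simplified model (assumption): probing starts at start_tli + 1 and stops
    at the first ID with no history file, so a gap hides any later timelines.
    """
    newest = start_tli
    probe = start_tli + 1
    while history_exists(probe):
        newest = probe
        probe += 1
    return newest


def next_timeline_after_recovery(target_tli: int, history_exists: Callable[[int], bool]) -> int:
    """The new timeline ID is one past the newest timeline the probe found."""
    return find_newest_timeline(target_tli, history_exists) + 1


# History files for timelines 6 and 7 are both present: recovery from 5 picks 8.
full_archive = {6, 7}
print(next_timeline_after_recovery(5, full_archive.__contains__))

# Only the history file for timeline 6 was deleted: the probe stops immediately
# and timeline 6 is handed out again, which is the case asked about above.
gapped_archive = {7}
print(next_timeline_after_recovery(5, gapped_archive.__contains__))
```

In this model a single missing history file is enough to make a timeline ID reusable, even when later history files are still in the archive.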


On Fri, 2009-05-15 at 22:56 +0900, Fujii Masao wrote:

> OK, I probably understood your point. The timeline history files whose
> timeline ID is larger than that of an oldest backup must not be deleted
> from the archive. On the other hand, the smaller or equal one can be
> deleted. Not all history files are necessary. So, if we don't keep older
> backup, we probably can delete all files in the archive before
> pg_start_backup().
> Is my understanding right?

Heikki is right in one sense: if you do pg_start_backup() then for
*that* backup you do not need earlier files. 

However, as you have pointed out, if you have *multiple* backups then
deleting history files may cause problems with an earlier backup.

It's standard practice to have >1 backup, so there is potential for
error and minimum is we must document that. 

Rather than explaining the problem and the rules by which we can work
out exactly which history files to keep, I think it is safer to say that
we must keep all history files.

-- Simon Riggs           www.2ndQuadrant.com
PostgreSQL Training, Services and Support




Simon Riggs wrote:
> On Fri, 2009-05-15 at 22:56 +0900, Fujii Masao wrote:
>
>   
>> OK, I probably understood your point. The timeline history files whose
>> timeline ID is larger than that of an oldest backup must not be deleted
>> from the archive. On the other hand, the smaller or equal one can be
>> deleted. Not all history files are necessary. So, if we don't keep older
>> backup, we probably can delete all files in the archive before
>> pg_start_backup().
>> Is my understanding right?
>>     
>
> Heikki is right in one sense: if you do pg_start_backup() then for
> *that* backup you do not need earlier files. 
>
> However, as you have pointed out, if you have *multiple* backups then
> deleting history files may cause problems with an earlier backup.
>
> It's standard practice to have >1 backup, so there is potential for
> error and minimum is we must document that. 
>
> Rather than explaining the problem and the rules by which we can work
> out exactly which history files to keep, I think it is safer to say that
> we must keep all history files.
>
>   

This whole area is unfortunately way too fragile. We need some way of 
managing these facilities that hides a lot of these details and is 
therefore less likely to produce shot feet, IMNSHO. I get very nervous 
every time I have to touch it.

cheers

andrew


Simon Riggs wrote:
> On Fri, 2009-05-15 at 22:56 +0900, Fujii Masao wrote:
> 
>> OK, I probably understood your point. The timeline history files whose
>> timeline ID is larger than that of an oldest backup must not be deleted
>> from the archive. On the other hand, the smaller or equal one can be
>> deleted. Not all history files are necessary. So, if we don't keep older
>> backup, we probably can delete all files in the archive before
>> pg_start_backup().
>> Is my understanding right?
> 
> Heikki is right in one sense: if you do pg_start_backup() then for
> *that* backup you do not need earlier files. 
> 
> However, as you have pointed out, if you have *multiple* backups then
> deleting history files may cause problems with an earlier backup.

Yes, just as deleting old WAL files.

> It's standard practice to have >1 backup, so there is potential for
> error and minimum is we must document that. 
> 
> Rather than explaining the problem and the rules by which we can work
> out exactly which history files to keep, I think it is safer to say that
> we must keep all history files.

The rule for determining which history files need to be retained is the 
same as for WAL files. Anything archived before pg_start_backup() was 
called for the oldest backup you still want to be able to restore can be 
deleted. And the alphabetical sorting property works with history files 
as well, you can call pg_xlogfile_name(pg_start_backup()) and delete 
anything < the return value from the archive.

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com
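
The name-ordering rule described above can be sketched as a dry-run helper. This is a minimal Python illustration under stated assumptions, not a supported tool: `archive_files_before` and its `keep_history` flag are hypothetical names, the cutoff is assumed to be the value returned by pg_xlogfile_name(pg_start_backup()) for the oldest backup you still want to restore, and the helper only lists candidates rather than deleting anything.

```python
from typing import Iterable, List


def archive_files_before(names: Iterable[str], cutoff: str,
                         keep_history: bool = False) -> List[str]:
    """List archive entries whose names sort before `cutoff`.

    Plain string comparison is used, relying on the alphabetical sorting
    property of WAL and history file names mentioned in the thread.
    keep_history=True (a hypothetical option) skips *.history files for
    administrators who prefer to retain every history file.
    """
    candidates = []
    for name in sorted(names):
        if name >= cutoff:
            continue  # needed by the oldest backup we still want
        if keep_history and name.endswith(".history"):
            continue  # small text files, cheap to keep
        candidates.append(name)
    return candidates


# Illustrative file names only; the cutoff would come from
# SELECT pg_xlogfile_name(pg_start_backup('label'));
archive = [
    "0000003C00000004000000EC",
    "0000003D.history",
    "0000003D0000000500000001",
    "0000003D0000000500000002",
]
cutoff = "0000003D0000000500000001"

print(archive_files_before(archive, cutoff))
print(archive_files_before(archive, cutoff, keep_history=True))
```

Printing the candidates first and deleting only after review keeps the more cautious position voiced in this thread available: with `keep_history=True` no history file is ever proposed for removal.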


On Fri, 2009-05-15 at 17:19 +0300, Heikki Linnakangas wrote:

> Yes, just as deleting old WAL files.

So what you're saying is because it's possible to blow your left foot
off, we're not concerned about blowing your right foot off either.

We've asked for some additional docs. What would be the objection to
that?

And you guys wonder why I get frustrated trying to fix things.

-- Simon Riggs           www.2ndQuadrant.com
PostgreSQL Training, Services and Support



Fujii Masao wrote:
> On Fri, May 15, 2009 at 8:56 PM, Heikki Linnakangas
> <heikki.linnakangas@enterprisedb.com> wrote:
>> Fujii Masao wrote:
>>> When only the history file for timeline 6 is deleted, timeline 6 would be
>>> assigned as the newest one *again* at the end of archive recovery.
>>> Is this safe?
>> If you delete history file and all the WAL for timeline 6, yeah, nothing
>> stops it from being reused. It will work just fine, as if it never existed.
>> If you still have the history file and WAL for the old timeline 6 lying
>> around somewhere else like an older offsite backup, it's easy for the
>> administrator to get confused, but there isn't much we can do about that.
> 
> OK, I probably understood your point. The timeline history files whose
> timeline ID is larger than that of an oldest backup must not be deleted
> from the archive. On the other hand, the smaller or equal one can be
> deleted. Not all history files are necessary. So, if we don't keep older
> backup, we probably can delete all files in the archive before
> pg_start_backup().
> Is my understanding right?

Yes, that's correct.

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com


Simon Riggs wrote:
> On Fri, 2009-05-15 at 17:19 +0300, Heikki Linnakangas wrote:
> 
>> Yes, just as deleting old WAL files.
> 
> So what you're saying is because it's possible to blow your left foot
> off, we're not concerned about blowing your right foot off either.

I don't get it. What are the left and right foot in that metaphor 
referring to?

> We've asked for some additional docs. What would be the objection to
> that?

I'm certainly not opposed to improving docs. And I agree with Andrew's 
sentiment that easier-to-use tools to manage PITR archives would be very 
helpful.

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com


On Fri, 2009-05-15 at 17:39 +0300, Heikki Linnakangas wrote:

> > We've asked for some additional docs. What would be the objection to
> > that?
> 
> I'm certainly not opposed to improving docs.

OK, so will you update the docs as requested?

-- Simon Riggs           www.2ndQuadrant.com
PostgreSQL Training, Services and Support



On Fri, 2009-05-15 at 10:17 -0400, Andrew Dunstan wrote:

> This whole area is unfortunately way too fragile. We need some way of 
> managing these facilities that hides a lot of these details and is 
> therefore less likely to produce shot feet, IMNSHO. I get very nervous
> every time I have to touch it.

I think it is complex, though that is because we now support a huge
number of use cases and options, to the benefit of many users. In fact,
more than I would like, but this is a group project.

Not sure why you say it's fragile; there have been very few bugs
considering the wide user base and those that have occurred have had
fixes submitted for them quickly. Yes, we require you to actually read
the docs, rather than open up psql and play, but this is business
critical stuff.

Realistically, we have more developers on this part of the code now than
any other. That's one reason for all the debate.

No problem in receiving feedback, just want to be able to understand it
sufficiently well to be able to enhance it.

-- Simon Riggs           www.2ndQuadrant.com
PostgreSQL Training, Services and Support




Simon Riggs wrote:
> On Fri, 2009-05-15 at 10:17 -0400, Andrew Dunstan wrote:
>
>   
>> This whole area is unfortunately way too fragile. We need some way of 
>> managing these facilities that hides a lot of these details and is 
>> therefore less likely to produce shot feet, IMNSHO. I get very nervous
>> every time I have to touch it.
>>     
>
> I think it is complex, though that is because we now support a huge
> number of use cases and options, to the benefit of many users. In fact,
> more than I would like, but this is a group project.
>
> Not sure why you say it's fragile; there have been very few bugs
> considering the wide user base and those that have occurred have had
> fixes submitted for them quickly. Yes, we require you to actually read
> the docs, rather than open up psql and play, but this is business
> critical stuff.
>
> Realistically, we have more developers on this part of the code now than
> any other. That's one reason for all the debate.
>
> No problem in receiving feedback, just want to be able to understand it
> sufficiently well to be able to enhance it.
>
>   

I don't mean that it has bugs. I mean that it's far too easy to get it 
wrong and far too hard to get it right. I have reduced my uses to a 
couple of cases where I have worked out, with some trial and error, 
recipes that I follow. If I find these facilities complex to use, and I 
make virtually 100% of my living working with Postgres, what are more 
ordinary users going to say? That's why I think we need at the very 
least some tools for supporting the most common use cases, and hiding 
the messy details.

And no, I haven't even begun to think of what such tools might look like.

cheers

andrew




Simon Riggs wrote:
> On Fri, 2009-05-15 at 17:39 +0300, Heikki Linnakangas wrote:
> 
>>> We've asked for some additional docs. What would be the objection to
>>> that?
>> I'm certainly not opposed to improving docs.
> 
> OK, so will you update the docs as requested?

Well, we already have this in the docs:

> Each time a new timeline is created, PostgreSQL creates a "timeline
> history" file that shows which timeline it branched off from and when.
> These history files are necessary to allow the system to pick the
> right WAL segment files when recovering from an archive that contains
> multiple timelines. Therefore, they are archived into the WAL archive
> area just like WAL segment files. The history files are just small
> text files, so it's cheap and appropriate to keep them around
> indefinitely (unlike the segment files which are large). You can, if
> you like, add comments to a history file to make your own notes about
> how and why this particular timeline came to be. Such comments will be
> especially valuable when you have a thicket of different timelines as
> a result of experimentation.
 

What exactly do you want to change? Patch, please.


--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com


On Fri, 2009-05-15 at 11:19 -0400, Andrew Dunstan wrote:

> I don't mean that it has bugs. I mean that it's far too easy to get it 
> wrong and far too hard to get it right. I have reduced my uses to a 
> couple of cases where I have worked out, with some trial and error, 
> recipes that I follow. If I find these facilities complex to use, and I 
> make virtually 100% of my living working with Postgres, what are more 
> ordinary users going to say? That's why I think we need at the very 
> least some tools for supporting the most common use cases, and hiding 
> the messy details.

I've never had a private comment complaining about the facilities in a
general way except from you and Josh Drake, though obviously I field
bugs and questions from users frequently. I regularly get emails saying
thanks, easy to use, much easier to manage than any other form of
replication. Most frequent comment is "I was told it was really hard,
but I see now that it is easy to understand and use".

People with HA or backup experience from other databases usually have no
problem understanding the concepts or the implementation.

> And no, I haven't even begun to think of what such tools might look like.

That's OK. Wanting it to be different is the first step. I want to
improve it as well, though without removing features.

-- Simon Riggs           www.2ndQuadrant.com
PostgreSQL Training, Services and Support



On Fri, 2009-05-15 at 18:46 +0300, Heikki Linnakangas wrote:

> Well, we already have this in the docs:
> 
> > Each time a new timeline is created, PostgreSQL creates a "timeline
> history" file that shows which timeline it branched off from and when.
> These history files are necessary to allow the system to pick the
> right WAL segment files when recovering from an archive that contains
> multiple timelines. Therefore, they are archived into the WAL archive
> area just like WAL segment files. The history files are just small
> text files, so it's cheap and appropriate to keep them around
> indefinitely (unlike the segment files which are large). You can, if
> you like, add comments to a history file to make your own notes about
> how and why this particular timeline came to be. Such comments will be
> especially valuable when you have a thicket of different timelines as
> a result of experimentation.
> 
> What exactly do you want to change? Patch, please.

I find this exchange between us quite strange. The discussion on this
thread has been fairly clear. Fujii-san and myself have both asked for
it to be documented that history files should not be deleted.

The above section says it's "appropriate to keep them around
indefinitely".

What it doesn't say is that if you delete them then you can experience
problems in certain circumstances, so we advise strongly not to do this.
It would be even better if there was a section on removing files from
the archive.

Do I really need to write a patch to say that, have you formally review
it, then change the wording to what you would have written in the first
place and then commit? Really? How many years do all of us have to work
together before we develop an efficient process for trivial changes such
as this?

-- Simon Riggs           www.2ndQuadrant.com
PostgreSQL Training, Services and Support



Simon Riggs wrote:
> On Fri, 2009-05-15 at 18:46 +0300, Heikki Linnakangas wrote:
>> What exactly do you want to change? Patch, please.
> 
> I find this exchange between us quite strange. The discussion on this
> thread has been fairly clear. Fujii-san and myself have both asked for
> it to be documented that history files should not be deleted.
> 
> The above section says it's "appropriate to keep them around
> indefinitely".
> 
> What it doesn't say is that if you delete them then you can experience
> problems in certain circumstances, so we advise strongly not to do this.
> It would be even better if there was a section on removing files from
> the archive.

Well, then again it does also say "These history files are necessary to 
allow the system to pick the right WAL segment files when recovering 
from an archive that contains multiple timelines." Necessary says "do 
not delete" to me.

> Do I really need to write a patch to say that, have you formally review
> it, then change the wording to what you would have written in the first
> place and then commit? Really?

Yes. It's not a trivial change for me, you're much better at writing 
documentation than I am. And it's still not 100% clear to me what you're 
having in mind.

> How many years do all of us have to work
> together before we develop an efficient process for trivial changes such
> as this?

It sure is a pain at times :-)

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com


Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
> Simon Riggs wrote:
>> Do I really need to write a patch to say that, have you formally review
>> it, then change the wording to what you would have written in the first
>> place and then commit? Really?

> Yes. It's not a trivial change for me, you're much better at writing 
> documentation than I am. And it's still not 100% clear to me what you're 
> having in mind.

I didn't read this thread earlier, but now that I have, it seems to be
making a mountain out of a molehill.  The original complaint seems to
have neglected the fact that existsTimeLineHistory() will pull history
files back from an archive.  Therefore, you can only get into trouble
if you archive the WAL segment files for a timeline and fail to keep the
associated history file in the same place.  It is entirely false that
you've got to keep the history files on the live server.

I've got no objection to clarifying the documentation's rather offhand
statement about this, but let's clarify it correctly.
        regards, tom lane
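
The point about history files being fetched back from the archive can be illustrated with a small model. This is a hedged Python sketch, not the server's implementation: `timeline_history_exists` and `history_file_name` are hypothetical helpers, and checking an `archive_dir` directory directly is an assumption standing in for whatever restore mechanism the server actually uses. The only detail taken as given is the history file naming scheme of eight hexadecimal digits followed by `.history`.

```python
from pathlib import Path


def history_file_name(tli: int) -> str:
    """Name of the timeline history file for timeline ID `tli`."""
    return f"{tli:08X}.history"


def timeline_history_exists(tli: int, local_dir: str, archive_dir: str) -> bool:
    """Report whether a history file for `tli` can be found.

    Simplified model (assumption): the file counts as present if it is in
    the live server's WAL directory or can be found in the archive, so a
    copy on the live server is not required.
    """
    name = history_file_name(tli)
    return (Path(local_dir) / name).is_file() or (Path(archive_dir) / name).is_file()
```

Under this model the only way to lose track of a timeline is to have its history file missing from the archive that holds its WAL segment files; removing it from the live server alone makes no difference.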


On Fri, 2009-05-15 at 18:03 -0400, Tom Lane wrote:

> I didn't read this thread earlier, but now that I have, it seems to be
> making a mountain out of a molehill.  

We've discussed a complex issue to pursue other nascent bugs. It's
confused all of us at some point, but seems we're thru that now.

Why do you think the issue on this thread has become a mountain? I don't
see anything other than a docs improvement coming out of it. (The last
thread on pg_standby *was* a mountain IMHO, but that has nothing to do
with this, other than the usual suspects being involved).

> It is entirely false that
> you've got to keep the history files on the live server.

There was a similar suggestion that was already clearly dropped, after
discussion.

I (still) think that keeping the history files that have been used to
build the current timeline would be an important documentary record for
DBAs, especially since we encourage people to add their own notes to
them. The safest place for them would be in the data directory. Keeping
them there would be a minor new feature, not any kind of bug fix.

> I've got no objection to clarifying the documentation's rather offhand
> statement about this, 

Cool

> but let's clarify it correctly.

Of course.

-- Simon Riggs           www.2ndQuadrant.com
PostgreSQL Training, Services and Support