Re: 9.2.3 crashes during archive recovery - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: 9.2.3 crashes during archive recovery
Msg-id 511E3CFB.1010208@vmware.com
In response to Re: 9.2.3 crashes during archive recovery  (Ants Aasma <ants@cybertec.at>)
List pgsql-hackers
On 15.02.2013 13:05, Ants Aasma wrote:
> On Wed, Feb 13, 2013 at 10:52 PM, Simon Riggs<simon@2ndquadrant.com>  wrote:
>> The problem is that we startup Hot Standby before we hit the min
>> recovery point because that isn't recorded. For me, the thing to do is
>> to make the min recovery point == end of WAL when state is
>> DB_IN_PRODUCTION. That way we don't need to do any new writes and we
>> don't need to risk people seeing inconsistent results if they do this.
>
> While this solution would help solve my issue, it assumes that the
> correct amount of WAL files are actually there. Currently the docs for
> setting up a standby refer to "24.3.4. Recovering Using a Continuous
> Archive Backup", and that step recommends emptying the contents of
> pg_xlog. If this is chosen as the solution the docs should be adjusted
> to recommend using pg_basebackup -x for setting up the standby.

When the backup is taken using pg_start_backup or pg_basebackup,
minRecoveryPoint is set correctly anyway, and it's OK to clear out
pg_xlog. It's only if you take the backup using an atomic filesystem
snapshot, or just kill -9 the server and take a backup while it's not
running, that we have a problem. In those scenarios, you should not
clear pg_xlog.

Attached is a patch for git master. The basic idea is to split
InArchiveRecovery into two variables, InArchiveRecovery and
ArchiveRecoveryRequested. ArchiveRecoveryRequested is set when
recovery.conf exists. But if we don't know how far we need to recover,
we first perform crash recovery with InArchiveRecovery=false. When we
reach the end of WAL in pg_xlog, InArchiveRecovery is set, and we
continue with normal archive recovery.
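
To make the two-flag idea concrete, here is a minimal sketch in C. It is not the attached patch and not the actual xlog.c code: the struct, the function names, and the min_recovery_point_known parameter are all hypothetical, chosen only to model the state transitions described above.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical, simplified model of the two flags described in the mail.
 * The names mirror ArchiveRecoveryRequested and InArchiveRecovery, but
 * nothing here is taken from the real PostgreSQL sources.
 */
typedef struct RecoveryState
{
    bool archive_recovery_requested; /* recovery.conf exists */
    bool in_archive_recovery;        /* currently doing archive recovery */
} RecoveryState;

/*
 * Pick the starting mode. If archive recovery was requested but we do not
 * know how far we must replay (assumption: modelled here by a single
 * boolean), start in plain crash recovery first.
 */
RecoveryState
recovery_begin(bool recovery_conf_exists, bool min_recovery_point_known)
{
    RecoveryState st;

    st.archive_recovery_requested = recovery_conf_exists;
    st.in_archive_recovery = recovery_conf_exists && min_recovery_point_known;
    return st;
}

/*
 * Called when replay reaches the end of the WAL found in pg_xlog.
 * Returns true if we switched from crash recovery to archive recovery,
 * false if there is nothing further to switch to.
 */
bool
recovery_on_end_of_local_wal(RecoveryState *st)
{
    /* Invariant: archive recovery can only be active if it was requested. */
    assert(!st->in_archive_recovery || st->archive_recovery_requested);

    if (st->archive_recovery_requested && !st->in_archive_recovery)
    {
        st->in_archive_recovery = true;
        return true;
    }
    return false;
}
```

With recovery_begin(true, false), the state starts in crash recovery, and the first call to recovery_on_end_of_local_wal() flips it into archive recovery. Without recovery.conf the same call leaves the state untouched.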

> As a
> related point, pointing standby setup to that section has confused at
> least one of my clients. That chapter is rather scarily complicated
> compared to what's usually necessary.

Yeah, it probably could use some editing, as the underlying code has
evolved a lot since it was written. The suggestion to clear out pg_xlog
seems like an unnecessary complication. Doing so is safe if you restore
from an archive, but it is not required.

The "File System Level Backup" chapter
(http://www.postgresql.org/docs/devel/static/backup-file.html) probably
should mention "pg_basebackup -x", too.

Docs patches are welcome.

- Heikki

