Timeline history files restored from archive not kept in pg_xlog, while WAL files are - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Timeline history files restored from archive not kept in pg_xlog, while WAL files are
Msg-id 50DF5388.1030906@vmware.com
List pgsql-hackers
The cascading replication patch made a change to the way WAL files
restored from archive are handled. Since then, when a WAL file is
restored from archive, it's copied under the correct filename to
pg_xlog. Aside from supporting cascading replication, this has the
advantage that if the archive subsequently goes offline, and the standby
is restarted, it can still recover back up to the point where it was
before. It also means that you can take an offline backup of the
standby, and pg_xlog includes all the files required to restore.

However, timeline history files are still not retained. When a standby
restores a timeline history file from the archive, it's restored under a
temporary filename, and discarded after it's read. That means that if
the latest checkpoint is on a WAL segment that includes an earlier
timeline switch, the archive must still be online so that the history
file can be restored again; otherwise you get an error like:

LOG:  unexpected timeline ID 1 in log file 0, segment 3, offset 0

This is a pre-existing issue in 9.2. In git master, it also means that
if a standby follows a master through the archive, a cascading standby
won't find the timeline history files on the first standby, and won't be
able to follow timeline switches.
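
For reference, the file names involved can be sketched as below. This is a
small Python illustration, not PostgreSQL code: the helper names are made up
for this example, and it assumes the pg_xlog naming in use at the time of
this message (timeline ID, log file and segment as three 8-digit hex fields,
with history files named after the timeline ID plus a .history suffix).

```python
def wal_segment_name(tli, log, seg):
    """Hypothetical helper: build a WAL segment file name.

    Assumes the layout of three 8-digit uppercase hex fields:
    timeline ID, log file number, segment number.
    """
    return f"{tli:08X}{log:08X}{seg:08X}"


def timeline_history_name(tli):
    """Hypothetical helper: build the history file name for a timeline."""
    return f"{tli:08X}.history"


if __name__ == "__main__":
    # Segment 3 of log file 0, as seen on timeline 2.
    print(wal_segment_name(2, 0, 3))
    # The history file a standby needs in order to follow the switch to timeline 2.
    print(timeline_history_name(2))
```

In these terms, a cascading standby that wants to follow a switch to
timeline 2 needs 00000002.history to be present in the upstream standby's
pg_xlog, which is exactly the file that is currently discarded.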

The straightforward fix is to treat timeline history files the same as
WAL files, and copy them into pg_xlog when they're restored from the
archive. Patch attached, along with the script I used to test it. Barring
objections, I'll commit the patch tomorrow.
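
To make the intended behaviour concrete, here is a toy Python model of the
difference. It is only a sketch of the idea, not the attached patch: the
function name, the keep flag and the temporary file names are assumptions
chosen for illustration, and plain directories stand in for the archive and
pg_xlog.

```python
import os
import shutil


def restore_from_archive(archive_dir, pg_xlog_dir, fname, keep=True):
    """Toy model of restoring one file from the archive (hypothetical).

    The file is first copied under a temporary name. With keep=False it
    stays under that temporary name, modelling the behaviour described
    above where the caller reads it and discards it. With keep=True it is
    renamed to its real name so it remains available in pg_xlog.
    Returns the resulting path, or None if the archive lacks the file.
    """
    src = os.path.join(archive_dir, fname)
    if not os.path.exists(src):
        return None

    # Illustrative temporary names; assumed, not taken from the patch.
    temp_name = "RECOVERYHISTORY" if fname.endswith(".history") else "RECOVERYXLOG"
    temp_path = os.path.join(pg_xlog_dir, temp_name)
    shutil.copyfile(src, temp_path)

    if not keep:
        return temp_path

    # Keep the restored file under its correct name, as is done for WAL files.
    final_path = os.path.join(pg_xlog_dir, fname)
    os.replace(temp_path, final_path)
    return final_path
```

With keep=True for both WAL segments and history files, everything the
standby has restored stays in pg_xlog, so a restart with the archive offline,
or a cascading standby connecting to this one, can find it there.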

- Heikki

Attachment
