Hello, 9.2.3 crashes during archive recovery.
This was also corrected at some point on origin/master, along
with another problem, by the commit below, if my memory is
correct. But current HEAD and 9.2.3 both crash during archive
recovery (not on standby) due to the 'marking deleted page
visible' problem.
http://www.postgresql.org/message-id/50C7612E.5060008@vmware.com
The attached script reproduces the situation.
The sequence of events is as follows:
1. StartupXLOG() initializes minRecoveryPoint with ControlFile->CheckPoint.redo (virtually) at the start of the
archive recovery process.
2. The consistency state becomes true just before redo starts.
3. Redoing a certain XLOG_HEAP2_VISIBLE record causes a crash because the visibility map page has already been removed
by smgr_truncate(), which emitted an XLOG_SMGR_TRUNCATE record later in the WAL.
> PANIC: WAL contains references to invalid pages
So the consistency point should be postponed until the XLOG_SMGR_TRUNCATE record.
In this case, the FINAL consistency point is at the
XLOG_SMGR_TRUNCATE record, but the current implementation does
not record the consistency point (checkpoint, commit, or
smgr_truncate) itself, so we cannot predict the final
consistency point at the start of recovery.
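
The sequence above can be sketched as a toy model. This is not
PostgreSQL code: the types, replay_toy(), demo_toy() and the LSN
values are all hypothetical, and it assumes that a reference to a
missing page is tolerated before consistency is claimed, fatal
afterwards, and forgotten when a truncate record is replayed.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: these types and names are hypothetical, not PostgreSQL's. */
typedef enum { REC_VISIBLE, REC_TRUNCATE } RecKind;

typedef struct {
    unsigned long lsn;   /* position of the record in the toy WAL */
    RecKind kind;
} Record;

/*
 * Replays the records in order. The page touched by REC_VISIBLE is assumed
 * to be missing on disk already, because the truncation was applied to the
 * data files before the server stopped.
 * Returns the LSN at which replay would PANIC, or 0 if replay completes.
 */
unsigned long
replay_toy(const Record *recs, size_t n,
           unsigned long start_lsn, unsigned long min_recovery_point)
{
    bool reached_consistency = false;
    int pending_invalid = 0;            /* remembered references to missing pages */
    unsigned long replayed_upto = start_lsn;

    for (size_t i = 0;; i++)
    {
        /* Consistency check: before the first record and after each one. */
        if (!reached_consistency && min_recovery_point <= replayed_upto)
        {
            if (pending_invalid > 0)
                return replayed_upto;   /* unresolved invalid-page references */
            reached_consistency = true;
        }
        if (i == n)
            break;

        if (recs[i].kind == REC_VISIBLE)
        {
            if (reached_consistency)
                return recs[i].lsn;     /* missing page after consistency: PANIC */
            pending_invalid++;          /* tolerated until consistency is claimed */
        }
        else
            pending_invalid = 0;        /* the truncation explains the missing page */

        replayed_upto = recs[i].lsn;
    }
    return 0;
}

/* Toy WAL: redo starts at 100, VISIBLE record at 200, TRUNCATE record at 300. */
unsigned long
demo_toy(unsigned long min_recovery_point)
{
    static const Record wal[] = { {200, REC_VISIBLE}, {300, REC_TRUNCATE} };
    return replay_toy(wal, 2, 100, min_recovery_point);
}
```

With minRecoveryPoint at the redo point, demo_toy(100) returns
200, i.e. the model fails at the XLOG_HEAP2_VISIBLE record. With
minRecoveryPoint at the truncate record, demo_toy(300) returns 0
and replay completes.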
Recovery completed successfully with the small and rough
patch below. It allows multiple consistency points but also
kills quick cessation. (And of course no check has been done
against replication/promotion so far X-)
> --- a/src/backend/access/transam/xlog.c
> +++ b/src/backend/access/transam/xlog.c
> @@ -6029,7 +6029,9 @@ CheckRecoveryConsistency(void)
> */
> XLogCheckInvalidPages();
>
> - reachedConsistency = true;
> + //reachedConsistency = true;
> + minRecoveryPoint = InvalidXLogRecPtr;
> +
> ereport(LOG,
> (errmsg("consistent recovery state reached at %X/%X",
> (uint32) (XLogCtl->lastReplayedEndRecPtr >> 32),
On the other hand, updating the control file on every commit or
smgr_truncate would slow down transactions.
Regards,
--
Kyotaro Horiguchi
NTT Open Source Software Center
====== Replay script.
#! /bin/sh
PGDATA=$HOME/pgdata
PGARC=$HOME/pgarc
rm -rf $PGDATA/* $PGARC/*
initdb
cat >> $PGDATA/postgresql.conf <<EOF
wal_level = hot_standby
checkpoint_segments = 300
checkpoint_timeout = 1h
archive_mode = on
archive_command = 'cp %p $PGARC/%f'
EOF
pg_ctl -w start
createdb
psql <<EOF
select version();
create table t (a int);
insert into t values (5);
checkpoint;
vacuum;
delete from t;
vacuum;
EOF
pg_ctl -w stop -m i
cat >> $PGDATA/recovery.conf <<EOF
restore_command='if [ -f $PGARC/%f ]; then cp $PGARC/%f %p; fi'
EOF
postgres
=====