[BUG?] lag of minRecoveryPont in archive recovery - Mailing list pgsql-hackers
From | Kyotaro HORIGUCHI |
---|---|
Subject | [BUG?] lag of minRecoveryPont in archive recovery |
Date | |
Msg-id | 20121206.130458.170549097.horiguchi.kyotaro@lab.ntt.co.jp Whole thread Raw |
Responses |
Re: [BUG?] lag of minRecoveryPont in archive recovery
Re: [BUG?] lag of minRecoveryPont in archive recovery |
List | pgsql-hackers |
Hello, I have a problem with PostgreSQL 9.2 with Pacemaker. HA standby sometime failes to start under normal operation. Testing with a bare replication pair showed that the standby failes startup recovery under the operation sequence shown below. 9.3dev too, but 9.1 does not have this problem. This problem became apparent by the invalid-page check of xlog, but 9.1 also has same glitch potentially. After the investigation, the lag of minRecoveryPoint behind EndRecPtr in redo loop seems to be the cause. The lag brings about repetitive redoing of unrepeatable xlog sequences such as XLOG_HEAP2_VISIBLE -> SMGR_TRUNCATE on the same page. So I did the same aid work as xact_redo_commit_internal for smgr_redo. While doing this, I noticed that CheckRecoveryConsistency() in redo apply loop should be after redoing the record, so moved it. The patch attached (for 9.3dev) puts down this failure, also works for 9.2, and seems no harm and maybe some grace for 9.1.. What do you think about this? The details follow. reagrds, -- Kyotaro Horiguchi NTT Open Source Software Center Operation sequence to cause the problem. ======= Mst$export PGDATA=$HOME/pgdatam Mst$rm -rf $PGDATA Mst$initdb Mst$vi $PGDATA/postgresql.conf | wal_level = hot_standby | checkpoint_segments = 300 | checkpoint_timeout = 1h | archive_mode= on | archive_command = 'cp %p <ARCHIVE_DIR>/%f' | synchronous_standby_names = '*' | hot_standby = on | log_checkpoints= on Mst$vi $PGDATA/pg_hba.conf | local replication horiguti trust Mst$pg_ctl start Stb$export PGDATA=$HOME/pgdatas Stb$export PGPORT=5433 Stb$rm -rf $PGDATA Stb$pg_basebackup -X s -F p -D $PGDATA -h /tmp -p 5432 Stb$vi $PGDATA/postgresql.conf | checkpoint_segments = 1 | checkpoint_timeout = 30s Stb$vi $PGDATA/recovery.conf | standby_mode=yes | primary_conninfo='host=/tmp port=5432' | restore_command='cp <ARCHIVE_DIR>/%f%p' Stb$pg_ctl start Mst$createdb Mst$psql Mst_psql=#create table t (a int); Mst_psql=#insert into t values (1); Mst_psql=#chekcpoint; Mst_psql=#vacuum; !!! Waiting for the completion of restart point of the standby !!! Mst_psql=#delete from t; Mst_psql=#vacuum; Stb$pg_ctl stop -m i Stb$vi PGDATA/recovery.conf | restore_command='cp <ARCHIVE_DIR>/%f %p' Stb$pg_ctl start LOG: database system was interrupted while in recovery at log time 2012-12-05 09:55:18 JST HINT: If this has occurred more than once some data might be corrupted and you might need to choose an earlier recoverytarget. LOG: entering standby mode LOG: redo starts at 0/A020890 WARNING: page 0 of relation base/16384/16385 does not exist PANIC: WAL contains references to invalid pages LOG: startup process (PID 18422) was terminated by signal 6: Aborted LOG: terminating any other active server processes ====== diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index 09d4dff..cf73b58 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -5759,9 +5759,6 @@ StartupXLOG(void) /* Handle interrupt signals of startup process */ HandleStartupProcInterrupts(); - /* Allow read-only connections if we're consistent now */ - CheckRecoveryConsistency(); - /* * Have we reached our recovery target? */ @@ -5833,6 +5830,9 @@ StartupXLOG(void) /* Pop the error context stack */ error_context_stack= errcallback.previous; + /* Allow read-only connections if we're consistent now */ + CheckRecoveryConsistency(); + if (!XLogRecPtrIsInvalid(ControlFile->backupStartPoint) && XLByteLE(ControlFile->backupEndPoint,EndRecPtr)) { diff --git a/src/backend/catalog/storage.c b/src/backend/catalog/storage.c index 993bc49..d34ab65 100644 --- a/src/backend/catalog/storage.c +++ b/src/backend/catalog/storage.c @@ -519,6 +519,12 @@ smgr_redo(XLogRecPtr lsn, XLogRecord *record) visibilitymap_truncate(rel, xlrec->blkno); FreeFakeRelcacheEntry(rel); + + /* + * Xlogs before this record is unrepeatable, so winding + * minRecoveryPoint to here. + */ + XLogFlush(lsn); } else elog(PANIC, "smgr_redo: unknown op code %u", info);
pgsql-hackers by date: