On 15.03.2013 04:25, Michael Paquier wrote:
> Hi,
>
> When trying to *promote* a slave to master by removing recovery.conf and
> restarting the node, I found an assertion failure on the master branch:
> LOG: database system was shut down in recovery at 2013-03-15 10:22:27 JST
> TRAP: FailedAssertion("!(ControlFile->minRecoveryPointTLI != 1)", File: "xlog.c", Line: 4954)
> (gdb) bt
> #0 0x00007f95af03b2c5 in raise () from /usr/lib/libc.so.6
> #1 0x00007f95af03c748 in abort () from /usr/lib/libc.so.6
> #2 0x000000000086ce71 in ExceptionalCondition (conditionName=0x8f2af0 "!(ControlFile->minRecoveryPointTLI != 1)", errorType=0x8f0813 "FailedAssertion", fileName=0x8f076b "xlog.c", lineNumber=4954) at assert.c:54
> #3 0x00000000004fe499 in StartupXLOG () at xlog.c:4954
> #4 0x00000000006f9d34 in StartupProcessMain () at startup.c:224
> #5 0x000000000050ef92 in AuxiliaryProcessMain (argc=2, argv=0x7fffa6fc3d20) at bootstrap.c:423
> #6 0x00000000006f8816 in StartChildProcess (type=StartupProcess) at postmaster.c:4956
> #7 0x00000000006f39e9 in PostmasterMain (argc=6, argv=0x1c950a0) at postmaster.c:1237
> #8 0x000000000065d59b in main (argc=6, argv=0x1c950a0) at main.c:197
>
> OK, this is not the cleanest way to promote a node, as it skips the
> safety checks normally done at promotion, but 9.2 and previous versions
> allowed it.
>
> The assertion has been introduced by commit 3f0ab05 in order to record
> properly minRecoveryPointTLI in control file at the end of recovery in the
> case of a crash.
> However, when a slave node that was properly shut down in recovery is
> then restarted as a master, the code path of this assertion is taken.
> What do you think of the attached patch? It avoids updating
> recoveryTargetTLI and recoveryTargetIsLatest if the node was shut down
> while in recovery.
> Another possibility would be to add conditions to the assertion based
> on the state of ControlFile, but I think it is more consistent simply not to
> update those fields.
Simon, can you comment on this? ISTM we could just remove the assertion
and update the comment to mention that this can happen. If there is a
min recovery point, surely we always need to recover to the timeline
containing that point, so setting recoveryTargetTLI to
minRecoveryPointTLI seems sensible.
- Heikki