Tom Lane wrote:
> Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
>> Tom Lane wrote:
>>> ... I think it might be better to fix
>>> things so that InRecovery is maintained correctly in the bgwriter too.
>
>> We could set InRecovery=true in CreateCheckPoint if it's a startup
>> checkpoint, and reset it afterwards. I'm not 100% sure it's safe to have
>> bgwriter running with InRecovery=true at other times. Grepping for
>> InRecovery doesn't show anything that bgwriter calls, but it feels safer
>> that way.
>
> Actually, my thought was exactly that it would be better if it was set
> correctly earlier in the run --- if there ever are any places where it
> matters, this way is more likely to be right.

Well, we have RecoveryInProgress() now that answers the question "is
recovery still in progress in the system?". InRecovery now means "am I a
process that's performing WAL replay?".

> (I'm not convinced that
> it doesn't matter today, anyhow --- are we sure these places are not
> called in a restartpoint?)

Hmm, good point, I didn't think of restartpoints. But skimming through
all the references to InRecovery, I can't see any that would be reached
from a restartpoint.

>> Hmm, I see another small issue. We now keep track of the "minimum
>> recovery point". Whenever a data page is flushed, we set minimum
>> recovery point to the LSN of the page in XLogFlush(), instead of
>> fsyncing WAL like we do in normal operation. During the end-of-recovery
>> checkpoint, however, RecoveryInProgress() returns false, so we don't
>> update minimum recovery point in XLogFlush(). You're unlikely to be
>> bitten by that in practice; you would need to crash during the
>> end-of-recovery checkpoint, and then set the recovery target to an
>> earlier point. It should be fixed nevertheless.
>
> We would want the end-of-recovery checkpoint to act like it's not in
> recovery anymore for this purpose, no?

For the purpose of updating min recovery point, we want it to act like
it *is* still in recovery. But in the XLogFlush() call in
CreateCheckPoint(), we really want it to flush the WAL, not update min
recovery point.

A simple fix is to call UpdateMinRecoveryPoint() after the WAL replay is
finished, but before creating the checkpoint. exitArchiveRecovery()
seems like a good place.
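
To make the window concrete, here is a toy model of the problem and of the
proposed fix. It is a sketch under stated assumptions, not PostgreSQL
source: ToyState, toy_xlog_flush() and toy_finish_replay() are hypothetical
names, LSNs are plain integers, and it assumes the fix advances the minimum
recovery point to the end of the replayed WAL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only: none of these names exist in PostgreSQL. */
typedef uint64_t ToyLSN;

typedef struct ToyState
{
    bool   recovery_in_progress; /* stands in for RecoveryInProgress() */
    ToyLSN min_recovery_point;   /* stands in for the stored minimum recovery point */
    ToyLSN wal_flushed_upto;     /* how far WAL has been flushed */
} ToyState;

/*
 * Models the XLogFlush() behaviour described in the thread: while recovery
 * is in progress, advance the minimum recovery point to the page's LSN;
 * otherwise flush WAL up to that LSN.
 */
static void
toy_xlog_flush(ToyState *s, ToyLSN page_lsn)
{
    assert(s != NULL);

    if (s->recovery_in_progress)
    {
        if (page_lsn > s->min_recovery_point)
            s->min_recovery_point = page_lsn;
    }
    else
    {
        if (page_lsn > s->wal_flushed_upto)
            s->wal_flushed_upto = page_lsn;
    }
}

/*
 * Models the end of WAL replay. With apply_fix set, the minimum recovery
 * point is brought up to the end of replay (an assumption of this sketch)
 * before the recovery flag is cleared, i.e. before the end-of-recovery
 * checkpoint starts flushing pages.
 */
static void
toy_finish_replay(ToyState *s, ToyLSN end_of_replay, bool apply_fix)
{
    assert(s != NULL);

    if (apply_fix && end_of_replay > s->min_recovery_point)
        s->min_recovery_point = end_of_replay;

    s->recovery_in_progress = false;
}
```

In this model, if replay ends at LSN 500 with the minimum recovery point at
100 and the checkpoint then flushes a page with LSN 300, the unfixed path
leaves the minimum recovery point at 100, while the fixed path has already
moved it to 500. In both cases the checkpoint's own flush goes to WAL, which
is the behaviour wanted for the XLogFlush() call in CreateCheckPoint().
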
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com