/*
 * Initialize shared replayEndRecPtr, lastReplayedEndRecPtr, and
 * recoveryLastXTime.
 *
 * This is slightly confusing if we're starting from an online
 * checkpoint; we've just read and replayed the checkpoint record, but
 * we're going to start replay from its redo pointer, which precedes
 * the location of the checkpoint record itself. So even though the
 * last record we've replayed is indeed ReadRecPtr, we haven't
 * replayed all the preceding records yet. That's OK for the current
 * use of these variables.
 */
SpinLockAcquire(&xlogctl->info_lck);
xlogctl->replayEndRecPtr = ReadRecPtr;
xlogctl->lastReplayedEndRecPtr = EndRecPtr;
xlogctl->recoveryLastXTime = 0;
xlogctl->currentChunkStartTime = 0;
xlogctl->recoveryPause = false;
SpinLockRelease(&xlogctl->info_lck);
I think we need to fix that confusion. Your patch will do it by not setting EndRecPtr yet; that fixes the bug, but leaves those variables in a slightly strange state; I'm not sure what EndRecPtr points to in that case (0 ?), but ReadRecPtr would be set I guess.
Yes, the values were set like below.
ReadRecPtr:1/8E7F0B0
EndRecPtr:0/0
Perhaps we should reset replayEndRecPtr and lastReplayedEndRecPtr to the REDO point here, instead of ReadRecPtr/EndRecPtr.
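To make that suggestion concrete, here is a minimal sketch of initializing both shared pointers to the REDO point. It is a toy model, not the server code: ToyRecPtr, ToyRecoveryState, toy_ptr_lt and toy_init_replay_pointers are hypothetical names invented for illustration, and the spinlock handling from the quoted snippet is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for the two-part WAL pointer printed as "1/8E7F0B0". */
typedef struct
{
    uint32_t xlogid;   /* log file number */
    uint32_t xrecoff;  /* byte offset within that log file */
} ToyRecPtr;

/* Hypothetical subset of the shared recovery state; not the real structure. */
typedef struct
{
    ToyRecPtr replayEndRecPtr;
    ToyRecPtr lastReplayedEndRecPtr;
} ToyRecoveryState;

/* True if a is strictly before b in WAL order. */
static bool
toy_ptr_lt(ToyRecPtr a, ToyRecPtr b)
{
    return a.xlogid < b.xlogid ||
        (a.xlogid == b.xlogid && a.xrecoff < b.xrecoff);
}

/*
 * Initialize both shared pointers to the checkpoint's REDO point, so that
 * neither of them claims WAL beyond the replay start has been applied.
 */
static void
toy_init_replay_pointers(ToyRecoveryState *state,
                         ToyRecPtr redo, ToyRecPtr checkpoint_rec)
{
    /* The REDO point can never be after the checkpoint record itself. */
    assert(!toy_ptr_lt(checkpoint_rec, redo));

    state->replayEndRecPtr = redo;
    state->lastReplayedEndRecPtr = redo;
}
```

With this shape, both variables start out equal to the REDO point and only move forward as records are actually replayed, which avoids the "replayed the checkpoint record but not the records before it" state described in the comment.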
I made another patch.
I added a ReadRecord call to check whether the record at the REDO location is present.
A similar check is already done when we use backup_label.
Because ReadRecord returns a record that has already been read,
I set ReadRecPtr of the record to EndRecPtr. I also set record->xl_prev to ReadRecPtr.
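The idea of the presence check can be sketched roughly as follows. This is only an illustration under simplifying assumptions: the WAL is modelled as an in-memory array, positions are plain 64-bit offsets, and ToyLsn, ToyRecord, toy_read_record and toy_redo_location_present are hypothetical names, not the real ReadRecord interface.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy WAL position as a single 64-bit offset (an assumption for brevity). */
typedef uint64_t ToyLsn;

/* Hypothetical in-memory record: its start and the previous record's start. */
typedef struct
{
    ToyLsn start;
    ToyLsn prev;
} ToyRecord;

/* Return the record starting exactly at lsn, or NULL if that WAL is missing. */
static const ToyRecord *
toy_read_record(const ToyRecord *wal, size_t nrecords, ToyLsn lsn)
{
    for (size_t i = 0; i < nrecords; i++)
    {
        if (wal[i].start == lsn)
            return &wal[i];
    }
    return NULL;
}

/* Replay may start only if the record at the REDO location can be read. */
static bool
toy_redo_location_present(const ToyRecord *wal, size_t nrecords, ToyLsn redo)
{
    return toy_read_record(wal, nrecords, redo) != NULL;
}
```

In this model, a caller would refuse to begin recovery when toy_redo_location_present returns false, instead of discovering the missing WAL partway through replay.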
As you said, it also worked fine.
I'm not sure whether we should do the same thing for crash recovery, but for now I added this step only for the case where archive recovery is needed.