<v3-0001-pg_rewind-Fix-bug-using-cascade-standby-as-source.patch>
Hi,
Thank you for addressing this issue!
The patch needs to be rebased, as it no longer applies on master, but here are some thoughts on the patch in general, without having tested it:
1. Regarding the approach to force a checkpoint on every restartpoint record, I wonder if it has any performance implications, since WAL replay will now wait for the restartpoint to finish instead of letting it happen in the background.
2. This change of behaviour should be documented in [1]; there’s a paragraph about restartpoints there.
3. It looks like some pg_rewind code that accommodates the "restartpoint < last common checkpoint" situation could be cleaned up as well. I found this at pg_rewind.c:669 on efcbb76efe, but maybe there’s more:
if (ControlFile_source.checkPointCopy.redo < chkptredo) …
There’s also a less invasive option to fix this problem that I think is worth exploring: detect this situation in pg_rewind and simply issue a CHECKPOINT on the standby.
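
To make the detection part concrete, here is a minimal sketch of the check, reusing the comparison quoted above. The helper name source_needs_checkpoint and the XLogRecPtr stand-in typedef are hypothetical and only there to keep the example self-contained; in pg_rewind the two values would be ControlFile_source.checkPointCopy.redo and chkptredo.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for PostgreSQL's XLogRecPtr (a 64-bit WAL position). */
typedef uint64_t XLogRecPtr;

/*
 * Hypothetical helper, not existing pg_rewind code.
 *
 * Returns true when the redo pointer recorded in the source's control file
 * is older than the redo pointer of the last common checkpoint, i.e. the
 * source standby has not yet completed a restartpoint past that checkpoint.
 */
static bool
source_needs_checkpoint(XLogRecPtr source_redo, XLogRecPtr chkptredo)
{
	return source_redo < chkptredo;
}
```

When this returns true, pg_rewind could presumably run CHECKPOINT over its connection to the source (during recovery that command forces a restartpoint) and then re-read the source control file. Whether that is acceptable for a pg_rewind run is something I haven't checked.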
Regards,
Ilya