On Thu, Sep 26, 2019 at 5:15 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> Hi,
>
> When we do archive recovery from a database cluster whose timeline
> ID is more than 2, pg_wal/RECOVERYHISTORY remains even after archive
> recovery has completed.
>
> The cause seems to be commit cbc55da556b, which moved
> exitArchiveRecovery() to before writeTimeLineHistory().
> writeTimeLineHistory() restores the history file from the archive
> and therefore creates a RECOVERYHISTORY file in the pg_wal directory.
> We used to remove such temporary files in exitArchiveRecovery(), but
> with this commit the order of these calls is reversed. As a result,
> RECOVERYHISTORY is created after we have exited archive recovery
> mode, and it is left behind.
>
> To fix it, I think we can remove the RECOVERYHISTORY file before the
> history file is archived in writeTimeLineHistory(). Commit
> cbc55da556b is intended to minimize the window between the moment the
> file is written and the moment the end-of-recovery record is
> generated, so I don't think it's good to move exitArchiveRecovery()
> back after writeTimeLineHistory().
>
> This issue seems to exist in all supported versions as far as I can
> tell from reading the code, although I haven't tested all of them yet.
>
> I've attached a draft patch to fix this issue. A regression test
> might be required. Feedback and suggestions are very welcome.

What about moving the logic that removes RECOVERYXLOG and
RECOVERYHISTORY out of exitArchiveRecovery() and performing it
just before or after RemoveNonParentXlogFiles()? That looks simple.
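
Roughly, the relocated cleanup could look like the standalone sketch
below. This is not a patch and not actual PostgreSQL code:
remove_if_exists() and remove_recovery_temp_files() are hypothetical
names, and I'm assuming both temporary files sit directly under the
WAL directory that is passed in. A missing file is treated as success,
since either file may never have been created.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper (not a PostgreSQL function): unlink dir/name,
 * treating "file does not exist" as success.
 * Returns 0 on success or if the file was absent, -1 otherwise.
 */
static int
remove_if_exists(const char *dir, const char *name)
{
    char path[1024];
    int  n;

    assert(dir != NULL && name != NULL);

    n = snprintf(path, sizeof(path), "%s/%s", dir, name);
    if (n < 0 || (size_t) n >= sizeof(path))
        return -1;              /* path would not fit in the buffer */

    if (unlink(path) != 0 && errno != ENOENT)
        return -1;              /* real failure, e.g. EACCES */

    return 0;
}

/*
 * Hypothetical cleanup step: remove both temporary files that archive
 * restore may leave in the WAL directory. The idea is to call this
 * once, after the timeline history file has been written, so that a
 * RECOVERYHISTORY created at that point is also cleaned up.
 * Both removals are attempted even if the first one fails.
 */
static int
remove_recovery_temp_files(const char *waldir)
{
    int rc = 0;

    if (remove_if_exists(waldir, "RECOVERYXLOG") != 0)
        rc = -1;
    if (remove_if_exists(waldir, "RECOVERYHISTORY") != 0)
        rc = -1;

    return rc;
}
```

Calling remove_recovery_temp_files() a second time on the same
directory is harmless, so it does not matter much whether it runs
just before or just after RemoveNonParentXlogFiles().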

Regards,

--
Fujii Masao