On Thu, Sep 26, 2019 at 6:23 PM Fujii Masao <masao.fujii@gmail.com> wrote:
>
> On Thu, Sep 26, 2019 at 5:15 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > Hi,
> >
> > When we do archive recovery from a database cluster whose timeline
> > ID is greater than 2, pg_wal/RECOVERYHISTORY remains even after
> > archive recovery has completed.
> >
> > The cause seems to be cbc55da556b, which moved exitArchiveRecovery()
> > to before writeTimeLineHistory(). writeTimeLineHistory() restores the
> > history file from the archive directory and therefore creates a
> > RECOVERYHISTORY file in the pg_wal directory. We used to remove such
> > temporary files in exitArchiveRecovery(), but with this commit the
> > order of calling these functions is reversed. Therefore we create the
> > RECOVERYHISTORY file after exiting archive recovery mode and leave
> > it behind.
> >
> > To fix it, I think we can remove the RECOVERYHISTORY file before the
> > history file is archived in writeTimeLineHistory(). Commit
> > cbc55da556b is intended to minimize the window between the moment the
> > file is written and the end-of-recovery record is generated, so I
> > think it's not good to put exitArchiveRecovery() after
> > writeTimeLineHistory().
> >
> > This issue seems to exist in all supported versions as far as I can
> > tell from the code, although I haven't tested all of them yet.
> >
> > I've attached a draft patch to fix this issue. A regression test
> > might be required. Feedback and suggestions are very welcome.
>
> What about moving the logic that removes RECOVERYXLOG and
> RECOVERYHISTORY from exitArchiveRecovery() and performing it
> just before/after RemoveNonParentXlogFiles()? That looks simple.
>
Agreed. I've attached the updated patch.
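
To make the ordering problem concrete, here is a toy model of the end-of-recovery sequence. It is only a sketch and not the attached patch: ToyWalDir and all toy_* functions are hypothetical stand-ins invented for illustration, and the real functions in xlog.c do far more than flip flags. It assumes RECOVERYXLOG is already present from WAL restore and that writing the timeline history restores a parent history file from the archive.

```c
#include <stdbool.h>

/* Toy stand-in for the temporary files that may exist in pg_wal. */
typedef struct ToyWalDir
{
    bool recovery_xlog;     /* models pg_wal/RECOVERYXLOG */
    bool recovery_history;  /* models pg_wal/RECOVERYHISTORY */
} ToyWalDir;

/* Models removing both temporary files. */
static void
toy_remove_temp_files(ToyWalDir *dir)
{
    dir->recovery_xlog = false;
    dir->recovery_history = false;
}

/*
 * Models writeTimeLineHistory(): restoring the parent history file from
 * the archive leaves a RECOVERYHISTORY file behind.
 */
static void
toy_write_timeline_history(ToyWalDir *dir)
{
    dir->recovery_history = true;
}

/*
 * Runs the toy end-of-recovery sequence and reports whether any
 * temporary file is left over afterwards.
 *
 * cleanup_after_history = false: cleanup happens as part of exiting
 *   archive recovery, i.e. before the history file is written.
 * cleanup_after_history = true:  cleanup is done as a separate step
 *   after the history file is written.
 */
bool
toy_leftover_after_recovery(bool cleanup_after_history)
{
    ToyWalDir dir = {true, false};  /* RECOVERYXLOG left by WAL restore */

    if (!cleanup_after_history)
    {
        toy_remove_temp_files(&dir);        /* exit archive recovery */
        toy_write_timeline_history(&dir);   /* recreates RECOVERYHISTORY */
    }
    else
    {
        toy_write_timeline_history(&dir);
        toy_remove_temp_files(&dir);        /* separate cleanup step */
    }

    return dir.recovery_xlog || dir.recovery_history;
}
```

In this model, cleaning up before the history file is written leaves RECOVERYHISTORY behind, while doing the cleanup as a later, separate step leaves nothing.
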
Regards,
--
Masahiko Sawada