Re: Race condition in recovery? - Mailing list pgsql-hackers
From | Dilip Kumar |
---|---|
Subject | Re: Race condition in recovery? |
Date | |
Msg-id | CAFiTN-v+3DUbD9K5P5cK7ysjJ_EZZivLRh0tgJHyLSOecscCZA@mail.gmail.com |
In response to | Re: Race condition in recovery? (Kyotaro Horiguchi <horikyota.ntt@gmail.com>) |
List | pgsql-hackers |
On Tue, May 11, 2021 at 1:42 PM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:
>
> At Mon, 10 May 2021 14:27:21 +0530, Dilip Kumar <dilipbalaut@gmail.com> wrote in
> > On Mon, May 10, 2021 at 2:05 PM Kyotaro Horiguchi
> > <horikyota.ntt@gmail.com> wrote:
> > >
> > > I thought that the reason for using receiveTLI instead of
> > > recoveryTargetTLI here is that there's a case where receiveTLI is the
> > > future of recoveryTargetTLI, but I haven't successfully produced such a
> > > situation. If I set recoveryTargetTLI to a TLI that the standby doesn't
> > > know but the primary knows, validateRecoveryParameters immediately
> > > complains about that before reaching there. Anyway, the attached
> > > assumes receiveTLI may be the future of recoveryTargetTLI.
> >
> > See the note in this commit. It says "without the timeline
> > history file", so is it trying to say that although receiveTLI is the
> > ancestor of recoveryTargetTLI, it cannot detect that because of the
> > absence of the TL.history file?
>
> Yeah, it reads so for me and it works as described. What I don't
> understand is why the patch uses receiveTLI, not
> recoveryTargetTLI, to load the timeline history in
> WaitForWALToBecomeAvailable. The only possible reason is that there
> could be a case where receiveTLI is the future of recoveryTargetTLI.
> However, AFAICS it's impossible for that case to happen. At
> replication start, the requested TLI is that of the last checkpoint, which
> is the same as recoveryTargetTLI, or anywhere in the existing expectedTLEs,
> which must be the past of recoveryTargetTLI. That seems to have already been
> true at the time replication was made able to follow a timeline
> switch (abfd192b1b).
>
> So I was tempted to just load the history for recoveryTargetTLI and then
> confirm that receiveTLI is in the history. Actually, that change
> doesn't harm any of the recovery TAP tests. It is way simpler than
> the last patch. However, I'm not confident that it is right.. ;(

I first thought of fixing it the way you describe: instead of loading
the history of receiveTLI, load the history for recoveryTargetTLI. But
this commit (ee994272ca50f70b53074f0febaec97e28f83c4e) specifically
used the history file of receiveTLI to solve a particular issue which I
did not clearly understand. I am not sure whether it is a good idea to
use recoveryTargetTLI directly without understanding exactly why this
commit used receiveTLI. It doesn't seem like an oversight to me; it
seems intentional. Maybe Heikki can comment on this?

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
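
For readers following the thread, the simpler alternative discussed above (load the history for recoveryTargetTLI, then confirm that receiveTLI is in it) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from either patch: `HistoryEntry` and `tli_in_history` are hypothetical stand-ins, and the history is a plain array rather than the list the server actually builds from a TL.history file.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Timeline IDs are small unsigned integers. */
typedef uint32_t TimeLineID;

/* Hypothetical, simplified stand-in for one entry of a timeline history. */
typedef struct HistoryEntry
{
    TimeLineID tli;
} HistoryEntry;

/*
 * Return true if 'tli' appears in the given history, i.e. it is the
 * target timeline itself or one of its ancestors.
 */
static bool
tli_in_history(const HistoryEntry *history, size_t nentries, TimeLineID tli)
{
    for (size_t i = 0; i < nentries; i++)
    {
        if (history[i].tli == tli)
            return true;
    }
    return false;
}
```

With a history of {1, 2, 3} loaded for a recoveryTargetTLI of 3, a receiveTLI of 2 passes the check, while a receiveTLI of 4 (a timeline in the "future" of the target) does not. Whether that second case can actually occur is exactly the open question in this thread.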