Re: Race condition in recovery? - Mailing list pgsql-hackers

From Dilip Kumar
Subject Re: Race condition in recovery?
Date
Msg-id CAFiTN-s=5RRrrXCjpQ4q6LS40-_2C9i3Kr=qHsHL-u7qcpLsYg@mail.gmail.com
Whole thread Raw
In response to Re: Race condition in recovery?  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Fri, Jan 22, 2021 at 2:05 AM Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Thu, Jan 21, 2021 at 4:00 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > 8. Node3, get it because walsender of Node2 read it from TL13 and send
> > it and Node2 write in the new WAL file but with TL12.
> >
> > WalSndSegmentOpen()
> > {
> > /*-------
> > * When reading from a historic timeline, and there is a timeline switch
> > * within this segment, read from the WAL segment belonging to the new
> > * timeline.
> > }
> >
> > 9. Node3, now set the expectedTLEs to 12 because that is what
> > walreceiver has streamed the WAL using.
>
> This seems to be incorrect, because the comment for expectedTLEs says:
>
>  * expectedTLEs: a list of TimeLineHistoryEntries for
> recoveryTargetTLI and the timelines of
>  * its known parents, newest first (so recoveryTargetTLI is always the
>  * first list member).  Only these TLIs are expected to be seen in the WAL
>  * segments we read, and indeed only these TLIs will be considered as
>  * candidate WAL files to open at all.
>
> But in your scenario apparently we end up with a situation that
> contradicts that, because you go on to say:
>
> > 10. Node3, now recoveryTargetTLI is 13 and expectedTLEs is 12.  So
>
> As I understand, expectedTLEs should end up being a list where the
> first element is the timeline we want to end up on, and the last
> element is the timeline where we are now, and every timeline in the
> list branches off of the next timeline in the list. So here if 13
> branches of 12 then the list should be 13,12 not just 12; and if 13
> does not branch off of 12 OR if 13 branches off of 12 at an earlier
> point in the WAL stream than where we are now, then that should be an
> error that shuts down the standby, because then there is no way for
> replay to get from where it is now to the correct timeline.

Yeah, I agree with this.  So IMHO the expectedTLEs should be set based
on the recoveryTargetTLI instead of receiveTLI.  Based on the
expectedTLEs definition it can never be correct to set it based on the
receiveTLI.

Basically, the simple fix could be this.

diff --git a/src/backend/access/transam/xlog.c
b/src/backend/access/transam/xlog.c
index b18257c198..465bc7c929 100644

--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -12533,7 +12533,8 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr,
bool randAccess,
                                                if (readFile < 0)
                                                {
                                                        if (!expectedTLEs)
-
expectedTLEs = readTimeLineHistory(receiveTLI);
+
expectedTLEs = readTimeLineHistory(recoveryTargetTLI);
+
                                                       readFile =
XLogFileRead(readSegNo, PANIC,

                                 receiveTLI,

                                 XLOG_FROM_STREAM, false);

But I am afraid that whether this adjustment (setting based on
receiveTLI) is done based on some analysis or part of some bug fix.  I
will try to find the history of this and maybe based on that we can
make a better decision.


--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: Dilip Kumar
Date:
Subject: Re: Is Recovery actually paused?
Next
From: Bharath Rupireddy
Date:
Subject: Re: [PATCH] postgres_fdw connection caching - cause remote sessions linger till the local session exit