Re: Race condition in recovery? - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: Race condition in recovery? |
Date | |
Msg-id | CA+Tgmob9B_fCbXZvx8a-Az76EnDEs3u7oQGdkbzVon9Z0Nc2EQ@mail.gmail.com Whole thread Raw |
In response to | Re: Race condition in recovery? (Dilip Kumar <dilipbalaut@gmail.com>) |
Responses |
Re: Race condition in recovery?
Re: Race condition in recovery? |
List | pgsql-hackers |
On Tue, May 18, 2021 at 1:33 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > Yeah, it will be a fake 1-element list. But just to be clear that > 1-element can only be "ControlFile->checkPointCopy.ThisTimeLineID" and > nothing else, do you agree to this? Because we initialize > recoveryTargetTLI to this value and we might change it in > readRecoveryCommandFile() but for that, we must get the history file, > so if we are talking about the case where we don't have the history > file then "recoveryTargetTLI" will still be > "ControlFile->checkPointCopy.ThisTimeLineID". I don't think your conclusion is correct. As I understand it, you're talking about the state before ee994272ca50f70b53074f0febaec97e28f83c4e, because as of now readRecoveryCommandFile() no longer exists in master. Before that commit, StartupXLOG did this: recoveryTargetTLI = ControlFile->checkPointCopy.ThisTimeLineID; readRecoveryCommandFile(); expectedTLEs = readTimeLineHistory(recoveryTargetTLI); Now, readRecoveryCommandFile() can change recoveryTargetTLI. Before doing so, it will call existsTimeLineHistory if recovery_target_timeline was set to an integer, and findNewestTimeLine if it was set to latest. In the first case, existsTimeLineHistory() calls RestoreArchivedFile(), but that restores it using a temporary name, and KeepFileRestoredFromArchive is not called, so we might have the timeline history in RECOVERYHISTORY but that's not the filename we're actually going to try to read from inside readTimeLineHistory(). In the second case, findNewestTimeLine() will call existsTimeLineHistory() which results in the same situation. So I think when readRecoveryCommandFile() returns expectedTLI can be different but the history file can be absent since it was only ever restored under a temporary name. > Conclusion: > - I think now we agree on the point that initializing expectedTLEs > with the recovery target timeline is the right fix. > - We still have some differences of opinion about what was the > original problem in the base code which was fixed by the commit > (ee994272ca50f70b53074f0febaec97e28f83c4e). I am also still concerned about whether we understand in exactly what cases the current logic doesn't work. We seem to more or less agree on the fix, but I don't think we really understand precisely what case we are fixing. -- Robert Haas EDB: http://www.enterprisedb.com
pgsql-hackers by date: