Re: Fix logging for invalid recovery timeline - Mailing list pgsql-hackers

From David Steele
Subject Re: Fix logging for invalid recovery timeline
Date
Msg-id d0b27c08-d30f-4a1d-990d-5c90ca0a650a@pgbackrest.org
In response to Re: Fix logging for invalid recovery timeline  ("Andrey M. Borodin" <x4mmm@yandex-team.ru>)
List pgsql-hackers
On 12/20/24 23:28, Andrey M. Borodin wrote:
> 
>> On 20 Dec 2024, at 20:37, David Steele <david@pgbackrest.org> wrote:
>>
>> "Latest checkpoint is at %X/%X on timeline %u, but in the history of the requested timeline, the server forked off from that timeline at %X/%X."
> 
> I think errdetail here is very hard to follow. I seem to understand what is going on after reading errmsg, but
> errdetail makes me uncertain.

Yeah, this one confuses users a lot. We see it mostly when a user 
accidentally promotes a standby and that standby pushes a history file 
and maybe some WAL on a new timeline, e.g. 2. The original primary 
continues to make backups on the original timeline 1. At some point a 
restore is required and Postgres by default wants to recover to the most 
recent timeline, but timeline 2 forks from timeline 1 before the latest 
backup was started so it is not accessible.
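
To make the scenario concrete, here is a minimal sketch of the check in
C. It is a simplified model and not the actual PostgreSQL code: the
names Lsn, HistoryEntry, find_switch_point and checkpoint_is_reachable
are hypothetical, and a history is reduced to "timeline, LSN where the
next timeline forked off".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for a WAL position; 0 means "no switch point". */
typedef uint64_t Lsn;

/*
 * One line of a simplified (hypothetical) timeline history: timeline
 * `tli` is followed until `end`, where the next timeline forked off.
 * end == 0 marks the newest timeline, which nothing has forked from.
 */
typedef struct
{
    uint32_t tli;
    Lsn      end;
} HistoryEntry;

/*
 * Look up where the history leaves timeline `tli`.
 * Returns false if `tli` is not part of this history at all.
 */
static bool
find_switch_point(const HistoryEntry *history, size_t n,
                  uint32_t tli, Lsn *switch_point)
{
    assert(history != NULL && switch_point != NULL);

    for (size_t i = 0; i < n; i++)
    {
        if (history[i].tli == tli)
        {
            *switch_point = history[i].end;
            return true;
        }
    }
    return false;
}

/*
 * A checkpoint can be used to recover toward the target timeline only
 * if its timeline is in the target's history and the fork happened at
 * or after the checkpoint.
 */
static bool
checkpoint_is_reachable(const HistoryEntry *history, size_t n,
                        uint32_t checkpoint_tli, Lsn checkpoint_lsn)
{
    Lsn switch_point;

    if (!find_switch_point(history, n, checkpoint_tli, &switch_point))
        return false;

    return switch_point == 0 || switch_point >= checkpoint_lsn;
}
```

In this model, if timeline 2 forked from timeline 1 at 0x3000000 and the
latest backup's checkpoint is on timeline 1 at 0x5000000, the checkpoint
is not reachable when the target is timeline 2, but it is reachable when
the target stays on timeline 1.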

The solution is to set the target timeline to current, but first the 
user needs to figure out what is going on, and this error message just 
doesn't contain enough information to do that. I have some ideas on how 
to make it better but that would probably be for HEAD only.

> If we call "tliSwitchPoint(CheckPointTLI, expectedTLEs, NULL);"
> don't we risk to have again
> ereport(ERROR,
> (errmsg("requested timeline %u is not in this server's history",
> tli)));
> ?

I'm not sure what you mean. For primary backups CheckPointTLI will 
always equal ControlFile->checkPointCopy.ThisTimeLineID so that 
shouldn't be a problem. For standby backups CheckPointTLI will be <= 
ControlFile->checkPointCopy.ThisTimeLineID since CheckPointTLI 
represents the timeline at the start of the backup. If a route from that 
timeline to the current timeline can't be found then I'd certainly 
expect an error.

I'll add this patch to the January CF.

Regards,
-David


