Re: add retry mechanism for achieving recovery target before emitting FATA error "recovery ended before configured recovery target was reached" - Mailing list pgsql-hackers

From	Kyotaro Horiguchi
Subject	Re: add retry mechanism for achieving recovery target before emitting FATA error "recovery ended before configured recovery target was reached"
Date	October 25, 2021 00:59:30
Msg-id	20211025.095930.625109845638100737.horikyota.ntt@gmail.com
In response to	add retry mechanism for achieving recovery target before emitting FATA error "recovery ended before configured recovery target was reached" (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>)
List	pgsql-hackers

At Wed, 20 Oct 2021 21:35:44 +0530, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote in 
> Hi,
> 
> The FATAL error "recovery ended before configured recovery target was
> reached" introduced by commit at [1] in PG 14 is causing the standby
> to go down after having spent a good amount of time in recovery. There
> can be cases where the arrival of required WAL (for reaching recovery
> target) from the archive location to the standby may take time and
> meanwhile the standby failing with the FATAL error isn't good.
> Instead, how about we make the standby wait for a certain amount of
> time (with a GUC) so that it can keep looking for the required WAL. If
> it gets the required WAL during the wait time, then it succeeds in
> reaching the recovery target (no FATAL error of course). If it
> doesn't, the timeout occurs and the standby fails with the FATAL
> error. The value of the new GUC can probably be set to the average
> time it takes for the WAL to reach archive location from the primary +
> from archive location to the standby, default 0 i.e. disabled.
> 
> I'm attaching a WIP patch. I've tested it on my dev system and the
> recovery regression tests are passing with it. I will provide a better
> version later, probably with a test case.
> 
> Thoughts?

That sounds like starting a server in non-hot-standby mode that fetches
WAL only from the archive. The only difference is that it has no timeout.

Doesn't that configuration meet your requirements?

Or, if timeout matters, I agree with Jeff. Retrying in restore_command
looks fine.
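
As a rough illustration of retrying inside restore_command, here is a
minimal sketch of a wrapper script. It assumes the archive is a plain
directory reachable from the standby; the script name, the function
restore_with_retry, the archive path, and the timeout values are all
hypothetical and not part of PostgreSQL or of the proposed patch.

```python
import os
import shutil
import sys
import time


def restore_with_retry(archive_dir, wal_name, dest_path,
                       timeout_s=300.0, interval_s=5.0):
    """Copy wal_name from archive_dir to dest_path, polling until timeout_s.

    Returns 0 on success and 1 if the file never showed up, matching the
    zero / nonzero exit status convention that restore_command expects.
    All parameter names and defaults here are illustrative assumptions.
    """
    src = os.path.join(archive_dir, wal_name)
    deadline = time.monotonic() + timeout_s
    while True:
        if os.path.isfile(src):
            # Copy under a temporary name, then rename, so a partially
            # copied file is never visible under the final name.
            tmp = dest_path + ".tmp"
            shutil.copyfile(src, tmp)
            os.replace(tmp, dest_path)
            return 0
        if time.monotonic() >= deadline:
            return 1
        time.sleep(interval_s)


# Hypothetical invocation: restore_retry.py ARCHIVE_DIR %f %p
if __name__ == "__main__" and len(sys.argv) == 4:
    sys.exit(restore_with_retry(sys.argv[1], sys.argv[2], sys.argv[3]))
```

With a script like this, the setting might look like
restore_command = 'python3 /path/to/restore_retry.py /mnt/archive %f %p'
(both paths are placeholders). Note that the server also asks
restore_command for files that may legitimately be absent from the
archive, such as timeline history files, so a long timeout delays those
lookups as well.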

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center