Re: when the startup process doesn't - Mailing list pgsql-hackers

From Jehan-Guillaume de Rorthais
Subject Re: when the startup process doesn't
Date
Msg-id 20210422010919.4b655015@firost
In response to Re: when the startup process doesn't  (Andres Freund <andres@anarazel.de>)
Responses Re: when the startup process doesn't
List pgsql-hackers
On Wed, 21 Apr 2021 12:36:05 -0700
Andres Freund <andres@anarazel.de> wrote:

>  [...]  
> 
> I don't think that concern equally applies for what I am proposing
> here. For one, we already have minRecoveryPoint in ControlData, and we
> already use it for the purpose of determining where we need to recover
> to, albeit only during crash recovery. Imo that's substantially
> different from adding actual recovery progress status information to the
> control file.

Just for the record, when I talked about updating the status of the startup
process in the control file, I was thinking of storing the last known replayed
LSN, not some kind of variable string.

> 
> I also think that it'd actually be a significant reliability improvement
> if we maintained an approximate minRecoveryPoint during normal running:
> I've seen way too many cases where WAL files were lost / removed and
> crash recovery just started up happily. Only hitting problems months
> down the line. Yes, it'd obviously not be bulletproof, since we'd not want
> to add a significant stream of new fsyncs, but IME such WAL files
> lost/removed issues tend not to be about a few hundred bytes of WAL but
> many segments missing.

Maybe setting this minRecoveryPoint once per segment, near the beginning of
the segment, would be enough. At least the recovery process would then be
forced to actually replay up to the very last known segment.
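
To make the idea concrete, here is a rough sketch of the bookkeeping I have in
mind. It is plain Python, not PostgreSQL code: the class, its methods and the
update counter are all hypothetical names made up for illustration, and the
only real-world value is the default 16MB WAL segment size.

```python
# Illustrative sketch only, not PostgreSQL code. Every name below is
# hypothetical; only the 16MB default segment size matches PostgreSQL.

WAL_SEGMENT_SIZE = 16 * 1024 * 1024  # default WAL segment size in bytes


def segment_number(lsn, segment_size=WAL_SEGMENT_SIZE):
    """Return the number of the WAL segment containing byte position lsn."""
    return lsn // segment_size


class ApproxMinRecoveryPoint:
    """Approximate minimum recovery point, advanced at most once per segment."""

    def __init__(self, segment_size=WAL_SEGMENT_SIZE):
        self.segment_size = segment_size
        self.value = 0            # last value we would have persisted
        self.last_segment = None  # segment of the last persisted value
        self.updates = 0          # number of simulated control file writes

    def note_wal_insert(self, lsn):
        """Persist lsn only when it is the first one seen in a newer segment.

        Returns True when a (simulated) control file update happened.
        """
        seg = segment_number(lsn, self.segment_size)
        if self.last_segment is None or seg > self.last_segment:
            self.value = lsn
            self.last_segment = seg
            self.updates += 1
            return True
        return False

    def recovery_may_stop(self, replayed_lsn):
        """Recovery has gone far enough only once it reaches the stored value."""
        return replayed_lsn >= self.value
```

With this scheme the control file would be touched once per segment rather
than once per record, and a recovery that runs out of WAL before reaching the
stored value would know that segments are missing instead of starting up
happily.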

Regards,


