Re: "using previous checkpoint record at" maybe not the greatest idea? - Mailing list pgsql-hackers

From Robert Haas
Subject Re: "using previous checkpoint record at" maybe not the greatest idea?
Date
Msg-id CA+TgmoZK77m+H3ZZ_9n+figFjDJyOru8xoWybNJEnHZzanZV-w@mail.gmail.com
In response to "using previous checkpoint record at" maybe not the greatest idea?  (Andres Freund <andres@anarazel.de>)
Responses Re: "using previous checkpoint record at" maybe not the greatest idea?
List pgsql-hackers
On Mon, Feb 1, 2016 at 6:58 PM, Andres Freund <andres@anarazel.de> wrote:
> Currently, when not in standby mode, if we can't read a checkpoint
> record, we automatically fall back to the previous checkpoint and start
> replay from there.
>
> Doing so without user intervention doesn't actually seem like a good
> idea. While not super likely, it's entirely possible that doing so can
> wreck a cluster that'd otherwise be easily recoverable. Imagine e.g. a
> tablespace being dropped - going back to the previous checkpoint very
> well could lead to replay not finishing, as the directory to create
> files in doesn't even exist.
>
> As there are, afaics, really no "legitimate" reasons for needing to go
> back to the previous checkpoint, I don't think we should do so in an
> automated fashion.
>
> All the cases where I could find logs containing "using previous
> checkpoint record at" were when something else had already gone pretty
> badly wrong. Now that obviously doesn't have a very large significance,
> because the situations where it "just worked" are unlikely to be
> reported...
>
> Am I missing a reason for doing this by default?

I agree: this seems like a terrible idea.  Would we still have some
way of forcing the older checkpoint record to be used if somebody
wants to try to do that?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
