On Thu, Jul 29, 2021 at 1:47 AM Amul Sul <sulamul@gmail.com> wrote:
> Can we have an elog() fatal error or warning to make sure that the
> last checkpoint is still readable? Since the case where the user
> (knowingly or unknowingly) or some buggy code has removed the WAL file
> containing the last checkpoint could be possible. If it is then we
> would have a hard time finding out when we get further unexpected
> behavior due to this. Thoughts?

Sure, we could, but I don't think we should. Such crazy things can
happen at any time, not just at the point where this check is happening.
It's not particularly more likely to happen here vs. any other place
where we could insert a check. Should we check everywhere, all the
time, just in case?

> > I realize that conservatism may have played a role in this code ending
> > up looking the way that it does; someone seems to have thought it
> > would be better not to rely on a new idea in all cases. From my point
> > of view, though, it's scary to have so many cases, especially cases
> > that don't seem like they should ever be reached. I think that
> > simplifying the logic here and trying to do the same things in as many
> > cases as we can will lead to better robustness. Imagine if instead of
> > all the hairy logic we have now we just replaced this whole if
> > (InRecovery) stanza with this:
> >
> > if (InRecovery)
> >     CreateEndOfRecoveryRecord();
>
> +1, and do the checkpoint at the end unconditionally as we are doing
> for the promotion.

Yeah, that was my thought, too.
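
To make the shape of that concrete, here is a minimal, self-contained
sketch in C. It is not the real xlog.c code: finish_startup(), the
request_checkpoint_stub() helper and the two counters are hypothetical
names made up for illustration, and the CreateEndOfRecoveryRecord()
defined here is only a stub that counts calls. The only thing it is
meant to show is a single code path: if we were in recovery, write the
end-of-recovery record, then ask for a checkpoint afterwards, with no
other special cases.

```c
#include <stdbool.h>

/* Hypothetical counters, used only so the sketch can be observed. */
static int end_of_recovery_records = 0;
static int checkpoints_requested = 0;

/* Stub standing in for the real function in xlog.c; it only counts calls. */
static void
CreateEndOfRecoveryRecord(void)
{
    end_of_recovery_records++;
}

/* Hypothetical helper standing in for a checkpoint request. */
static void
request_checkpoint_stub(void)
{
    checkpoints_requested++;
}

/*
 * Hypothetical condensed end-of-startup path: one rule for every case
 * in which recovery was performed, instead of several special cases.
 */
static void
finish_startup(bool InRecovery)
{
    if (InRecovery)
        CreateEndOfRecoveryRecord();

    /* ... the rest of startup would happen here ... */

    /* Checkpoint at the end, as is already done for promotion. */
    if (InRecovery)
        request_checkpoint_stub();
}
```

Calling finish_startup(false) touches neither counter, and calling
finish_startup(true) bumps each of them exactly once, which is the
whole point: there is only one behavior to reason about.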
--
Robert Haas
EDB: http://www.enterprisedb.com