Re: pg_rewind: warn when checkpoint hasn't happened after promotion - Mailing list pgsql-hackers

From Kyotaro Horiguchi
Subject Re: pg_rewind: warn when checkpoint hasn't happened after promotion
Msg-id 20220706.113842.34994619007220403.horikyota.ntt@gmail.com
In response to Re: pg_rewind: warn when checkpoint hasn't happened after promotion  (James Coleman <jtc331@gmail.com>)
Responses Re: pg_rewind: warn when checkpoint hasn't happened after promotion
List pgsql-hackers
At Tue, 5 Jul 2022 14:46:13 -0400, James Coleman <jtc331@gmail.com> wrote in 
> On Tue, Jul 5, 2022 at 2:39 PM Robert Haas <robertmhaas@gmail.com> wrote:
> >
> > On Sat, Jun 4, 2022 at 8:59 AM James Coleman <jtc331@gmail.com> wrote:
> > > A quick background refresher: after promoting a standby rewinding the
> > > former primary requires that a checkpoint have been completed on the
> > > new primary after promotion. This is correctly documented. However
> > > pg_rewind incorrectly reports to the user that a rewind isn't
> > > necessary because the source and target are on the same timeline.
> >
> > Is there anything intrinsic to the mechanism of operation of pg_rewind
> > that requires a timeline change, or could we just rewind within the
> > same timeline to an earlier LSN? In other words, maybe we could just
> > remove this limitation of pg_rewind, and then perhaps it wouldn't be
> > necessary to determine what the new timeline is.
>
> I think (someone can correct me if I'm wrong) that in theory the
> mechanisms would support the source and target being on the same
> timeline, but in practice that presents problems since you'd not have
> an LSN you could detect as the divergence point. If we allowed passing
> "rewind to" point LSN value, then that (again, as far as I understand
> it) would work, but it's a different use case. Specifically I wouldn't
> want that option to need to be used for this particular case since in
> my example there is in fact a real divergence point that we should be
> detecting automatically.

The point of pg_rewind is to find the divergence point, then find all
blocks modified in the dead history (after the divergence point) and
"replace" them with those from the live history. In that sense, to be
exact, pg_rewind does not "rewind" a cluster.  If there is no
divergence point, the last LSN of the cluster that is behind (the
target cluster?) serves as that point, and no block needs to be
replaced at all, because no WAL exists (on the cluster that is behind)
after that LSN.
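
To make the "replace, not rewind" idea concrete, here is a minimal toy
sketch in Python. It is not pg_rewind's actual code: the WalRecord
type, the integer LSNs and the (relation, block number) tuples are all
assumptions made for illustration, and it assumes that records starting
at or after the divergence LSN belong to the dead history.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WalRecord:
    """Hypothetical, simplified WAL record: start LSN plus the block it touched."""
    lsn: int
    block: tuple  # (relation name, block number); illustrative only


def blocks_to_replace(target_wal, divergence_lsn):
    """Return the set of blocks modified in the target's dead history.

    Assumption: a record whose start LSN is >= divergence_lsn is part of
    the dead history; earlier records are shared with the live history.
    """
    return {rec.block for rec in target_wal if rec.lsn >= divergence_lsn}


def replace_dead_blocks(target_blocks, source_blocks, target_wal, divergence_lsn):
    """Copy the live (source) version of every block touched after divergence.

    Blocks that do not exist on the source are dropped from the result.
    Untouched blocks are kept as they are on the target.
    """
    result = dict(target_blocks)
    for block in blocks_to_replace(target_wal, divergence_lsn):
        if block in source_blocks:
            result[block] = source_blocks[block]
        else:
            result.pop(block, None)
    return result
```

With no divergence point, divergence_lsn is just past the target's last
record, so blocks_to_replace() returns an empty set and nothing is
copied.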

The issue here is that pg_rewind looks into the control file to
determine the source timeline. The control file is not updated until
the first checkpoint after promotion completes, so pg_rewind still sees
the old timeline even though the file blocks have already diverged.

Even in that case, the history file for the new timeline has already
been created, so searching for the latest history file works.
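
As a rough illustration of that search (a sketch in Python, not what
pg_rewind does or would do), one could pick the highest timeline ID
among the history file names in the WAL directory. It relies on
timeline history files being named as eight uppercase hex digits
followed by ".history"; the function names and the directory argument
are hypothetical.

```python
import os
import re

# Timeline history files are named like "00000002.history":
# eight uppercase hex digits (the timeline ID) plus ".history".
HISTORY_RE = re.compile(r"^([0-9A-F]{8})\.history$")


def latest_timeline_from_names(names):
    """Return the highest timeline ID found among history file names.

    Returns None when no history file is present, which is the normal
    state of a cluster still on its initial timeline (timeline 1 has
    no history file).
    """
    best = None
    for name in names:
        match = HISTORY_RE.match(name)
        if match:
            tli = int(match.group(1), 16)
            if best is None or tli > best:
                best = tli
    return best


def latest_timeline_in_dir(wal_dir):
    """Hypothetical helper: scan a local WAL directory (e.g. pg_wal)."""
    return latest_timeline_from_names(os.listdir(wal_dir))
```

For a remote source the file list would have to be fetched over the
connection instead of with os.listdir(); that part is left out here.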

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center


