Re: BUG #14109: pg_rewind fails to update target control file in one scenario - Mailing list pgsql-bugs

From Michael Paquier
Subject Re: BUG #14109: pg_rewind fails to update target control file in one scenario
Msg-id CAB7nPqRdd8ESxKmt=VvzgU171iRrzJABnEU6mCedaG-NDXSOvw@mail.gmail.com
In response to Re: BUG #14109: pg_rewind fails to update target control file in one scenario  (John Lumby <johnlumby@hotmail.com>)
Responses Re: BUG #14109: pg_rewind fails to update target control file in one scenario
List pgsql-bugs
On Mon, Apr 25, 2016 at 10:48 PM, John Lumby <johnlumby@hotmail.com> wrote:
> Thanks Michael,
>
> After the pg_rewind in the scenario I described,
>
> 1) on System B (new Primary) I see
>
> Sat Apr 23 14:19:18 EDT 2016
>
> control file indicates
> last check point WAL id : 0000000C00000009000000A3
>
>  client_addr |         backend_start         |  state  | sent_location | write_location | flush_location | replay_location
> -------------+-------------------------------+---------+---------------+----------------+----------------+-----------------
>  10.19.0.1   | 2016-04-23 18:19:50.812509+00 | startup | 9/A30000D0    | 9/A30000D0     | 9/A30000D0     | 9/A30000D0
>
>
> 2) whereas on System A after pg_rewind  I see

pg_rewind is a no-op here. It has done nothing to the source node.
When you ran it, its output clearly stated that "no rewind is needed".

> Sat Apr 23 14:19:54 EDT 2016
>
> control file indicates
> last check point WAL id : 0000000B00000009000000A3
>
>  pg_last_xlog_receive_location() , pg_last_xlog_replay_location() indicates
>
>  pg_last_xlog_receive_location | pg_last_xlog_replay_location
> -------------------------------+------------------------------
>  9/A3000000                    | 9/A30000D0
> (1 row)
>
> Note the difference in timeline

Yes, and? System A is still on its previous timeline 11, and will jump
to timeline 12 once it has connected back. That has been possible since 9.3.
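
For readers following the timeline difference: the first 8 hex digits of a WAL segment file name are the timeline ID, so the two control-file values quoted in this thread decode to timelines 11 and 12. A minimal Python sketch of that decoding follows. The helper name parse_wal_filename is illustrative and not part of PostgreSQL, and it assumes the usual 24-hex-digit segment file name.

```python
import re
from typing import NamedTuple


class WalSegment(NamedTuple):
    timeline: int  # timeline ID (first 8 hex digits)
    log_id: int    # high part of the WAL position (middle 8 hex digits)
    segment: int   # segment number within that log (last 8 hex digits)


def parse_wal_filename(name: str) -> WalSegment:
    """Split a 24-hex-digit WAL segment file name into its three fields."""
    if not re.fullmatch(r"[0-9A-Fa-f]{24}", name):
        raise ValueError(f"not a WAL segment file name: {name!r}")
    return WalSegment(
        timeline=int(name[0:8], 16),
        log_id=int(name[8:16], 16),
        segment=int(name[16:24], 16),
    )


if __name__ == "__main__":
    # Values taken from the control file output quoted in this thread.
    for label, name in [("System B", "0000000C00000009000000A3"),
                        ("System A", "0000000B00000009000000A3")]:
        seg = parse_wal_filename(name)
        print(f"{label}: timeline {seg.timeline}, "
              f"log {seg.log_id:X}, segment {seg.segment:X}")
```

Both names point at the same WAL segment (log 9, segment A3); only the timeline field differs.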

> and then,  as I described,   no WAL is replicated from B to A.
> Did you try this scenario yourself?

Yes.

> I hope you agree it is a bug?

No. In this case pg_rewind is a no-op: system A was shut down *before*
B was promoted, so B knows about the shutdown checkpoint record of
system A, and no rewind is needed. One potential issue with
repeated rewinds in such configurations is that, right after the
promotion of system B, its control file has not yet been updated to
the new timeline, so pg_rewind fetches a control file from the source
node that still carries the outdated timeline information. You may
want to issue a checkpoint on the source node after its promotion to
ensure that its control file is in correct shape and points to the
latest timeline.
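
One way to confirm the effect of such a checkpoint is to look at the "Latest checkpoint's TimeLineID" line that pg_controldata prints for the source node. The following is a rough Python sketch, assuming that output has already been captured into a string. The function names are hypothetical, and the checkpoint itself would be issued separately, for example with the SQL command CHECKPOINT.

```python
import re
from typing import Optional


def checkpoint_timeline(controldata_output: str) -> Optional[int]:
    """Return the timeline ID of the latest checkpoint, or None if absent.

    Expects the text printed by pg_controldata, which includes a line of
    the form "Latest checkpoint's TimeLineID:   <number>".
    """
    match = re.search(
        r"^Latest checkpoint's TimeLineID:\s*(\d+)\s*$",
        controldata_output,
        re.MULTILINE,
    )
    return int(match.group(1)) if match else None


def control_file_is_current(controldata_output: str, expected_timeline: int) -> bool:
    """True if the control file already points at the expected timeline."""
    return checkpoint_timeline(controldata_output) == expected_timeline
```

In the scenario from this thread, the expected timeline on the promoted node would be 12; if the function reports 11, the control file still predates the promotion and a checkpoint has not yet updated it.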

Again there is no bug here.
--
Michael
