Re: Use pg_rewind when target timeline was switched - Mailing list pgsql-hackers

From Alexander Korotkov
Subject Re: Use pg_rewind when target timeline was switched
Date
Msg-id CAPpHfdtTrza_Nw_9WEzecdc4X+B3YxjN0wA-QPy-0nMBr=Gm4Q@mail.gmail.com
In response to Re: Use pg_rewind when target timeline was switched  (Alexander Korotkov <a.korotkov@postgrespro.ru>)
Responses Re: Use pg_rewind when target timeline was switched  (Michael Paquier <michael.paquier@gmail.com>)
List pgsql-hackers
On Wed, Sep 16, 2015 at 7:47 PM, Alexander Korotkov <a.korotkov@postgrespro.ru> wrote:
On Thu, Sep 10, 2015 at 8:33 AM, Michael Paquier <michael.paquier@gmail.com> wrote:
On Wed, Sep 9, 2015 at 7:13 PM, Alexander Korotkov wrote:
> I also added an additional check in findCommonAncestorTimeline(). Two standbys
> could be independently promoted and get the same new timeline ID. Now it is
> checked that timelines we assume to be the same also begin at the same point.
> The begin points could still match by coincidence, but the same risk exists
> in the original pg_rewind.

Really? pg_rewind blocks attempts to rewind two nodes that are already
on the same timeline before checking whether their timeline histories
match at some point:
        /*
         * If both clusters are already on the same timeline, there's nothing to
         * do.
         */
        if (ControlFile_target.checkPointCopy.ThisTimeLineID ==
            ControlFile_source.checkPointCopy.ThisTimeLineID)
                pg_fatal("source and target cluster are on the same timeline\n");
And this seems really justified to me. Imagine that you have one
master, with two standbys linked to it. If both standbys are promoted
to the same timeline, you could easily replug them to the master, but
I fail to see the point of being able to replug one promoted standby
into the other in this case: those nodes have segment and history files
that map to each other, and an operator could easily mess things up in
such a configuration.

Imagine the following configuration of servers.
  1
 / \
2   3
    |
    4

Initially, node 1 is the master, nodes 2 and 3 are standbys of node 1, and node 4 is a cascaded standby of node 3.
Then node 2 and node 3 are promoted independently, and both get timeline number 2. Then node 4 is promoted and gets timeline number 3.
Then we could try to rewind node 4 with node 2 as the source. How could pg_rewind know that timeline number 2 is not the same on those nodes?
We can only check that those timelines forked from timeline 1 at the same place. But a coincidence is still possible.
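
To make the check concrete, here is a minimal sketch of the comparison, not the actual patch. HistoryEntry and last_common_entry() are hypothetical names invented for this illustration, and the WAL positions in the usage note are made up. Two history entries are treated as the same timeline only when both the timeline ID and the begin position match.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for one timeline history entry. */
typedef struct
{
    uint32_t tli;    /* timeline ID */
    uint64_t begin;  /* WAL position where this timeline starts */
    uint64_t end;    /* WAL position where the next timeline forked off */
} HistoryEntry;

/*
 * Walk both histories from the oldest entry and return the index of the
 * last entry that looks shared: same timeline ID and same begin position.
 * Returns -1 if even the first entries differ.
 */
static int
last_common_entry(const HistoryEntry *a, size_t na,
                  const HistoryEntry *b, size_t nb)
{
    size_t n = (na < nb) ? na : nb;
    int    common = -1;

    assert(a != NULL && b != NULL);

    for (size_t i = 0; i < n; i++)
    {
        /* Equal IDs alone are not enough: the begin points must match too. */
        if (a[i].tli != b[i].tli || a[i].begin != b[i].begin)
            break;
        common = (int) i;
    }
    return common;
}
```

If node 4's history is {1, 0, 0x3000}, {2, 0x3000, 0x5000}, {3, 0x5000, ...} and node 2's history is {1, 0, 0x2000}, {2, 0x2000, ...}, the function stops at timeline 1, because the two timelines numbered 2 begin at different positions. If nodes 2 and 3 happened to be promoted at exactly the same position, the two timelines numbered 2 would be accepted as identical, which is the coincidence mentioned above.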

BTW, one option would be to generate a system_identifier for each new timeline, by analogy with what initdb does for the whole WAL.
With such identifiers we could distinguish different timelines that were assigned the same ID.
Any thoughts?
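
As a rough sketch of the per-timeline identifier idea (purely hypothetical, nothing like this exists in PostgreSQL as far as this message is concerned): each timeline would carry a 64-bit identifier generated at promotion, and two timelines would be considered the same only if both the ID and the identifier match. TimelineTag, make_timeline_identifier() and same_timeline() are invented names. The bit layout is loosely modelled, from memory, on how the cluster-wide system identifier is derived from the current time and process ID, so treat it as an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: a timeline ID paired with a per-timeline identifier. */
typedef struct
{
    uint32_t tli;                  /* ordinary timeline ID */
    uint64_t timeline_identifier;  /* generated once, at promotion */
} TimelineTag;

/*
 * Build an identifier from the promotion time and the promoting process ID.
 * Inputs are passed in explicitly to keep the sketch deterministic; a real
 * implementation would read the clock and the PID itself.
 */
static uint64_t
make_timeline_identifier(uint64_t sec, uint32_t usec, uint32_t pid)
{
    uint64_t id = sec << 32;

    id |= ((uint64_t) usec) << 12;
    id |= pid & 0xFFF;
    return id;
}

/* Timelines match only if both the ID and the identifier agree. */
static bool
same_timeline(const TimelineTag *a, const TimelineTag *b)
{
    assert(a != NULL && b != NULL);
    return a->tli == b->tli &&
           a->timeline_identifier == b->timeline_identifier;
}
```

With this, nodes 2 and 3 from the example would both have timeline ID 2 but different identifiers, so same_timeline() would report them as distinct even if they were promoted at the same WAL position.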

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
 
