Re: Use pg_rewind when target timeline was switched - Mailing list pgsql-hackers

From Alexander Korotkov
Subject Re: Use pg_rewind when target timeline was switched
Date
Msg-id CAPpHfdtLPdjWepJJ48p4oE8R90SMfYimYYOGkM9D0_870m_S3Q@mail.gmail.com
In response to Re: Use pg_rewind when target timeline was switched  (Michael Paquier <michael.paquier@gmail.com>)
Responses Re: Use pg_rewind when target timeline was switched  (Michael Paquier <michael.paquier@gmail.com>)
List pgsql-hackers
On Sat, Sep 19, 2015 at 2:25 AM, Michael Paquier <michael.paquier@gmail.com> wrote:
OK, I see your point and you are right. This additional check allows
pg_rewind to switch one timeline back and make the scan of blocks
begin at the real origin of both timelines. I had in mind the case
where you actually tried to rewind node 2 to node 3, which would not
be possible with your patch. Thinking more about it, I think it could
be possible to remove the check I am listing below and rely on the
checks in the history files, basically what is in
findCommonAncestorTimeline:
        if (ControlFile_target.checkPointCopy.ThisTimeLineID ==
            ControlFile_source.checkPointCopy.ThisTimeLineID)
                pg_fatal("source and target cluster are on the same timeline\n");
Alexander, what do you think about that? I think that we should be
able to rewind with for example node 2 as target and node 3 as source,
and vice-versa as per the example you gave even if both nodes are on
the same timeline, just that they forked at a different point. Could
you test this particular case? Using my script, simply be sure to
switch archive_mode to on/off depending on the node, aka only 3 and 4
do archiving.

I think relying on a different fork point is not safe enough. Imagine a more complex case.

  1
 / \
2   3
|   |
4   5

At first, nodes 2 and 3 are promoted at the same point, and they both get timeline 2.
Then nodes 4 and 5 are promoted at different points, and they both get timeline 3.
Then we can try to rewind node 4 with node 5 as the source, or vice versa. In this case we can't detect the collision on timeline 2: both histories show timeline 2 forking from timeline 1 at the same point, although nodes 2 and 3 diverged there.

The same collision could happen even when source and target are on different timeline numbers. However, having them on the same timeline number is suspicious enough to disallow it until we have additional checks.
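
To make the collision concrete, here is a toy model of the scenario in Python. It is only a sketch, not pg_rewind's code: the LSN values (100, 200, 300) are invented, and the `origin` field is a hypothetical marker for which node created a timeline. Real timeline history files carry no such marker, which is exactly why the comparison on timeline number and start point alone is fooled.

```python
from dataclasses import dataclass
from math import inf


@dataclass(frozen=True)
class Entry:
    tli: int      # timeline number, as a history file would record it
    begin: float  # LSN where this timeline starts (invented values)
    end: float    # LSN where it was left; inf for the current timeline
    origin: str   # HYPOTHETICAL: node that created the timeline, not in real history files


def common_ancestor(a, b, use_origin=False):
    """Return (tli, divergence LSN) for the last entry shared by histories a and b.

    Simplified model: entries match when timeline number and start LSN agree.
    With use_origin=True the hypothetical origin marker must agree as well.
    Returns None when not even the first entry matches.
    """
    last = None
    for x, y in zip(a, b):
        same = x.tli == y.tli and x.begin == y.begin
        if use_origin:
            same = same and x.origin == y.origin
        if not same:
            break
        last = (x.tli, min(x.end, y.end))
    return last


# Nodes 2 and 3 are both promoted at LSN 100 and both get timeline 2.
# Node 4 is promoted from node 2 at LSN 200, node 5 from node 3 at LSN 300;
# both get timeline 3.
node3 = [Entry(1, 0, 100, "node1"), Entry(2, 100, inf, "node3")]
node4 = [Entry(1, 0, 100, "node1"), Entry(2, 100, 200, "node2"),
         Entry(3, 200, inf, "node4")]
node5 = [Entry(1, 0, 100, "node1"), Entry(2, 100, 300, "node3"),
         Entry(3, 300, inf, "node5")]

# Same timeline number on both sides: timeline 2 looks shared up to LSN 200.
print(common_ancestor(node4, node5))                   # (2, 200)
# With the origin marker, the real divergence is on timeline 1 at LSN 100.
print(common_ancestor(node4, node5, use_origin=True))  # (1, 100)
# Different timeline numbers (node 4 on 3, node 3 on 2): same wrong answer.
print(common_ancestor(node4, node3))                   # (2, 200)
```

In this model the fork points of timeline 3 do differ, so a check based only on fork points would let the rewind proceed, yet the blocks changed between LSN 100 and 200 would not be scanned.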

I would propose the following plan:
  1. Commit this patch without allowing rewind when target and source are on the same timeline.
  2. Add checks to distinguish different timelines that have the same number.
  3. Allow rewind when target and source are on the same timeline.
------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 
