Re: Use pg_rewind when target timeline was switched - Mailing list pgsql-hackers

From Michael Paquier
Subject Re: Use pg_rewind when target timeline was switched
Date
Msg-id CAB7nPqQBbukbrWOUA8e_ZWE9FnjAH9u6LZeUYKT0+YaXUWb=Rw@mail.gmail.com
Whole thread Raw
In response to Re: Use pg_rewind when target timeline was switched  (Alexander Korotkov <a.korotkov@postgrespro.ru>)
Responses Re: Use pg_rewind when target timeline was switched  (Michael Paquier <michael.paquier@gmail.com>)
Re: Use pg_rewind when target timeline was switched  (Alexander Korotkov <a.korotkov@postgrespro.ru>)
List pgsql-hackers
On Wed, Sep 16, 2015 at 9:47 AM, Alexander Korotkov wrote:
> On Thu, Sep 10, 2015 at 8:33 AM, Michael Paquier wrote:
> OK, I get it. I tried to get rid of code duplication in the attached patch.
> There is getTimelineHistory() function which gets control file as argument
> and fetches history from appropriate path.

Cool, thanks!

> Imagine following configuration of server.
>   1
>  / \
> 2   3
>     |
>     4
>
> Initially, node 1 is master, nodes 2 and 3 are standbys for node 1. node 4
> is cascaded standby for node 3.
> Then node 2 and node 3 are promoted. They both got timeline number 2. Then
> node 3 is promoted and gets timeline number 3.

Typo. You mean node 4 getting timeline 3 here.

> Then we could try to rewind node 4 with node 2 as source. How pg_rewind
> could know that timeline number 2 for those nodes is not the same?
> We can only check that those timelines are forked from timeline 1 at the
> same place. But coincidence is still possible.

OK, I see your point and you are right. This additional check allows
pg_rewind to switch one timeline back and make the scan of blocks
begin at the real origin of both timelines. I had in mind the case
where you tried to actually rewind node 2 to node 3 actually which
would not be possible with your patch, and by thinking more about that
I think that it could be possible to remove the check I am listing
below and rely on the checks in the history files, basically what is
in findCommonAncestorTimeline:       if (ControlFile_target.checkPointCopy.ThisTimeLineID ==
ControlFile_source.checkPointCopy.ThisTimeLineID)              pg_fatal("source and target cluster are on the same
 
timeline\n");
Alexander, what do you think about that? I think that we should be
able to rewind with for example node 2 as target and node 3 as source,
and vice-versa as per the example you gave even if both nodes are on
the same timeline, just that they forked at a different point. Could
you test this particular case? Using my script, simply be sure to
switch archive_mode to on/off depending on the node, aka only 3 and 4
do archiving.

+               /*
+                * Since incomplete segments are copied into next
timelines, find the
+                * lastest timeline holding required segment. Assuming
we can move
+                * in xlog both forward and backward, consider also
switching timeline
+                * back.
+                */
s/lastest/correct? "lastest" does sound very English to me even if
that's grammatically correct.

+ * Find minimum from two xlog pointers assuming invalid pointer (0) means
+ * infinity as timeline.h states.
This needs an update, now timeline.h points out correctly that this is
not 0, but InvalidXLogRecPtr.

+       /* Retreive timelines for both source and target */
s/Retreive/Retrieve.

The refactoring of getTimelineHistory as you propose looks like a good
idea to me, I tried to remove by myself the difference between source
and target in copy_fetch.c and friends but this gets uglier,
particularly because of datadir_source in copy_file_range. Not worth
it.
-- 
Michael



pgsql-hackers by date:

Previous
From: Haribabu Kommi
Date:
Subject: Re: Parallel Seq Scan
Next
From: Michael Paquier
Date:
Subject: Re: Use pg_rewind when target timeline was switched