Re: Use pg_rewind when target timeline was switched - Mailing list pgsql-hackers

From Alexander Korotkov
Subject Re: Use pg_rewind when target timeline was switched
Date
Msg-id CAPpHfdtLQXKVudoe3EkyL5E0adb1TUVqayz5ba+J8JNNBU7EiA@mail.gmail.com
Whole thread Raw
In response to Re: Use pg_rewind when target timeline was switched  (Michael Paquier <michael.paquier@gmail.com>)
Responses Re: Use pg_rewind when target timeline was switched  (Michael Paquier <michael.paquier@gmail.com>)
List pgsql-hackers
On Wed, Sep 9, 2015 at 9:01 AM, Michael Paquier <michael.paquier@gmail.com> wrote:


On Wed, Sep 9, 2015 at 3:27 AM, Alexander Korotkov wrote:
On Tue, Sep 8, 2015 at 10:28 AM, Michael Paquier wrote:
I am planning to do as well a detailed code review rather soon.

Good, thanks.

When testing a bit more complex structures, it occurred to me that we may want as well to use as a source node a standby. For example based on the case of my cluster above:
 master (5432)
  /              \
 1 (5433)   2 (5434)
 |
 3 (5435)
If I try for example to rebuild the cluster as follows there will be failures:
1) Rewind with source = 3, target = 2
2) Start 3 and 2
3) Shutdown 2
3) Rewind source = 2, target = 1, failure with:
source data directory must be shut down cleanly

It seems to me that we should allow a node that has been shutdowned in recovery to be used as a source for rewind as well, as follows:
-       if (datadir_source && ControlFile_source.state != DB_SHUTDOWNED)
+       if (datadir_source &&
+               ControlFile_source.state != DB_SHUTDOWNED &&
+               ControlFile_source.state != DB_SHUTDOWNED_IN_RECOVERY)
                pg_fatal("source data directory must be shut down cleanly\n");
At least your patch justifies in my eyes such a change.
 
It's source is not required to be √ in recovery if we specify it by connection string. This is why I think DB_SHUTDOWNED_IN_RECOVERY is OK when we specify source by data directory. Included in attached patch.

 /*
+ * Find minimum from two xlog pointers assuming invalid pointer is greatest
+ * possible pointer.
+ */
+static XLogRecPtr
+xlPtrMin(XLogRecPtr a, XLogRecPtr b)
+{
+       if (XLogRecPtrIsInvalid(a))
+               return b;
+       else if (XLogRecPtrIsInvalid(b))
+               return a;
+       else
+               return Min(a, b);
+}
This is true as timeline.h tells so, still I think that it would be better to mention that this is this assumption is held in this header file, or simply that timeline history entries at the top have their end position set as InvalidXLogRecPtr which is a synonym of infinity.

Agree. Comment is adjusted.
 
The code building the target history file is a duplicate of what is done with the source. Perhaps we could refactor that as a single routine in pg_rewind.c?
 
Do you mean duplication between rewind_parseTimeLineHistory() and readTimeLineHistory(). We could put readTimeLineHistory() into common with some refactoring. This is the subject of separate patch, though. 

Except that, the patch looks pretty neat to me. I was wondering as well: what tests did you run up to now with this patch? I am attaching an updated version of my test script I used for some more complex scenarios. Feel free to play with it.

I was running manual tests like a noob :) When you attached you bash script, I've switched to it.

A also added additional check in findCommonAncestorTimeline(). Two standbys could be independently promoted and get the same new timeline ID. Now, it's checked that timelines, that we assume to be the same, have equal begins. Begins could still match by coincidence. But the same risk exists in original pg_rewind, though.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Attachment

pgsql-hackers by date:

Previous
From: Etsuro Fujita
Date:
Subject: Re: Minor code improvements to create_foreignscan_plan/ExecInitForeignScan
Next
From: Alexander Korotkov
Date:
Subject: Re: Waits monitoring