Re: Use pg_rewind when target timeline was switched - Mailing list pgsql-hackers

From Michael Paquier
Subject Re: Use pg_rewind when target timeline was switched
Msg-id CAB7nPqTRJ1Gt=x_=L8LXYXaokuDYpukqBqxSn8eXNwJ+ezfD8g@mail.gmail.com
In response to Re: Use pg_rewind when target timeline was switched  (Alexander Korotkov <a.korotkov@postgrespro.ru>)
Responses Re: Use pg_rewind when target timeline was switched  (Alexander Korotkov <a.korotkov@postgrespro.ru>)
List pgsql-hackers


On Wed, Sep 9, 2015 at 3:27 AM, Alexander Korotkov wrote:
> On Tue, Sep 8, 2015 at 10:28 AM, Michael Paquier wrote:
>> I am planning to do as well a detailed code review rather soon.
>
> Good, thanks.

While testing somewhat more complex structures, it occurred to me that we may also want to use a standby as the source node. For example, based on the case of my cluster above:
     master (5432)
      /        \
 1 (5433)   2 (5434)
     |
 3 (5435)
If I try, for example, to rebuild the cluster as follows, it fails:
1) Rewind with source = 3, target = 2
2) Start 3 and 2
3) Shut down 2
4) Rewind with source = 2, target = 1, which fails with:
source data directory must be shut down cleanly

It seems to me that we should also allow a node that has been shut down in recovery to be used as a source for rewind, as follows:
-       if (datadir_source && ControlFile_source.state != DB_SHUTDOWNED)
+       if (datadir_source &&
+               ControlFile_source.state != DB_SHUTDOWNED &&
+               ControlFile_source.state != DB_SHUTDOWNED_IN_RECOVERY)
                pg_fatal("source data directory must be shut down cleanly\n");
At least in my eyes, your patch justifies such a change.
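
To spell out the intent of that condition, here is a minimal standalone sketch. The DBState enum is a local stand-in that only mirrors the state names used in the diff, and source_state_is_acceptable() is a hypothetical helper written for illustration, not a function from pg_rewind:

```c
#include <stdbool.h>

/*
 * Local stand-in for the control file states. Only the names mirror
 * those used in the diff; this is not the real header.
 */
typedef enum DBState
{
	DB_STARTUP = 0,
	DB_SHUTDOWNED,
	DB_SHUTDOWNED_IN_RECOVERY,
	DB_SHUTDOWNING,
	DB_IN_CRASH_RECOVERY,
	DB_IN_ARCHIVE_RECOVERY,
	DB_IN_PRODUCTION
} DBState;

/*
 * Hypothetical helper: a source data directory is acceptable if it was
 * shut down cleanly, either as a master or while in recovery (standby).
 */
static bool
source_state_is_acceptable(DBState state)
{
	return state == DB_SHUTDOWNED ||
		state == DB_SHUTDOWNED_IN_RECOVERY;
}
```

With this, a standby stopped cleanly (DB_SHUTDOWNED_IN_RECOVERY) passes the check, while a node still in production or in archive recovery is rejected as before.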

 /*
+ * Find minimum from two xlog pointers assuming invalid pointer is greatest
+ * possible pointer.
+ */
+static XLogRecPtr
+xlPtrMin(XLogRecPtr a, XLogRecPtr b)
+{
+       if (XLogRecPtrIsInvalid(a))
+               return b;
+       else if (XLogRecPtrIsInvalid(b))
+               return a;
+       else
+               return Min(a, b);
+}
This is true, as timeline.h says so. Still, I think it would be better to mention that this assumption comes from that header file, or simply that the timeline history entries at the top have their end position set to InvalidXLogRecPtr, which is a synonym for infinity.
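
For illustration, the "invalid means infinity" semantics can be tried outside the tree with a small standalone sketch. The typedef and macros below are local stand-ins (assuming a 64-bit pointer with 0 as the invalid value), not the real headers:

```c
#include <stdint.h>

/* Local stand-ins, assumed for this sketch only. */
typedef uint64_t XLogRecPtr;
#define InvalidXLogRecPtr		((XLogRecPtr) 0)
#define XLogRecPtrIsInvalid(r)	((r) == InvalidXLogRecPtr)
#define Min(x, y)				((x) < (y) ? (x) : (y))

/*
 * Minimum of two xlog pointers, treating an invalid pointer as the
 * greatest possible one (the open end of the newest history entry).
 */
static XLogRecPtr
xlPtrMin(XLogRecPtr a, XLogRecPtr b)
{
	if (XLogRecPtrIsInvalid(a))
		return b;
	else if (XLogRecPtrIsInvalid(b))
		return a;
	else
		return Min(a, b);
}
```

Note that a plain Min() would get this wrong under these stand-ins: with the invalid value being 0, it would always pick the invalid pointer instead of treating it as infinity.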

The code building the target history file is a duplicate of what is done with the source. Perhaps we could refactor that as a single routine in pg_rewind.c?

Apart from that, the patch looks pretty neat to me. I was wondering as well: what tests have you run so far with this patch? I am attaching an updated version of the test script I used for some more complex scenarios. Feel free to play with it.
--
Michael
Attachment
