Re: Use pg_rewind when target timeline was switched - Mailing list pgsql-hackers

From Michael Paquier
Subject Re: Use pg_rewind when target timeline was switched
Date
Msg-id CAB7nPqStazE2SPt0jkXcByP_FVdykMNVf7A7B_uFtaXOM4SfqQ@mail.gmail.com
Whole thread Raw
In response to Re: Use pg_rewind when target timeline was switched  (Alexander Korotkov <a.korotkov@postgrespro.ru>)
Responses Re: Use pg_rewind when target timeline was switched  (Alexander Korotkov <a.korotkov@postgrespro.ru>)
List pgsql-hackers
On Tue, Sep 8, 2015 at 1:14 AM, Alexander Korotkov wrote:
> On Thu, Aug 20, 2015 at 9:57 AM, Michael Paquier wrote:
>> The code above looks correct to me when scanning the WAL history onwards
>> though, which is what is done when extracting the page map, but not
>> backwards when we try to find the last common checkpoint record. This code
>> actually fails trying to open 2/0/2 that does not exist in the promoted
>> standby's pg_xlog in my test case.
>>
>> Attached is a small script I have used to reproduce the failure.
>
>
> Right, thanks! It should be fixed in the attached version of patch.

So, instead of a code review, I have been torturing your patch and did
advanced tests on it with the attached script, that creates a cluster
as follows:
 master (5432)
  /        \
 1 (5433)   2 (5434)
 |
 3 (5435)
Once cluster is created, nodes are promoted in a certain order giving
them different timeline jump properties:
- master, stays on tli 1
- standby 1, tli 1->2
- standby 2, tli 1->3
- standby 3, tli 1->2->4
And data is inserted on each of them to make WAL fork properly.
Finally the script tries to rewind one node using another node as
source, and then tries to link this target node back to the source
node via streaming replication.

I have tested all the one-one permutations possible in the structure
above (see commented portions at the bottom of my script), and all of
them worked. I have to say that from the testing prospective this
patch looks in great shape, and will greatly improve the use cases of
pg_rewind!

I am planning to do as well a detailed code review rather soon.
Regards,
--
Michael

Attachment

pgsql-hackers by date:

Previous
From: Fabien COELHO
Date:
Subject: Re: pgbench progress with timestamp
Next
From: Amit Kapila
Date:
Subject: Re: [HACKERS] 答复:[HACKERS] about fsync in CLOG buffer write