Re: Why does replication need the old history file? - Mailing list pgsql-hackers

From Fujii Masao
Subject Re: Why does replication need the old history file?
Date
Msg-id CAHGQGwHLqALiiaVM1_oxt_c4yL7vHXe72FzpfC+_1Mr5rR3QFw@mail.gmail.com
In response to Re: Why does replication need the old history file?  (Michael Paquier <michael.paquier@gmail.com>)
List pgsql-hackers
On Fri, Jun 12, 2015 at 5:18 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Fri, Jun 12, 2015 at 4:56 AM, Josh Berkus <josh@agliodbs.com> wrote:
>> Hackers,
>>
>> Sequence of events:
>>
>> 1. PITR backup of server on timeline 2.
>>
>> 2. Restored the backup to a new server, new-master.
>>
>> 3. Restored the backup to another new server, new-replica.
>>
>> 4. Started and promoted new-master (now on Timeline 3).
>>
>> 5. Started new-replica, connecting over streaming to new-master.
>>
>> 6. Get error message:
>>
>> 2015-06-11 12:24:14.503 PDT,,,7465,,5579e05e.1d29,1,,2015-06-11 12:24:14
>> PDT,,0,LOG,00000,"fetching timeline history file for timeline 2 from
>> primary server",,,,,,,,,""
>> 2015-06-11 12:24:14.503 PDT,,,7465,,5579e05e.1d29,2,,2015-06-11 12:24:14
>> PDT,,0,FATAL,XX000,"could not receive timeline history file from the
>> primary server: ERROR:  could not open file
>> ""pg_xlog/00000002.history"": No such file or directory
>>
>> Questions:
>>
>> A. Why does the replica need 00000002.history?  Shouldn't it only need
>> 00000003.history?
>
> From where is the base backup taken in case of the node started at 5?

The related source code comment says:

    /*
     * Get any missing history files. We do this always, even when we're
     * not interested in that timeline, so that if we're promoted to
     * become the master later on, we don't select the same timeline that
     * was already used in the current master. This isn't bullet-proof -
     * you'll need some external software to manage your cluster if you
     * need to ensure that a unique timeline id is chosen in every case,
     * but let's avoid the confusion of timeline id collisions where we
     * can.
     */
    WalRcvFetchTimeLineHistoryFiles(startpointTLI, primaryTLI);
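
To make the reasoning in that comment concrete, here is a toy C model of it. This is a sketch, not the PostgreSQL code: the Node struct, MAX_TLI, fetch_missing_history() and choose_new_timeline() are all hypothetical names, and history files are reduced to a boolean per timeline. It assumes the standby asks for every history file from its starting timeline up to the primary's timeline, and that a promoted node picks the first timeline after the newest history file it can see.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_TLI 16              /* arbitrary bound for this toy model */

/* has_history[t] is true if the node holds a history file for timeline t. */
typedef struct {
    bool has_history[MAX_TLI];
} Node;

/*
 * Copy every history file in [first, last] that the standby lacks.
 * Timeline 1 has no history file, so it is skipped.
 * Returns 0 on success, or the first timeline the primary could not
 * serve (the situation reported as FATAL in the log above).
 */
static int
fetch_missing_history(Node *standby, const Node *primary,
                      unsigned first, unsigned last)
{
    assert(last < MAX_TLI);

    for (unsigned tli = first; tli <= last; tli++) {
        if (tli == 1 || standby->has_history[tli])
            continue;
        if (!primary->has_history[tli])
            return (int) tli;
        standby->has_history[tli] = true;
    }
    return 0;
}

/*
 * Choose a timeline for promotion: probe upward from the current
 * timeline while history files exist, then take the next free id.
 */
static unsigned
choose_new_timeline(const Node *node, unsigned current)
{
    unsigned newest = current;

    for (unsigned probe = current + 1; probe < MAX_TLI; probe++) {
        if (!node->has_history[probe])
            break;
        newest = probe;
    }
    return newest + 1;
}
```

In this model, a standby on timeline 2 that connects to a primary holding only the history file for timeline 3 fails on timeline 2, as in the reported log. If it were promoted without any history files it would choose timeline 3, the id the primary already uses. Once both files have been fetched it would choose timeline 4 instead.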
 

>
>> B. Did something change in this regard in 9.3.6, 9.3.7 or 9.3.8?  It was
>> working in our previous setup, on 9.3.5, although that could have just
>> been that the history file hadn't been removed from the backups yet.
>
> At quick glance, I can see nothing in xlog.c between those releases.

Yep, I could reproduce the "trouble" even in 9.3.5 on my laptop.

Regards,

-- 
Fujii Masao


