>>> Questions:
>>>
>>> A. Why does the replica need 00000002.history? Shouldn't it only need
>>> 00000003.history?
>>
>> From where is the base backup taken in case of the node started at 5?
It is the same backup used to restore the master, restored to a point in
time 5 minutes earlier just to make sure the replica isn't ahead of the
master.
>
> The related source code comment says
>
> /*
> * Get any missing history files. We do this always, even when we're
> * not interested in that timeline, so that if we're promoted to
> * become the master later on, we don't select the same timeline that
> * was already used in the current master. This isn't bullet-proof -
> * you'll need some external software to manage your cluster if you
> * need to ensure that a unique timeline id is chosen in every case,
> * but let's avoid the confusion of timeline id collisions where we
> * can.
> */
> WalRcvFetchTimeLineHistoryFiles(startpointTLI, primaryTLI);

So this seems to be something we're doing "just in case", and it is
preventing a useful way to spin up large master/replica clusters from
PITR backup. Might this be something we want to change, so that we simply
report an error that we can't find the history file, rather than failing
with FATAL?
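
To make the question concrete, here is a rough standalone sketch of the
two behaviours being compared. It is not the walreceiver code: the names
fetch_history_files, MissingPolicy, HistoryFetcher and only_timeline_3 are
hypothetical, and the loop is only assumed to resemble what
WalRcvFetchTimeLineHistoryFiles does with its two timeline arguments.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

typedef unsigned int TimeLineID;

/* Hypothetical policy switch: what to do when a history file is missing. */
typedef enum { MISSING_IS_FATAL, MISSING_IS_WARNING } MissingPolicy;

/* Hypothetical stand-in for asking the primary for NNNNNNNN.history. */
typedef bool (*HistoryFetcher)(TimeLineID tli);

/*
 * Try to fetch the history file of every timeline in [first, last].
 * Returns the number of files fetched, or -1 if a file was missing and
 * the policy treats that as fatal.
 */
int fetch_history_files(TimeLineID first, TimeLineID last,
                        HistoryFetcher fetch, MissingPolicy policy)
{
    int fetched = 0;

    assert(first <= last);

    for (TimeLineID tli = first; tli <= last; tli++)
    {
        if (tli == 1)
            continue;           /* timeline 1 has no history file */

        if (fetch(tli))
        {
            fetched++;
            continue;
        }

        if (policy == MISSING_IS_FATAL)
        {
            fprintf(stderr, "FATAL: could not fetch %08X.history\n", tli);
            return -1;          /* give up, as the replica does today */
        }

        /* Proposed alternative: complain, but keep going. */
        fprintf(stderr, "WARNING: could not fetch %08X.history\n", tli);
    }

    return fetched;
}

/* Example fetcher: only 00000003.history is available. */
bool only_timeline_3(TimeLineID tli)
{
    return tli == 3;
}
```

Called as fetch_history_files(2, 3, only_timeline_3, ...), the fatal
policy gives up on the missing 00000002.history, while the warning policy
still picks up 00000003.history. The cost of the lenient policy is the one
the source comment describes: a later promotion has less information for
avoiding a timeline ID that is already in use.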
--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com