Re: pg_rewind failure by file deletion in source server - Mailing list pgsql-hackers

From Fujii Masao
Subject Re: pg_rewind failure by file deletion in source server
Msg-id CAHGQGwHyW3kiGqu1iWtEGgEXH1MuiWNJxyE3enZvMFnyROdiLA@mail.gmail.com
In response to Re: pg_rewind failure by file deletion in source server  (Michael Paquier <michael.paquier@gmail.com>)
Responses Re: pg_rewind failure by file deletion in source server  (Michael Paquier <michael.paquier@gmail.com>)
List pgsql-hackers
On Fri, Jun 12, 2015 at 4:29 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Fri, Jun 12, 2015 at 3:50 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
>> On Fri, Jun 12, 2015 at 3:17 PM, Michael Paquier
>> <michael.paquier@gmail.com> wrote:
>>> On Thu, Jun 11, 2015 at 5:48 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
>>>> On Thu, Jun 11, 2015 at 2:14 PM, Michael Paquier
>>>> <michael.paquier@gmail.com> wrote:
>>>>> On Thu, Jun 11, 2015 at 1:51 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
>>>>>> Shouldn't pg_rewind ignore that failure? If the file is not found in
>>>>>> the source server, it obviously doesn't need to be copied to the
>>>>>> destination server. So ISTM that pg_rewind can safely skip copying that file.
>>>>>> Thoughts?
>>>>>
>>>>> I think that you should fail. Let's imagine that the master to be
>>>>> rewound has removed a relation file that was there at the last
>>>>> checkpoint before forking, and has then been stopped cleanly after its
>>>>> standby was promoted, while the standby still has the relation file
>>>>> after promotion. You should be able to copy it to be able to replay WAL
>>>>> on it. If the standby has removed a file listed in the file map after
>>>>> the file map was taken, I guess that the best thing to do is fail because
>>>>> the file that should be there for the rewound node cannot be fetched.
>>>>
>>>> In this case, why do you think that the file should exist in the old master?
>>>> Even if it doesn't exist, ISTM that the old master can safely replay the WAL
>>>> records related to the file when it restarts. So what's the problem
>>>> if the file doesn't exist in the old master?
>>>
>>> Well, some user may want to rewind the master down to the point where
>>> WAL forked, and then recover it immediately when a consistent point is
>>> reached just at restart instead of replugging it into the cluster. In
>>> this case I think that you need the relation file of the dropped
>>> relation to get a consistent state. That's still cheaper than
>>> recreating a node from a fresh base backup in some cases, particularly
>>> if the last base backup taken is far in the past for this cluster.
>>
>> So it's the case where a user wants to recover the old master up to a point
>> BEFORE the file in question is deleted in the new master. At that point,
>> since the file must exist, pg_rewind should fail if the file cannot be copied
>> from the new master. Is my understanding right?
>
> Yep. We are on the same page.
>
>> As far as I read the code of pg_rewind, ISTM that your scenario never happens,
>> because pg_rewind sets the minimum recovery point to the latest WAL location
>> in the new master, i.e., AFTER the file is deleted. So the old master cannot
>> stop recovering before the file is deleted in the new master. If recovery stops
>> before that point, it fails because the minimum recovery point is not reached yet.
>>
>> IOW, after pg_rewind runs, the old master has to replay the WAL records
>> which were generated by the deletion of the file in the new master.
>> So it's okay if the old master doesn't have the file after pg_rewind runs,
>> I think.
>
> Ah, right. I withdraw that; indeed, what I described cannot happen:
>         /*
>          * Update control file of target. Make it ready to perform archive
>          * recovery when restarting.
>          *
>          * minRecoveryPoint is set to the current WAL insert location in the
>          * source server. Like in an online backup, it's important that we recover
>          * all the WAL that was generated while we copied the files over.
>          */
> So a rewound node will replay WAL up to the current insert location of
> the source, and recovery will fail if the recovery target is older than
> this insert location.
>
> You want to draft a patch? Should I?

Please feel free to try that! :)
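
To make the idea concrete, here is a minimal sketch in plain C of the
skip-on-missing behavior discussed above. It is not pg_rewind's actual code:
`copy_file_if_present` and the `CopyResult` values are hypothetical names, it
only copies between local paths, and it assumes POSIX `fopen` semantics, where
a missing file sets `errno` to `ENOENT`.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical result codes; these names are not taken from pg_rewind. */
typedef enum
{
	COPY_OK,					/* file was copied */
	COPY_SKIPPED_MISSING,		/* source file is gone; nothing to copy */
	COPY_FAILED					/* any other error; caller should abort */
} CopyResult;

/*
 * Copy src_path to dst_path.
 *
 * A source file that no longer exists is skipped rather than treated as an
 * error, on the assumption that WAL replay up to minRecoveryPoint will
 * replay the deletion on the target anyway.
 */
static CopyResult
copy_file_if_present(const char *src_path, const char *dst_path)
{
	FILE	   *src;
	FILE	   *dst;
	char		buf[8192];
	size_t		nread;
	CopyResult	result = COPY_OK;

	assert(src_path != NULL && dst_path != NULL);

	src = fopen(src_path, "rb");
	if (src == NULL)
		return (errno == ENOENT) ? COPY_SKIPPED_MISSING : COPY_FAILED;

	dst = fopen(dst_path, "wb");
	if (dst == NULL)
	{
		fclose(src);
		return COPY_FAILED;
	}

	/* Copy in chunks; stop at the first short write. */
	while ((nread = fread(buf, 1, sizeof(buf), src)) > 0)
	{
		if (fwrite(buf, 1, nread, dst) != nread)
		{
			result = COPY_FAILED;
			break;
		}
	}
	if (ferror(src))
		result = COPY_FAILED;

	fclose(src);
	if (fclose(dst) != 0)
		result = COPY_FAILED;

	return result;
}
```

A caller would treat `COPY_SKIPPED_MISSING` as success and abort only on
`COPY_FAILED`. Skipping is safe only because the minimum recovery point forces
the old master to replay the WAL that deleted the file.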

> I think that we should have a
> test case as well in pg_rewind/t/.

Maybe.

Regards,

-- 
Fujii Masao


