On Thu, Jun 11, 2015 at 2:14 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Thu, Jun 11, 2015 at 1:51 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
>> Shouldn't pg_rewind ignore that failure? If the file is not found on
>> the source server, it obviously doesn't need to be copied to the
>> destination server. So ISTM that pg_rewind can safely skip copying it.
>> Thoughts?
>
> I think that pg_rewind should fail. Imagine that the master to be
> rewound removed a relation file that existed at the last checkpoint
> before the fork, and was then stopped cleanly after its standby had
> been promoted, while the standby still has the relation file after
> promotion. pg_rewind should be able to copy that file so that WAL can
> be replayed on it. If the standby removes a file after the file map
> has been built, I guess that the best thing to do is to fail, because
> a file that should be present on the rewound node cannot be fetched.

In this case, why do you think that the file should exist on the old master?
Even if it doesn't exist, ISTM that the old master can safely replay the WAL
records related to the file when it restarts. So what's the problem
if the file doesn't exist on the old master?

> Documentation should be made clearer about that with a better error
> message...

I'm wondering how we can recover (or rewind again) the old master from
that error. This would also need to be documented if we decide not to
fix any code regarding this problem...
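
To make the skip-on-missing idea concrete, here is a rough Python sketch.
It is not pg_rewind's actual code (which is C): copy_file_map, its
arguments, and the plain directory-to-directory copy are hypothetical and
only illustrate treating a file that has vanished from the source as
non-fatal.

```python
import os
import shutil


def copy_file_map(source_dir, target_dir, file_map):
    """Copy every relative path in file_map from source_dir to target_dir.

    Hypothetical helper: a file that disappeared from the source after
    the map was built is skipped instead of aborting the whole run.
    Returns a (copied, skipped) pair of lists of relative paths.
    """
    copied, skipped = [], []
    for rel_path in file_map:
        src = os.path.join(source_dir, rel_path)
        dst = os.path.join(target_dir, rel_path)
        # Make sure the destination directory exists before copying.
        os.makedirs(os.path.dirname(dst), exist_ok=True)
        try:
            shutil.copyfile(src, dst)
        except FileNotFoundError:
            # Source file was removed after the file map was taken:
            # record it and carry on rather than failing.
            skipped.append(rel_path)
            continue
        copied.append(rel_path)
    return copied, skipped
```

The skipped list is returned so that the caller could at least report
which files were missing on the source instead of dropping them silently.
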
Regards,
--
Fujii Masao