Re: pg_rewind failure by file deletion in source server - Mailing list pgsql-hackers

From Fujii Masao
Subject Re: pg_rewind failure by file deletion in source server
Date
Msg-id CAHGQGwFFWLx-ZyNy+JE+R2oVKQe9FKkQC7kzapx47VMX4L6NWA@mail.gmail.com
In response to Re: pg_rewind failure by file deletion in source server  (Heikki Linnakangas <hlinnaka@iki.fi>)
Responses Re: pg_rewind failure by file deletion in source server
List pgsql-hackers
On Wed, Jul 1, 2015 at 2:21 AM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> On 06/29/2015 09:44 AM, Michael Paquier wrote:
>>
>> On Mon, Jun 29, 2015 at 4:55 AM, Heikki Linnakangas wrote:
>>>
>>> But we'll still need to handle the pg_xlog symlink case somehow. Perhaps
>>> it would be enough to special-case pg_xlog for now.
>>
>>
>> Well, sure, pg_rewind does not copy the soft links either way. Now it
>> would be nice to have an option to be able to recreate the soft link
>> of at least pg_xlog even if it can be scripted as well after a run.
>
>
> Hmm. I'm starting to think that pg_rewind should ignore pg_xlog entirely. In
> any non-trivial scenarios, just copying all the files from pg_xlog isn't
> enough anyway, and you need to set up a recovery.conf after running
> pg_rewind that contains a restore_command or primary_conninfo, to fetch the
> WAL. So you can argue that by not copying pg_xlog automatically, we're
> actually doing a favour to the DBA, by forcing him to set up the
> recovery.conf file correctly. Because if you just test simple scenarios
> where not much time has passed between the failover and running pg_rewind,
> it might be enough to just copy all the WAL currently in pg_xlog, but it
> would not be enough if more time had passed and not all the required WAL is
> present in pg_xlog anymore.  And by not copying the WAL, we can avoid some
> copying, as restore_command or streaming replication will only copy what's
> needed, while pg_rewind would copy all WAL it can find in the target's data
> directory.
>
> pg_basebackup also doesn't include any WAL, unless you pass the --xlog
> option. It would be nice to also add an optional --xlog option to pg_rewind,
> but with pg_rewind it's possible that all the required WAL isn't present in
> the pg_xlog directory anymore, so you wouldn't always achieve the same
> effect of making the backup self-contained.
>
> So, I propose the attached. It makes pg_rewind ignore the pg_xlog directory
> in both the source and the target.

If pg_xlog is simply ignored, some old WAL files may remain in the target server.
Won't these old files cause the subsequent startup of the target server as a new
standby to fail? That is, the case where a WAL file with the same name but
different content exists in both the target and the source. If that's harmful,
pg_rewind should also remove the files in the target server's pg_xlog.
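
To make the concern concrete, here is a minimal sketch of how one might list
such conflicting segments. It is not part of pg_rewind. The function name
conflicting_wal_segments is hypothetical, and the sketch assumes both pg_xlog
directories are reachable as local paths, which will often not be true for a
remote source server. It relies only on the fact that WAL segment file names
are 24 uppercase hexadecimal characters.

```python
import filecmp
import os
import re

# WAL segment file names consist of 24 uppercase hexadecimal characters.
WAL_SEGMENT_RE = re.compile(r"^[0-9A-F]{24}$")


def conflicting_wal_segments(source_xlog, target_xlog):
    """Return the sorted names of WAL segments that exist in both
    directories but whose contents differ.

    Hypothetical helper for illustration only; both arguments are assumed
    to be local paths to the pg_xlog directories.
    """
    source_names = {n for n in os.listdir(source_xlog) if WAL_SEGMENT_RE.match(n)}
    target_names = {n for n in os.listdir(target_xlog) if WAL_SEGMENT_RE.match(n)}

    conflicts = []
    for name in sorted(source_names & target_names):
        # shallow=False forces a byte-by-byte comparison instead of
        # trusting file size and modification time.
        same = filecmp.cmp(
            os.path.join(source_xlog, name),
            os.path.join(target_xlog, name),
            shallow=False,
        )
        if not same:
            conflicts.append(name)
    return conflicts
```

Any name this returns is a segment that would be left behind in the target
with stale content if pg_xlog were ignored entirely.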

Regards,

-- 
Fujii Masao


