pg_rewind exiting with error code 1 when source and target are on the same timeline - Mailing list pgsql-bugs

From Michael Paquier
Subject pg_rewind exiting with error code 1 when source and target are on the same timeline
Date
Msg-id CAB7nPqSyLo4Jzp7-2hJh24YEU99tspkWj7vtj7NYTGcXasw2hg@mail.gmail.com
Responses Re: pg_rewind exiting with error code 1 when source and target are on the same timeline  (Peter Eisentraut <peter_e@gmx.net>)
List pgsql-bugs
Hi all,

A user pinged me internally about the fact that pg_rewind returns 1
as its exit code when the target and source nodes are on the same
timeline. Treating this case as a failure feels odd, since no rewind
should be needed: the node that is behind the other in terms of WAL
replay could simply be reconnected to the other node. This assumes, of
course, that no node has been "promoted" the way repmgr does it, for
example, by deleting recovery.conf and restarting the node, in which
case the target and source would be on the same timeline even though
they have forked. But I think that we should not care about this case
in pg_rewind, as it has been designed to rewind nodes with different
timelines.

My point is that returning 1 in this case prevents users from telling
apart a run of pg_rewind that was not necessary from one that actually
failed. The idea is, for example, to be able to roll in a base backup
or take other actions should a failure occur during a rewind, or if a
node does not satisfy one of the pre-processing sanity checks.
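
To illustrate the use case, here is a minimal sketch of a failover
script that branches on the exit code. It assumes the behavior proposed
here, that is, pg_rewind exiting with 0 when no rewind is needed and
with a non-zero code only on a real failure. The function names and the
action labels ("reconnect_standby", "restore_base_backup") are
hypothetical and only serve the example; --target-pgdata and
--source-server are the existing pg_rewind options.

```python
import subprocess
from typing import Callable, List


def rewind_command(target_pgdata: str, source_conninfo: str) -> List[str]:
    """Build the pg_rewind invocation for a target data directory."""
    return [
        "pg_rewind",
        "--target-pgdata", target_pgdata,
        "--source-server", source_conninfo,
    ]


def next_action(exit_code: int) -> str:
    """Map a pg_rewind exit code to a (hypothetical) follow-up action.

    Assumption: 0 covers both "rewind done" and "no rewind needed",
    so any non-zero code is a genuine failure.
    """
    if exit_code == 0:
        return "reconnect_standby"
    return "restore_base_backup"


def rewind_or_fallback(
    target_pgdata: str,
    source_conninfo: str,
    run: Callable[..., "subprocess.CompletedProcess"] = subprocess.run,
) -> str:
    """Run pg_rewind and decide what to do next from its exit code."""
    result = run(rewind_command(target_pgdata, source_conninfo), check=False)
    return next_action(result.returncode)
```

With the current behavior, the same-timeline case would land in the
"restore_base_backup" branch, which is exactly the ambiguity described
above.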

Attached is a patch aimed at changing that.
Regards,
--
Michael
