Re: [GENERAL] Replication slot and pg_rewind - Mailing list pgsql-general

From Bhattacharyya, Subhro
Subject Re: [GENERAL] Replication slot and pg_rewind
Msg-id 42e276483a4743aa8cc6b4ee06e4645f@sap.com
In response to Re: [GENERAL] Replication slot and pg_rewind  (Michael Paquier <michael.paquier@gmail.com>)
Responses Re: [GENERAL] Replication slot and pg_rewind  (Michael Paquier <michael.paquier@gmail.com>)
List pgsql-general
Our cluster works as follows:

We do not promote the slave while the primary is up.

During an update, when the master goes down, the slave is promoted to master only if there is no replication lag.

As a result, there is no data difference between the two nodes at the moment of promotion.

Transactions then continue on the newly promoted master, which creates a difference in data between the two nodes.

When the original master comes back as a slave after the update, we use pg_rewind instead of taking a pg_basebackup.

Our expectation is that the slave will be able to sync with the new master using whatever WAL is present on the
new master thanks to replication slots.

Can pg_rewind still work without WAL archiving in this scenario?
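
For reference, the rewind step described above can be sketched as follows. This is only an illustration, not our actual tooling: the helper name, the data directory path and the connection string are hypothetical. It assumes the old master has been shut down cleanly and that it runs with wal_log_hints enabled or with data checksums, which pg_rewind requires. The flags used (--target-pgdata, --source-server, --progress, --dry-run) are pg_rewind options.

```python
import shlex
from typing import List


def build_pg_rewind_command(target_pgdata: str, source_conninfo: str,
                            dry_run: bool = True) -> List[str]:
    """Assemble a pg_rewind command line for resyncing the old master.

    target_pgdata   -- data directory of the (stopped) old master
    source_conninfo -- libpq connection string of the newly promoted master
    dry_run         -- if True, add --dry-run so nothing is modified
    """
    cmd = [
        "pg_rewind",
        "--target-pgdata", target_pgdata,
        "--source-server", source_conninfo,
        "--progress",
    ]
    if dry_run:
        cmd.append("--dry-run")
    return cmd


if __name__ == "__main__":
    # Hypothetical path and connection string, for illustration only.
    cmd = build_pg_rewind_command(
        "/var/lib/postgresql/9.6/main",
        "host=new-master port=5432 user=postgres dbname=postgres",
    )
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

Running with dry_run=True first shows whether pg_rewind can find what it needs before anything on the old master is changed.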

Thanks, Subhro

-----Original Message-----
From: Michael Paquier [mailto:michael.paquier@gmail.com] 
Sent: Tuesday, June 6, 2017 8:50 AM
To: Bhattacharyya, Subhro <s.bhattacharyya@sap.com>
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Replication slot and pg_rewind

On Tue, Jun 6, 2017 at 12:03 PM, Bhattacharyya, Subhro
<s.bhattacharyya@sap.com> wrote:
> We are using the replication slot and pg_rewind feature of postgresql 9.6
> Our cluster consists of 1 master and 1 slave node.
>
> The replication slot feature allows the master to keep as much WAL as is
> required by the slave.
>
> The pg_rewind command uses WALs to bring the slave in sync with the master.
> By using replication slots there are always enough WAL in the pg_xlog.
>
> In this case is it safe to use pg_rewind without WAL archiving?
> Can there be a situation where pg_rewind fails?

When pg_rewind runs it looks at the WAL from the last checkpoint
before WAL diverged on the *target* node, not the source. So retaining
the WAL data on the primary after the standby has been promoted makes
little sense from this point of view. Even worse, once the promoted
standby decides to recycle the past WAL segments you won't be able to
do a rewind of the previous primary because there is no way to know
what are the blocks modified on the standby since the point of
divergence.
-- 
Michael
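
As an illustration of the point about the target node's WAL, here is a minimal sketch of a pre-check that lists which WAL segment files would have to be present in the old primary's pg_xlog. It is not what pg_rewind does internally. It assumes the default 16 MB segment size, and it assumes you already know the timeline ID and the LSN of the last checkpoint before the divergence as well as the end of WAL on the old primary; those inputs, and all function names, are hypothetical.

```python
from typing import Iterable, List

WAL_SEGMENT_SIZE = 16 * 1024 * 1024  # assumption: default 16 MB segments
SEGMENTS_PER_XLOG_ID = 0x100000000 // WAL_SEGMENT_SIZE  # 256 for 16 MB segments


def parse_lsn(lsn: str) -> int:
    """Convert an LSN written as 'X/Y' (two hex numbers) to a byte position."""
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def _segment_file_name(timeline: int, segno: int) -> str:
    """Build the 24-character WAL file name for a timeline and segment number."""
    return "%08X%08X%08X" % (
        timeline,
        segno // SEGMENTS_PER_XLOG_ID,
        segno % SEGMENTS_PER_XLOG_ID,
    )


def wal_segment_name(timeline: int, lsn: str) -> str:
    """Name of the WAL segment file that contains the given LSN."""
    return _segment_file_name(timeline, parse_lsn(lsn) // WAL_SEGMENT_SIZE)


def missing_segments(present_files: Iterable[str], timeline: int,
                     start_lsn: str, end_lsn: str) -> List[str]:
    """Return the segment names between start_lsn and end_lsn (inclusive)
    that do not appear in present_files, e.g. os.listdir() of pg_xlog."""
    present = set(present_files)
    first = parse_lsn(start_lsn) // WAL_SEGMENT_SIZE
    last = parse_lsn(end_lsn) // WAL_SEGMENT_SIZE
    return [
        name
        for name in (_segment_file_name(timeline, s) for s in range(first, last + 1))
        if name not in present
    ]
```

For example, calling missing_segments(os.listdir(pg_xlog_dir), 1, checkpoint_lsn, end_lsn) on the old primary and getting an empty list back suggests the segments in that range are still on disk; any name returned is a segment that has already been recycled or removed.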
