On Fri, 10 Jul 2020 10:05:52 +1000
James Sewell <james.sewell@jirotech.com> wrote:
> Hi all,
>
> I’m trying to work out a procedure for a safe zero data loss switchover
> under (high) load, which allows the old master to be reconnected without
> the use of pgrewind.
>
> Would the following be sane?
>
> - open connection to database
> - smart shutdown master
> - terminate all other connections
> - wait for shutdown (archiving will finish)
> - Identify last archived WAL file name (ensure it’s on backup server.)
> - wait till a standby has applied this WAL
> - promote that standby
> - attach old master to new master
During a graceful shutdown (smart or fast), the primary waits for the standby
to catch up with replication before stopping completely. But nothing will save
you if the designated standby disconnects by itself during the shutdown
procedure.
I usually use:

- shut down the primary
- use pg_controldata to check its last redo checkpoint
  and/or use pg_waldump to find its shutdown checkpoint
- check the logs and replication status on the standby and make sure it
  received the shutdown checkpoint (pg_waldump on the standby and/or last
  received LSN)
- promote the standby
- set up the old primary as a standby
- start the old primary as the new standby
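
To illustrate the "make sure it received the shutdown checkpoint" step, here is
a minimal Python sketch of the LSN comparison. It is only an illustration, not
a tool I use: the helper names (parse_lsn, controldata_field,
standby_has_shutdown_checkpoint) are made up for this mail, and it assumes the
English (LC_MESSAGES=C) output of pg_controldata with its "Database cluster
state" and "Latest checkpoint location" lines. Passing this check is necessary
but not sufficient on its own; pg_waldump on the standby is still the way to
confirm that the complete record is there.

```python
def parse_lsn(lsn: str) -> int:
    """Convert a textual LSN such as '0/3000028' into a 64-bit integer."""
    hi, lo = lsn.strip().split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def controldata_field(output: str, label: str) -> str:
    """Return the value of one 'label: value' line from pg_controldata output."""
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == label:
            return value.strip()
    raise KeyError(label)


def standby_has_shutdown_checkpoint(controldata_output: str,
                                    standby_receive_lsn: str) -> bool:
    """Check that the standby received WAL beyond the shutdown checkpoint.

    controldata_output: text printed by pg_controldata on the stopped primary.
    standby_receive_lsn: last received LSN reported by the standby.
    """
    state = controldata_field(controldata_output, "Database cluster state")
    if state != "shut down":
        # Anything else means the old primary did not stop cleanly.
        raise RuntimeError(f"primary is not cleanly shut down: {state!r}")

    checkpoint = parse_lsn(
        controldata_field(controldata_output, "Latest checkpoint location"))
    # The checkpoint location is where the record starts, so the standby
    # must have received WAL strictly past that position.
    return parse_lsn(standby_receive_lsn) > checkpoint
```

The first argument would be the output of pg_controldata "$PGDATA" on the
stopped primary, the second the result of SELECT pg_last_wal_receive_lsn();
on the standby. If the function returns False, do not promote that standby.
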
Regards,