On Thu, Jun 18, 2020 at 8:41 AM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
> ISTM that a clean switchover is possible without "ALTER SYSTEM READ ONLY".
> What about the following procedure?
>
> 1. Demote the primary to a standby. Then this demoted standby is read-only.
> 2. The orignal standby automatically establishes the cascading replication
> connection with the demoted standby.
> 3. Wait for all the WAL records available in the demoted standby to be streamed
> to the orignal standby.
> 4. Promote the original standby to new primary.
> 5. Change primary_conninfo in the demoted standby so that it establishes
> the replication connection with new primary.
>
> So it seems enough to implement "demote" feature for a clean switchover.
There's something to that idea. I think it somewhat depends on how
invasive the various operations are. For example, I'm not really sure
how feasible it is to demote without a full server restart that kicks
out all sessions. If that is required, it's a significant disadvantage
compared to ASRO. On the other hand, if a machine can be demoted just
by kicking out R/W sessions, as ASRO currently does, then maybe
there's not that much difference. Or maybe both designs are subject to
improvement and we can do something even less invasive...
One thing I think people are going to want to do is have the master go
read-only if it loses communication to the rest of the network, to
avoid or at least mitigate split-brain. However, such network
interruptions are often transient, so it might not be uncommon to
briefly go read-only due to a network blip, but then recover quickly
and return to a read-write state. It doesn't seem to matter much
whether that read-only state is a new kind of normal operation (like
what ASRO would do) or whether we've actually returned to a recovery
state (as demote would do) but the collateral effects of the state
change do matter.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company