Re: [patch] demote - Mailing list pgsql-hackers

From Jehan-Guillaume de Rorthais
Subject Re: [patch] demote
Date
Msg-id 20200618175607.3e05e6ee@firost
Whole thread Raw
In response to Re: [patch] demote  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: [patch] demote
List pgsql-hackers
On Thu, 18 Jun 2020 11:22:47 -0400
Robert Haas <robertmhaas@gmail.com> wrote:

> On Thu, Jun 18, 2020 at 8:41 AM Fujii Masao <masao.fujii@oss.nttdata.com>
> wrote:
> > ISTM that a clean switchover is possible without "ALTER SYSTEM READ ONLY".
> > What about the following procedure?
> >
> > 1. Demote the primary to a standby. Then this demoted standby is read-only.
> > 2. The orignal standby automatically establishes the cascading replication
> >     connection with the demoted standby.
> > 3. Wait for all the WAL records available in the demoted standby to be
> > streamed to the orignal standby.
> > 4. Promote the original standby to new primary.
> > 5. Change primary_conninfo in the demoted standby so that it establishes
> >     the replication connection with new primary.
> >
> > So it seems enough to implement "demote" feature for a clean switchover.  
> 
> There's something to that idea. I think it somewhat depends on how
> invasive the various operations are. For example, I'm not really sure
> how feasible it is to demote without a full server restart that kicks
> out all sessions. If that is required, it's a significant disadvantage
> compared to ASRO. On the other hand, if a machine can be demoted just
> by kicking out R/W sessions, as ASRO currently does, then maybe
> there's not that much difference. Or maybe both designs are subject to
> improvement and we can do something even less invasive...

Considering the current demote patch improvement. I was considering to digg in
the following direction:

* add a new state in the state machine where all backends are idle
* this new state forbid any new writes, the same fashion we do on standby nodes
* this state could either wait for end of xact, or cancel/kill
  RW backends, in the same fashion current smart/fast stop do
* from this state, we might then rollback pending prepared xact, stop other
  sub-process etc (as the current patch does), and demote safely to
  PM_RECOVERY or PM_HOT_STANDBY (depending on the setup).

Is it something worth considering?
Maybe the code will be so close from ASRO, it would just be kind of a fusion of
both patch? I don't know, I didn't look at the ASRO patch yet.

> One thing I think people are going to want to do is have the master go
> read-only if it loses communication to the rest of the network, to
> avoid or at least mitigate split-brain. However, such network
> interruptions are often transient, so it might not be uncommon to
> briefly go read-only due to a network blip, but then recover quickly
> and return to a read-write state. It doesn't seem to matter much
> whether that read-only state is a new kind of normal operation (like
> what ASRO would do) or whether we've actually returned to a recovery
> state (as demote would do) but the collateral effects of the state
> change do matter.

Well, triggering such actions (demote or read only) often occurs external
decision, hopefully relying on at least some quorum and being able to escalade
to watchdog or fencing is required.

Most tools around will need to demote or fence. It seems dangerous to flip
between read only/read write on a bunch of cluster nodes. It might be quickly
messy, especially since a former primary with non replicated data could
automatically replicate from a new primary without screaming...

Regards,



pgsql-hackers by date:

Previous
From: Oleg Bartunov
Date:
Subject: Re: jsonpath versus NaN
Next
From: Tom Lane
Date:
Subject: Re: global barrier & atomics in signal handlers (Re: Atomic operations within spinlocks)