Re: [patch] demote - Mailing list pgsql-hackers

From Jehan-Guillaume de Rorthais
Subject Re: [patch] demote
Date
Msg-id 20200618120202.27bedc9c@firost
Whole thread Raw
In response to Re: [patch] demote  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: [patch] demote
List pgsql-hackers
On Wed, 17 Jun 2020 12:29:31 -0400
Robert Haas <robertmhaas@gmail.com> wrote:

[...]
> > Main difference with Amul's patch is that all backends must be disconnected
> > to process with the demote. Either we wait for them to disconnect (smart)
> > or we kill them (fast). This makes life much easier from the code point of
> > view, but much more naive as well. Eg. calling "SELECT pg_demote('fast')"
> > as an admin would kill the session, with no options to wait for the action
> > to finish, as we do with pg_promote(). Keeping read only session around
> > could probably be achieved using global barrier as Amul did, but without
> > all the complexity related to WAL writes prohibition.
> >
> > There's still some questions in the current patch. As I wrote, it's an
> > humble patch, a proof of concept, a bit naive.
> >
> > Does it worth discussing it and improving it further or do I miss something
> > obvious in this design that leads to a dead end?  
> 
> I haven't looked at your code, but I think we should view the two
> efforts as complementing each other, not competing.

That was part of my feeling. I like the idea of keeping readonly backends
around. But I'm not convinced by the "ALTER SYSTEM READ ONLY" feature on its
own.

At some expense, Admin can already set the system as readonly from the
application point of view, using:

  alter system set default_transaction_read_only TO on;
  select pg_reload_conf();

Current RW xact will finish, but no other will be allowed.

> With both patches in play, a clean switchover would look like this:
> 
> - first use ALTER SYSTEM READ ONLY (or whatever we decide to call it)
> to make the primary read only, killing off write transactions
> - next use pg_ctl promote to promote the standby
> - finally use pg_ctl demote (or whatever we decide to call it) to turn
> the read-only primary into a standby of the new primary

I'm not sure how useful ALTER SYSTEM READ ONLY is, outside of the switchover
scope. This seems like it should be included in the demote process itself. If we
focus on user experience, my first original goal was:

* demote the primary
* promote a standby

Later down the path of various additional patches (keep readonly backends, add
pg_demote(), etc), we could extend the replication protocol so a switchover can
be negotiated and controlled from the nodes themselves.

> I think this would be waaaaay better than what you have to do today,
> which as I mentioned in my reply to Tom on the other thread, is very
> complicated and error-prone.

Well, I agree, the current procedure to achieve a clean switchover is
difficult from the user point of view. 

For the record, in PAF (a Pacemaker user agent) we are parsing the pg_waldump
output to check if the designated standby to promote received the shutdown
checkpoint from the primary. If it does, we accept promoting.

Manually, I usually shutdown the primary, checkpoint on standby, compare "REDO
location" from both side, then promote.

> I think with the combination of that
> patch and this one (or some successor version of each) we could get to
> a point where the tooling to do a clean switchover is relatively easy
> to write and doesn't involve having to shut down the server completely
> at any point.

That would be great yes.

> If we can do it while also preserving connections, at
> least for read-only queries, that's a better user experience, but as
> Tom pointed out over there, there are real concerns about the
> complexity of these patches, so it may be that the approach you've
> taken of just killing everything is safer and thus a superior choice
> overall.

As far as this approach doesn't close futur doors to keep readonly backends
around, that might be a good first step.

Thank you!



pgsql-hackers by date:

Previous
From: Amit Kapila
Date:
Subject: Re: [Patch] ALTER SYSTEM READ ONLY
Next
From: David Rowley
Date:
Subject: Re: Parallel Seq Scan vs kernel read ahead