Re: allow online change primary_conninfo - Mailing list pgsql-hackers

From Michael Paquier
Subject Re: allow online change primary_conninfo
Msg-id 20190730064239.GJ1742@paquier.xyz
In response to Re: allow online change primary_conninfo  (Sergei Kornilov <sk@zsrv.org>)
List pgsql-hackers
On Mon, Jul 01, 2019 at 02:33:39PM +0300, Sergei Kornilov wrote:
> Updated version attached. Merge conflict was about tests count in
> 001_stream_rep.pl. Nothing else was changed. My approach can be
> still incorrect, any redesign ideas are welcome. Thanks in advance!

It has been some time, and I am finally catching up with this patch.

+         <para>
+          WAL receiver will be restarted after <varname>primary_slot_name</varname>
+          was changed.
          </para>
The sentence sounds strange.  Here is a suggestion:
The WAL receiver is restarted after an update of primary_slot_name (or
primary_conninfo).

The comment at the top of the call of ProcessStartupSigHup() in
HandleStartupProcInterrupts() needs to be updated as it mentions a
configuration file re-read, but that's not the case anymore.

pendingRestartSource's name does not represent what it does, as it is
only associated with the restart of a WAL receiver when in streaming
state, and that's a no-op for the archive mode and the local mode.

So, the patch splits the steps taken when checking for a WAL source by
adding an extra step after the failure handling that you are calling
the restart step.  When a failure happens for the stream mode
(shutdown of WAL receiver, promotion, etc.), there is a switch to the
archive mode, and nothing is changed in this case in your patch.  So
when shutting down the WAL receiver after a parameter change, what
happens is that the startup process waits for
wal_retrieve_retry_interval before moving back to the archive mode.
Only after scanning the archives again do we restart a new WAL
receiver.  However, if the restart of
the WAL receiver is planned because of an update of primary_conninfo
(or slot), shouldn't the follow-up mode be XLOG_FROM_STREAM without
waiting for wal_retrieve_retry_interval ms for extra WAL to become
available?
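
To make the question concrete, here is a minimal sketch of the two
behaviors being compared.  It is not the code of the patch or of
xlog.c: WalSource, SourceDecision and
next_source_after_walreceiver_stop() are hypothetical names made up for
illustration, and the model only covers what is described above (which
source is tried next once the WAL receiver stops, and how long the
startup process waits first).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for the WAL sources discussed above. */
typedef enum
{
	SRC_ARCHIVE,				/* plays the role of XLOG_FROM_ARCHIVE */
	SRC_STREAM					/* plays the role of XLOG_FROM_STREAM */
} WalSource;

/* What the startup process does next: which source, after what delay. */
typedef struct
{
	WalSource	next_source;
	long		wait_ms;
} SourceDecision;

/*
 * Decide the follow-up step once the WAL receiver has stopped while in
 * streaming state.
 *
 * planned_restart is true when the stop was requested because
 * primary_conninfo or primary_slot_name changed, and false for a real
 * failure.  retry_interval_ms models wal_retrieve_retry_interval.
 */
SourceDecision
next_source_after_walreceiver_stop(bool planned_restart, long retry_interval_ms)
{
	SourceDecision d;

	assert(retry_interval_ms >= 0);

	if (planned_restart)
	{
		/* Suggested behavior: go straight back to streaming, no wait. */
		d.next_source = SRC_STREAM;
		d.wait_ms = 0;
	}
	else
	{
		/* Failure path as described above: wait, then rescan the archives. */
		d.next_source = SRC_ARCHIVE;
		d.wait_ms = retry_interval_ms;
	}
	return d;
}
```

With the patch as it stands, a parameter change goes through the same
path as planned_restart = false here, which is where the extra wait
comes from.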
--
Michael
