Re: allow online change primary_conninfo - Mailing list pgsql-hackers
From | Kyotaro Horiguchi |
---|---|
Subject | Re: allow online change primary_conninfo |
Date | |
Msg-id | 20190919.181133.15766613.horikyota.ntt@gmail.com Whole thread Raw |
In response to | Re: allow online change primary_conninfo (Sergei Kornilov <sk@zsrv.org>) |
Responses |
Re: allow online change primary_conninfo
|
List | pgsql-hackers |
Hello. I looked this and have some comments. At Wed, 28 Aug 2019 12:49:46 +0300, Sergei Kornilov <sk@zsrv.org> wrote in <34084371566985786@sas1-0a6c2e2b59d7.qloud-c.yandex.net> sk> Hello sk> sk> Updated patch attached. (also I merged into one file) sk> sk> > + <para> sk> > + WAL receiver will be restarted after <varname>primary_slot_name</varname> sk> > + was changed. sk> > </para> sk> > The sentence sounds strange. Here is a suggestion: sk> > The WAL receiver is restarted after an update of primary_slot_name (or sk> > primary_conninfo). sk> sk> Changed. - This parameter can only be set at server start. + This parameter can only be set in the <filename>postgresql.conf</filename> + file or on the server command line. I'm not sure it's good to change the context of this description. This was mentioning that changing of this parameter requires server (re)start. So if we want to be on the same context after rewriting, it would be like "This parameter can be set any time and causes WAL receiver restart with the new setting if the server is in standby mode." And If I'm not missing something, I don't find an (explict) paramter of postmaster for setting primary_conninfo. sk> > The comment at the top of the call of ProcessStartupSigHup() in sk> > HandleStartupProcInterrupts() needs to be updated as it mentions a sk> > configuration file re-read, but that's not the case anymore. sk> sk> Changed to "Process any requests or signals received recently." like in other places, e.g. syslogger sk> sk> > pendingRestartSource's name does not represent what it does, as it is sk> > only associated with the restart of a WAL receiver when in streaming sk> > state, and that's a no-op for the archive mode and the local mode. sk> sk> I renamed to pendingWalRcvRestart and replaced switch with simple condition. sk> sk> > So when shutting down the WAL receiver after a parameter change, what sk> > happens is that the startup process waits for retrieve_retry_interval sk> > before moving back to the archive mode. Only after scanning again the sk> > archives do we restart a new WAL receiver. sk> sk> As I answered earlier, here is no switch to archive or wal_retrieve_retry_interval waiting in my patch. I recheck oncurrent revision too: @@ -11787,48 +11793,6 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess, if (!StandbyMode) return false; - /* - * If primary_conninfo is set, launch walreceiver to try - * to stream the missing WAL. - * - * If fetching_ckpt is true, RecPtr points to the initial - * checkpoint location. In that case, we use RedoStartLSN Mentioning code, the movement of the code block makes the surrounding swtich almost useless and the structure and the result looks somewhat strange. Couldn't we do the same thing by just skipping the wait and setting lastSourceFailed to true in the case of intentional walreceiver restart? The attached is an incomplete (fraction) patch to sketch what is in my mind. With something like this and making ProcessStartupSigHup kill walreceiver would work. regards. -- Kyotaro Horiguchi NTT Open Source Software Center diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index 6876537b62..61b235f8f7 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -11881,10 +11881,12 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess, * obtaining the requested WAL. We're going to loop back * and retry from the archive, but if it hasn't been long * since last attempt, sleep wal_retrieve_retry_interval - * milliseconds to avoid busy-waiting. + * milliseconds to avoid busy-waiting. We don't wait if + * explicitly requested to restart. */ now = GetCurrentTimestamp(); - if (!TimestampDifferenceExceeds(last_fail_time, now, + if (!pendingWalRcvRestart && + !TimestampDifferenceExceeds(last_fail_time, now, wal_retrieve_retry_interval)) { long secs, @@ -11905,6 +11907,17 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess, } last_fail_time = now; currentSource = XLOG_FROM_ARCHIVE; + + /* + * If wal receiver is requested to restart, we skip the + * next XLOG_FROM_ARCHIVE to immediately starting it. + */ + if (pendingWalRcvRestart) + { + lastSourceFailed = true; + continue; + } + break; default:
pgsql-hackers by date: