Re: [patch] demote - Mailing list pgsql-hackers
From | Kyotaro Horiguchi |
---|---|
Subject | Re: [patch] demote |
Date | |
Msg-id | 20200626.161438.1461210156318414119.horikyota.ntt@gmail.com Whole thread Raw |
In response to | Re: [patch] demote (Jehan-Guillaume de Rorthais <jgdr@dalibo.com>) |
Responses |
Re: [patch] demote
Re: [patch] demote |
List | pgsql-hackers |
Hello. At Thu, 25 Jun 2020 19:27:54 +0200, Jehan-Guillaume de Rorthais <jgdr@dalibo.com> wrote in > Here is a summary of my work during the last few days on this demote approach. > > Please, find in attachment v2-0001-Demote-PoC.patch and the comments in the > commit message and as FIXME in code. > > The patch is not finished or bug-free yet, I'm still not very happy with the > coding style, it probably lack some more code documentation, but a lot has > changed since v1. It's still a PoC to push the discussion a bit further after > being myself silent for some days. > > The patch is currently relying on a demote checkpoint. I understand a forced > checkpoint overhead can be massive and cause major wait/downtime. But I keep > this for a later step. Maybe we should be able to cancel a running checkpoint? > Or leave it to its synching work but discard the result without wirting it to > XLog? If we are going to dive so close to server shutdown, we can just utilize the restart-after-crash path, which we can assume to work reliably. The attached is a quite rough sketch, hijacking smart shutdown path for a convenience, of that but seems working. "pg_ctl -m s -W stop" lets server demote. > I hadn't time to investigate Robert's concern about shared memory for snapshot > during recovery. The patch does all required clenaup of resources including shared memory, I believe. It's enough if we don't need to keep any resources alive? > The patch doesn't deal with prepared xact yet. Testing "start->demote->promote" > raise an assert if some prepared xact exist. I suppose I will rollback them > during demote in next patch version. > > I'm not sure how to divide this patch in multiple small independent steps. I > suppose I can split it like: > > 1. add demote checkpoint > 2. support demote: mostly postmaster, startup/xlog and checkpointer related > code > 3. cli using pg_ctl demote > > ...But I'm not sure it worth it. regards. -- Kyotaro Horiguchi NTT Open Source Software Center diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c index b4d475bb0b..a4adf3e587 100644 --- a/src/backend/postmaster/postmaster.c +++ b/src/backend/postmaster/postmaster.c @@ -2752,6 +2752,7 @@ SIGHUP_handler(SIGNAL_ARGS) /* * pmdie -- signal handler for processing various postmaster signals. */ +static bool demoting = false; static void pmdie(SIGNAL_ARGS) { @@ -2774,59 +2775,17 @@ pmdie(SIGNAL_ARGS) case SIGTERM: /* - * Smart Shutdown: + * XXX: Hijacked as DEMOTE * - * Wait for children to end their work, then shut down. + * Runs fast shutdown, then restart as standby */ if (Shutdown >= SmartShutdown) break; Shutdown = SmartShutdown; ereport(LOG, - (errmsg("received smart shutdown request"))); - - /* Report status */ - AddToDataDirLockFile(LOCK_FILE_LINE_PM_STATUS, PM_STATUS_STOPPING); -#ifdef USE_SYSTEMD - sd_notify(0, "STOPPING=1"); -#endif - - if (pmState == PM_RUN || pmState == PM_RECOVERY || - pmState == PM_HOT_STANDBY || pmState == PM_STARTUP) - { - /* autovac workers are told to shut down immediately */ - /* and bgworkers too; does this need tweaking? */ - SignalSomeChildren(SIGTERM, - BACKEND_TYPE_AUTOVAC | BACKEND_TYPE_BGWORKER); - /* and the autovac launcher too */ - if (AutoVacPID != 0) - signal_child(AutoVacPID, SIGTERM); - /* and the bgwriter too */ - if (BgWriterPID != 0) - signal_child(BgWriterPID, SIGTERM); - /* and the walwriter too */ - if (WalWriterPID != 0) - signal_child(WalWriterPID, SIGTERM); - - /* - * If we're in recovery, we can't kill the startup process - * right away, because at present doing so does not release - * its locks. We might want to change this in a future - * release. For the time being, the PM_WAIT_READONLY state - * indicates that we're waiting for the regular (read only) - * backends to die off; once they do, we'll kill the startup - * and walreceiver processes. - */ - pmState = (pmState == PM_RUN) ? - PM_WAIT_BACKUP : PM_WAIT_READONLY; - } - - /* - * Now wait for online backup mode to end and backends to exit. If - * that is already the case, PostmasterStateMachine will take the - * next step. - */ - PostmasterStateMachine(); - break; + (errmsg("received demote request"))); + demoting = true; + /* FALL THROUGH */ case SIGINT: @@ -2839,8 +2798,10 @@ pmdie(SIGNAL_ARGS) if (Shutdown >= FastShutdown) break; Shutdown = FastShutdown; - ereport(LOG, - (errmsg("received fast shutdown request"))); + + if (!demoting) + ereport(LOG, + (errmsg("received fast shutdown request"))); /* Report status */ AddToDataDirLockFile(LOCK_FILE_LINE_PM_STATUS, PM_STATUS_STOPPING); @@ -2887,6 +2848,13 @@ pmdie(SIGNAL_ARGS) pmState = PM_WAIT_BACKENDS; } + /* create standby signal file */ + { + FILE *standby_file = AllocateFile(STANDBY_SIGNAL_FILE, "w"); + + Assert (standby_file && !FreeFile(standby_file)); + } + /* * Now wait for backends to exit. If there are none, * PostmasterStateMachine will take the next step. @@ -3958,7 +3926,7 @@ PostmasterStateMachine(void) * EOF on its input pipe, which happens when there are no more upstream * processes. */ - if (Shutdown > NoShutdown && pmState == PM_NO_CHILDREN) + if (!demoting && Shutdown > NoShutdown && pmState == PM_NO_CHILDREN) { if (FatalError) { @@ -3996,13 +3964,23 @@ PostmasterStateMachine(void) ExitPostmaster(1); /* - * If we need to recover from a crash, wait for all non-syslogger children - * to exit, then reset shmem and StartupDataBase. + * If we need to recover from a crash or demoting, wait for all + * non-syslogger children to exit, then reset shmem and StartupDataBase. */ - if (FatalError && pmState == PM_NO_CHILDREN) + if ((demoting || FatalError) && pmState == PM_NO_CHILDREN) { - ereport(LOG, - (errmsg("all server processes terminated; reinitializing"))); + if (demoting) + ereport(LOG, + (errmsg("all server processes terminated; starting as standby"))); + else + ereport(LOG, + (errmsg("all server processes terminated; reinitializing"))); + + if (demoting) + { + Shutdown = NoShutdown; + demoting = false; + } /* allow background workers to immediately restart */ ResetBackgroundWorkerCrashTimes();
pgsql-hackers by date: