Re: [patch] demote - Mailing list pgsql-hackers
| From | Kyotaro Horiguchi |
|---|---|
| Subject | Re: [patch] demote |
| Date | |
| Msg-id | 20200626.161438.1461210156318414119.horikyota.ntt@gmail.com |
| In response to | Re: [patch] demote (Jehan-Guillaume de Rorthais <jgdr@dalibo.com>) |
| List | pgsql-hackers |
Hello.
At Thu, 25 Jun 2020 19:27:54 +0200, Jehan-Guillaume de Rorthais <jgdr@dalibo.com> wrote in
> Here is a summary of my work during the last few days on this demote approach.
>
> Please, find in attachment v2-0001-Demote-PoC.patch and the comments in the
> commit message and as FIXME in code.
>
> The patch is not finished or bug-free yet, I'm still not very happy with the
> coding style, it probably lacks some more code documentation, but a lot has
> changed since v1. It's still a PoC to push the discussion a bit further after
> being myself silent for some days.
>
> The patch is currently relying on a demote checkpoint. I understand a forced
> checkpoint overhead can be massive and cause major wait/downtime. But I keep
> this for a later step. Maybe we should be able to cancel a running checkpoint?
> Or leave it to its syncing work but discard the result without writing it to
> XLog?
If we are going to dive so close to server shutdown, we can just
utilize the restart-after-crash path, which we can assume to work
reliably. The attached is a quite rough sketch of that, hijacking the
smart shutdown path for convenience, but it seems to work. "pg_ctl
-m s -W stop" makes the server demote.
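
To illustrate the control flow, here is a minimal toy model of the idea
in C. It is only a sketch, not postmaster code: every Toy*/toy_* name is
hypothetical, the states are reduced to the few that matter here, and
the standby signal file is modeled as a boolean that the toy sets only
on a demote request. A demote request runs the fast-shutdown sequence,
but once all children are gone the state machine takes the
reinitialize branch instead of exiting.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, reduced stand-ins for the postmaster's state variables. */
typedef enum { TOY_PM_RUN, TOY_PM_WAIT_BACKENDS,
               TOY_PM_NO_CHILDREN, TOY_PM_STARTUP } ToyPMState;
typedef enum { TOY_NO_SHUTDOWN, TOY_SMART_SHUTDOWN,
               TOY_FAST_SHUTDOWN } ToyShutdown;

typedef struct
{
    ToyPMState  pm_state;
    ToyShutdown shutdown;
    bool        demoting;
    bool        fatal_error;
    bool        standby_signal; /* models the standby signal file existing */
    bool        exited;         /* models the postmaster exiting */
} ToyPostmaster;

/* Models the signal handler: demote falls through to fast shutdown. */
static void
toy_request_stop(ToyPostmaster *pm, bool demote)
{
    pm->demoting = demote;
    pm->shutdown = TOY_FAST_SHUTDOWN;
    pm->pm_state = TOY_PM_WAIT_BACKENDS;
    if (demote)
        pm->standby_signal = true;  /* next startup runs as a standby */
}

/* Models the state machine step once every child process has exited. */
static void
toy_state_machine(ToyPostmaster *pm)
{
    if (pm->pm_state != TOY_PM_NO_CHILDREN)
        return;

    /* Plain shutdown: nothing left to wait for, so exit. */
    if (!pm->demoting && pm->shutdown > TOY_NO_SHUTDOWN)
    {
        pm->exited = true;
        return;
    }

    /* Crash recovery or demote: reinitialize and start up again. */
    if (pm->demoting || pm->fatal_error)
    {
        if (pm->demoting)
        {
            pm->shutdown = TOY_NO_SHUTDOWN;
            pm->demoting = false;
        }
        pm->fatal_error = false;
        pm->pm_state = TOY_PM_STARTUP;
    }
}

/* Runs one stop request through to the point where all children are gone. */
static ToyPostmaster
toy_run(bool demote)
{
    ToyPostmaster pm = { TOY_PM_RUN, TOY_NO_SHUTDOWN,
                         false, false, false, false };

    toy_request_stop(&pm, demote);
    pm.pm_state = TOY_PM_NO_CHILDREN;   /* all children have exited */
    toy_state_machine(&pm);
    return pm;
}

/* Small demonstration of the two outcomes. */
void
toy_demo(void)
{
    ToyPostmaster demoted = toy_run(true);
    ToyPostmaster stopped = toy_run(false);

    assert(!demoted.exited && demoted.pm_state == TOY_PM_STARTUP);
    assert(demoted.standby_signal);
    assert(stopped.exited);
}
```

In this model toy_run(true) ends in the startup state with the standby
flag set, while toy_run(false) ends with the toy postmaster exited.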
> I hadn't time to investigate Robert's concern about shared memory for snapshot
> during recovery.
The patch does all required cleanup of resources, including shared
memory, I believe. Isn't that enough, as long as we don't need to keep
any resources alive?
> The patch doesn't deal with prepared xacts yet. Testing "start->demote->promote"
> raises an assert if some prepared xacts exist. I suppose I will roll them back
> during demote in the next patch version.
>
> I'm not sure how to divide this patch in multiple small independent steps. I
> suppose I can split it like:
>
> 1. add demote checkpoint
> 2. support demote: mostly postmaster, startup/xlog and checkpointer related
> code
> 3. cli using pg_ctl demote
>
> ...But I'm not sure it's worth it.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index b4d475bb0b..a4adf3e587 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -2752,6 +2752,7 @@ SIGHUP_handler(SIGNAL_ARGS)
/*
* pmdie -- signal handler for processing various postmaster signals.
*/
+static bool demoting = false;
static void
pmdie(SIGNAL_ARGS)
{
@@ -2774,59 +2775,17 @@ pmdie(SIGNAL_ARGS)
case SIGTERM:
/*
- * Smart Shutdown:
+ * XXX: Hijacked as DEMOTE
*
- * Wait for children to end their work, then shut down.
+ * Runs fast shutdown, then restart as standby
*/
if (Shutdown >= SmartShutdown)
break;
Shutdown = SmartShutdown;
ereport(LOG,
- (errmsg("received smart shutdown request")));
-
- /* Report status */
- AddToDataDirLockFile(LOCK_FILE_LINE_PM_STATUS, PM_STATUS_STOPPING);
-#ifdef USE_SYSTEMD
- sd_notify(0, "STOPPING=1");
-#endif
-
- if (pmState == PM_RUN || pmState == PM_RECOVERY ||
- pmState == PM_HOT_STANDBY || pmState == PM_STARTUP)
- {
- /* autovac workers are told to shut down immediately */
- /* and bgworkers too; does this need tweaking? */
- SignalSomeChildren(SIGTERM,
- BACKEND_TYPE_AUTOVAC | BACKEND_TYPE_BGWORKER);
- /* and the autovac launcher too */
- if (AutoVacPID != 0)
- signal_child(AutoVacPID, SIGTERM);
- /* and the bgwriter too */
- if (BgWriterPID != 0)
- signal_child(BgWriterPID, SIGTERM);
- /* and the walwriter too */
- if (WalWriterPID != 0)
- signal_child(WalWriterPID, SIGTERM);
-
- /*
- * If we're in recovery, we can't kill the startup process
- * right away, because at present doing so does not release
- * its locks. We might want to change this in a future
- * release. For the time being, the PM_WAIT_READONLY state
- * indicates that we're waiting for the regular (read only)
- * backends to die off; once they do, we'll kill the startup
- * and walreceiver processes.
- */
- pmState = (pmState == PM_RUN) ?
- PM_WAIT_BACKUP : PM_WAIT_READONLY;
- }
-
- /*
- * Now wait for online backup mode to end and backends to exit. If
- * that is already the case, PostmasterStateMachine will take the
- * next step.
- */
- PostmasterStateMachine();
- break;
+ (errmsg("received demote request")));
+ demoting = true;
+ /* FALL THROUGH */
case SIGINT:
@@ -2839,8 +2798,10 @@ pmdie(SIGNAL_ARGS)
if (Shutdown >= FastShutdown)
break;
Shutdown = FastShutdown;
- ereport(LOG,
- (errmsg("received fast shutdown request")));
+
+ if (!demoting)
+ ereport(LOG,
+ (errmsg("received fast shutdown request")));
/* Report status */
AddToDataDirLockFile(LOCK_FILE_LINE_PM_STATUS, PM_STATUS_STOPPING);
@@ -2887,6 +2848,13 @@ pmdie(SIGNAL_ARGS)
pmState = PM_WAIT_BACKENDS;
}
+ /* create standby signal file */
+ {
+ FILE *standby_file = AllocateFile(STANDBY_SIGNAL_FILE, "w");
+
+ Assert (standby_file && !FreeFile(standby_file));
+ }
+
/*
* Now wait for backends to exit. If there are none,
* PostmasterStateMachine will take the next step.
@@ -3958,7 +3926,7 @@ PostmasterStateMachine(void)
* EOF on its input pipe, which happens when there are no more upstream
* processes.
*/
- if (Shutdown > NoShutdown && pmState == PM_NO_CHILDREN)
+ if (!demoting && Shutdown > NoShutdown && pmState == PM_NO_CHILDREN)
{
if (FatalError)
{
@@ -3996,13 +3964,23 @@ PostmasterStateMachine(void)
ExitPostmaster(1);
/*
- * If we need to recover from a crash, wait for all non-syslogger children
- * to exit, then reset shmem and StartupDataBase.
+ * If we need to recover from a crash or demoting, wait for all
+ * non-syslogger children to exit, then reset shmem and StartupDataBase.
*/
- if (FatalError && pmState == PM_NO_CHILDREN)
+ if ((demoting || FatalError) && pmState == PM_NO_CHILDREN)
{
- ereport(LOG,
- (errmsg("all server processes terminated; reinitializing")));
+ if (demoting)
+ ereport(LOG,
+ (errmsg("all server processes terminated; starting as standby")));
+ else
+ ereport(LOG,
+ (errmsg("all server processes terminated; reinitializing")));
+
+ if (demoting)
+ {
+ Shutdown = NoShutdown;
+ demoting = false;
+ }
/* allow background workers to immediately restart */
ResetBackgroundWorkerCrashTimes();