Checkpointer on hot standby runs without looking checkpoint_segments - Mailing list pgsql-hackers

From Kyotaro HORIGUCHI
Subject Checkpointer on hot standby runs without looking checkpoint_segments
Date
Msg-id 20120608.171448.141651827.horiguchi.kyotaro@lab.ntt.co.jp
In response to Re: [BUG] Checkpointer on hot standby runs without looking checkpoint_segments  (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>)
Responses Re: Checkpointer on hot standby runs without looking checkpoint_segments
Re: Checkpointer on hot standby runs without looking checkpoint_segments
List pgsql-hackers
Hello, I am picking this patch up again for this CF.

The requirements for this patch are as follows.

- What I want is for checkpoint progression to behave the same way on the master and on a (hot) standby.
Specifically, checkpoints under streaming replication should proceed at the pace governed by checkpoint_segments.
The point of this patch is to keep an unexpectedly large number of WAL segments from accumulating on the standby
side. (It also increases the chance of skipping the end-of-recovery checkpoint with my other patch.)
 

- This patch shouldn't affect archive recovery (excluding streaming). Checkpoint activity while recovering from
the WAL archive (precisely, during archive recovery with no WAL receiver launched) stays limited to the
checkpoint_timeout level, as before.
 

- It might be better if the acceleration could be disabled, but this patch does not provide that. Is it needed?
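
For reference, the segment-based pacing that the first requirement refers to can be sketched roughly as below. This is only a minimal standalone model, not the server code: RecPtr, SEG_SIZE, SEGS_PER_FILE, elapsed_segments() and on_schedule_by_segments() are names made up for illustration, a 16MB segment size is assumed, and the scaling by checkpoint_completion_target that the real IsCheckpointOnSchedule() applies is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the two-part WAL location (log file id, byte offset). */
typedef struct
{
    uint32_t    xlogid;
    uint32_t    xrecoff;
} RecPtr;

#define SEG_SIZE        (16u * 1024 * 1024)         /* assumed 16MB segments */
#define SEGS_PER_FILE   (0xFFFFFFFFu / SEG_SIZE)    /* 255 under that assumption */

/* Number of WAL segments consumed between two locations, as a fraction. */
static double
elapsed_segments(RecPtr start, RecPtr now)
{
    return ((double) (int32_t) (now.xlogid - start.xlogid)) * SEGS_PER_FILE +
           ((double) now.xrecoff - (double) start.xrecoff) / SEG_SIZE;
}

/*
 * True if the checkpoint's write progress (0.0 .. 1.0) is not behind the
 * share of checkpoint_segments consumed since the checkpoint started.
 */
static bool
on_schedule_by_segments(double progress, RecPtr start, RecPtr now,
                        int checkpoint_segments)
{
    assert(checkpoint_segments > 0);
    return progress >= elapsed_segments(start, now) / checkpoint_segments;
}
```

On the master, "now" would be the WAL insert location; the request here is that a standby under streaming replication feeds the replay location into the same kind of test, so a restartpoint speeds up when replay runs ahead of it.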


After considering the past discussion and the other patch I'm
going to put on the table, the outline of this patch is as
follows.

- Check whether we are under streaming replication with a new function, WalRcvStarted(), which tells whether the
WAL receiver has been launched so far.
 
 - The implementation of WalRcvStarted() has changed from the previous one. The flag is now turned on in
WalReceiverMain, at the point where walRcvState becomes WALRCV_RUNNING. The previous implementation, which read
WalRcvInProgress(), was useless for my other patch.
 

- Determine whether to delay the checkpoint using GetXLogReplayRecPtr() instead of GetInsertRecPtr() when
WalRcvStarted() returns true.
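
The outline above boils down to a small decision, which can be sketched standalone as follows. This is a simplified model under stated assumptions, not the patch itself: PaceSource, SharedWalRcv, walrcv_started() and choose_pace_source() are hypothetical names, and the spinlock that protects the shared flag in the server is omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Which quantity governs checkpoint pacing (hypothetical enum). */
typedef enum
{
    PACE_BY_INSERT,     /* master: WAL insert location */
    PACE_BY_REPLAY,     /* standby with streaming: WAL replay location */
    PACE_BY_TIME_ONLY   /* plain archive recovery: checkpoint_timeout only */
} PaceSource;

/* Hypothetical model of the shared flag; it is set once and never cleared. */
typedef struct
{
    bool        started;
} SharedWalRcv;

/* Process-local cache: once true is observed, shared state is not read again. */
static bool local_started = false;

static bool
walrcv_started(const SharedWalRcv *shared)
{
    assert(shared != NULL);
    if (!local_started)
        local_started = shared->started;    /* the server takes a spinlock here */
    return local_started;
}

static PaceSource
choose_pace_source(bool recovery_in_progress, bool receiver_started)
{
    if (!recovery_in_progress)
        return PACE_BY_INSERT;
    if (receiver_started)
        return PACE_BY_REPLAY;
    return PACE_BY_TIME_ONLY;
}
```

Because the flag only ever goes from false to true, caching it locally is safe, and it keeps returning true even after the WAL receiver has exited.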

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

== My e-mail address has been changed since Apr. 1, 2012.
diff --git a/src/backend/postmaster/checkpointer.c b/src/backend/postmaster/checkpointer.c
index 6aeade9..cb2509a 100644
--- a/src/backend/postmaster/checkpointer.c
+++ b/src/backend/postmaster/checkpointer.c
@@ -46,6 +46,7 @@
 #include "miscadmin.h"
 #include "pgstat.h"
 #include "postmaster/bgwriter.h"
+#include "replication/walreceiver.h"
 #include "replication/syncrep.h"
 #include "storage/bufmgr.h"
 #include "storage/ipc.h"
@@ -493,8 +494,8 @@ CheckpointerMain(void)
              * Initialize checkpointer-private variables used during checkpoint
              */
             ckpt_active = true;
-            if (!do_restartpoint)
-                ckpt_start_recptr = GetInsertRecPtr();
+            ckpt_start_recptr =
+                do_restartpoint ? GetXLogReplayRecPtr(NULL) : GetInsertRecPtr();
             ckpt_start_time = now;
             ckpt_cached_elapsed = 0;
 
@@ -747,6 +748,7 @@ IsCheckpointOnSchedule(double progress)
     struct timeval now;
     double        elapsed_xlogs,
                 elapsed_time;
+    bool        recovery_in_progress;
 
     Assert(ckpt_active);
 
@@ -763,18 +765,26 @@ IsCheckpointOnSchedule(double progress)
         return false;
 
     /*
-     * Check progress against WAL segments written and checkpoint_segments.
+     * Check progress against WAL segments written, or replayed for
+     * hot standby, and checkpoint_segments.
      *
      * We compare the current WAL insert location against the location
      * computed before calling CreateCheckPoint. The code in XLogInsert that
      * actually triggers a checkpoint when checkpoint_segments is exceeded
-     * compares against RedoRecptr, so this is not completely accurate.
-     * However, it's good enough for our purposes, we're only calculating an
-     * estimate anyway.
+     * compares against RedoRecPtr.  Similarly, we consult WAL replay location
+     * instead on hot standbys and XLogPageRead compares it against RedoRecPtr,
+     * too.  Although these are not completely accurate, it's good enough for
+     * our purposes, we're only calculating an estimate anyway.
+     */
+
+    /*
+     * Inhibit governing progress by segments in archive recovery.
      */
-    if (!RecoveryInProgress())
+    recovery_in_progress = RecoveryInProgress();
+    if (!recovery_in_progress || WalRcvStarted())
     {
-        recptr = GetInsertRecPtr();
+        recptr = recovery_in_progress ? GetXLogReplayRecPtr(NULL) :
+            GetInsertRecPtr();
         elapsed_xlogs =
             (((double) (int32) (recptr.xlogid - ckpt_start_recptr.xlogid)) * XLogSegsPerFile +
              ((double) recptr.xrecoff - (double) ckpt_start_recptr.xrecoff) / XLogSegSize) /
diff --git a/src/backend/replication/walreceiver.c b/src/backend/replication/walreceiver.c
index d63ff29..7d57ad7 100644
--- a/src/backend/replication/walreceiver.c
+++ b/src/backend/replication/walreceiver.c
@@ -215,6 +215,7 @@ WalReceiverMain(void)
     /* Advertise our PID so that the startup process can kill us */
     walrcv->pid = MyProcPid;
     walrcv->walRcvState = WALRCV_RUNNING;
+    walrcv->started = true;
 
     /* Fetch information required to start streaming */
     strlcpy(conninfo, (char *) walrcv->conninfo, MAXCONNINFO);
diff --git a/src/backend/replication/walreceiverfuncs.c b/src/backend/replication/walreceiverfuncs.c
index f8dd523..c3b26e9 100644
--- a/src/backend/replication/walreceiverfuncs.c
+++ b/src/backend/replication/walreceiverfuncs.c
@@ -31,6 +31,7 @@
 #include "utils/timestamp.h"
 
 WalRcvData *WalRcv = NULL;
+static bool localWalRcvStarted = false;
 
 /*
  * How long to wait for walreceiver to start up after requesting
@@ -167,6 +168,25 @@ ShutdownWalRcv(void)
 }
 
 /*
+ * Returns true if WAL receiver has been launched so far regardless of current
+ * state.
+ */
+bool
+WalRcvStarted(void)
+{
+    /* WalRcv->started changes one way throughout the server life */
+    if (!localWalRcvStarted)
+    {
+        volatile WalRcvData *walrcv = WalRcv;
+
+        SpinLockAcquire(&walrcv->mutex);
+        localWalRcvStarted = walrcv->started;
+        SpinLockRelease(&walrcv->mutex);
+    }
+    return localWalRcvStarted;
+}
+
+/*
  * Request postmaster to start walreceiver.
  *
  * recptr indicates the position where streaming should begin, and conninfo
diff --git a/src/include/replication/walreceiver.h b/src/include/replication/walreceiver.h
index 68c8647..24901be 100644
--- a/src/include/replication/walreceiver.h
+++ b/src/include/replication/walreceiver.h
@@ -53,6 +53,7 @@ typedef struct
      */
     pid_t        pid;
     WalRcvState walRcvState;
+    bool        started;
     pg_time_t    startTime;
 
     /*
@@ -116,6 +117,7 @@ extern Size WalRcvShmemSize(void);
 extern void WalRcvShmemInit(void);
 extern void ShutdownWalRcv(void);
 extern bool WalRcvInProgress(void);
+extern bool WalRcvStarted(void);
 extern void RequestXLogStreaming(XLogRecPtr recptr, const char *conninfo);
 extern XLogRecPtr GetWalRcvWriteRecPtr(XLogRecPtr *latestChunkStart);
 extern int GetReplicationApplyDelay(void);
