Skip checkpoint on promoting from streaming replication - Mailing list pgsql-hackers

From Kyotaro HORIGUCHI
Subject Skip checkpoint on promoting from streaming replication
Date
Msg-id 20120608.172201.126345187.horiguchi.kyotaro@lab.ntt.co.jp
Responses Re: Skip checkpoint on promoting from streaming replication
List pgsql-hackers
Hello,

I have a problem with promotion from hot standby: the exclusive
checkpoint performed at that point delays completion of the promotion.

By convention this checkpoint is a "shutdown checkpoint", because of
its relation to the TLI increment, as the comment quoted below explains.
I take "shutdown checkpoint" to mean an exclusive checkpoint, in
other words a checkpoint during which no WAL is inserted.

>      * one. This is not particularly critical, but since we may be
>      * assigning a new TLI, using a shutdown checkpoint allows us to have
>      * the rule that TLI only changes in shutdown checkpoints, which
>      * allows some extra error checking in xlog_redo.

Given this, I suppose we can omit the checkpoint if the latest
checkpoint has been taken such that crash recovery remains possible
afterwards. My other patch, for checkpoint_segments on the standby,
could ensure this condition.

With this patch applied, the checkpoint after archive recovery, near
the end of StartupXLOG(), is skipped when both of the following
conditions hold:

- The WAL receiver has been launched at some point (checked with WalRcvStarted()).

- XLogCheckpointNeeded() for replayEndRecPtr reports that no checkpoint is needed.
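
To make the two conditions concrete, here is a simplified stand-alone
model of the decision. It is only a sketch and not code from xlog.c: the
function names are hypothetical, segment numbers are plain 64-bit
counters instead of (log id, segment) pairs, and the threshold of
checkpoint_segments - 1 segments past the redo segment is an assumption
made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Simplified model of the skip decision; NOT PostgreSQL code.
 * All names are hypothetical and exist only for this illustration.
 */

/*
 * Returns true once replay has advanced at least (checkpoint_segments - 1)
 * segments past the segment holding the last checkpoint's redo pointer.
 * The exact threshold is an assumption of this sketch.
 */
bool
model_checkpoint_needed(uint64_t redo_segno, uint64_t replay_end_segno,
                        int checkpoint_segments)
{
    assert(checkpoint_segments >= 1);
    return replay_end_segno >= redo_segno + (uint64_t) (checkpoint_segments - 1);
}

/*
 * The end-of-recovery checkpoint is skipped only if the WAL receiver was
 * started (streaming replication) AND no checkpoint is currently due.
 */
bool
model_skip_end_of_recovery_checkpoint(bool walrcv_started,
                                      uint64_t redo_segno,
                                      uint64_t replay_end_segno,
                                      int checkpoint_segments)
{
    if (!walrcv_started)
        return false;           /* plain archive recovery: keep the checkpoint */

    return !model_checkpoint_needed(redo_segno, replay_end_segno,
                                    checkpoint_segments);
}
```

In this model, with checkpoint_segments = 3 and the last redo pointer in
segment 10, a promotion with replay ending in segment 11 skips the
checkpoint, while one with replay ending in segment 12 still performs it.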

What do you think about this?


This patch needs WalRcvStarted(), which is introduced by another patch of mine.

http://archives.postgresql.org/pgsql-hackers/2012-06/msg00287.php

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

== My e-mail address has been changed since Apr. 1, 2012.
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 0f2678c..48c0cf6 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -6905,9 +6905,41 @@ StartupXLOG(void)
          * allows some extra error checking in xlog_redo.
          */
         if (bgwriterLaunched)
-            RequestCheckpoint(CHECKPOINT_END_OF_RECOVERY |
-                              CHECKPOINT_IMMEDIATE |
-                              CHECKPOINT_WAIT);
+        {
+            bool do_checkpoint = true;
+
+            if (WalRcvStarted())
+            {
+                /*
+                 * The shutdown checkpoint on promotion delays failover
+                 * completion. In spite of the rule for TLI and shutdown
+                 * checkpoints mentioned above, we want to skip this
+                 * checkpoint, provided that recoverability is still
+                 * ensured by crash recovery after this point.
+                 */
+                uint32 replayEndId = 0;
+                uint32 replayEndSeg = 0;
+                XLogRecPtr replayEndRecPtr;
+                /* use volatile pointer to prevent code rearrangement */
+                volatile XLogCtlData *xlogctl = XLogCtl;
+
+                SpinLockAcquire(&xlogctl->info_lck);
+                replayEndRecPtr = xlogctl->replayEndRecPtr;
+                SpinLockRelease(&xlogctl->info_lck);
+                XLByteToSeg(replayEndRecPtr, replayEndId, replayEndSeg);
+                if (!XLogCheckpointNeeded(replayEndId, replayEndSeg))
+                {
+                    do_checkpoint = false;
+                    ereport(LOG,
+                            (errmsg("checkpoint at end of recovery was skipped")));
+                }
+            }
+            
+            if (do_checkpoint)
+                RequestCheckpoint(CHECKPOINT_END_OF_RECOVERY |
+                                  CHECKPOINT_IMMEDIATE |
+                                  CHECKPOINT_WAIT);
+        }
         else
             CreateCheckPoint(CHECKPOINT_END_OF_RECOVERY | CHECKPOINT_IMMEDIATE);
