Re: Add Information during standby recovery conflicts - Mailing list pgsql-hackers

From Kyotaro Horiguchi
Subject Re: Add Information during standby recovery conflicts
Date
Msg-id 20201215.120435.225938652181905969.horikyota.ntt@gmail.com
Whole thread Raw
In response to Re: Add Information during standby recovery conflicts  (Fujii Masao <masao.fujii@oss.nttdata.com>)
Responses Re: Add Information during standby recovery conflicts
List pgsql-hackers
At Tue, 15 Dec 2020 02:00:21 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in 
> Thanks for the review! I'm thinking to wait half a day before
> commiting
> the patch just in the case someone may object the patch.

Sorry for coming late.  I have looked only the latest thread so I
should be missing many things so please ignore if it was settled in
the past discussion.


It emits messages like the follows;

[40509:startup] LOG:  recovery still waiting after 1021.431 ms: recovery conflict on lock
[40509:startup] DETAIL:  Conflicting processes: 41171, 41194.
[40509:startup] CONTEXT:  WAL redo at 0/3013158 for Standby/LOCK: xid 510 db 13609 rel 16384 

IFAIK DETAIL usually shows ordinary sentences so the first word is
capitalized and ends with a period. But it is not a sentence so
following period looks odd.  a searcheing the tree for errdetails
showed some anomalies.

src/backend/parser/parse_param.c                      errdetail("%s versus %s",
src/backend/jit/llvm/llvmjit_error.cpp              errdetail("while in LLVM")));
src/backend/replication/logical/tablesync.c                  errdetail("The error was: %s", res->err)));
src/backend/tcop/postgres.c                 errdetail("prepare: %s", pstmt->plansource->query_string);
src/backend/tcop/postgres.c         errdetail("abort reason: recovery conflict");

and one similar occurance:

src/backend/utils/adt/dbsize.c                      errdetail("Invalid size unit: \"%s\".", strptr),

I'm not sure, but it seems to me at least the period is unnecessary
here.


+            bool        maybe_log_conflict =
+            (standbyWaitStart != 0 && !logged_recovery_conflict);

odd indentation.


+        /* Also, set the timer if necessary */
+        if (logging_timer)
+        {
+            timeouts[cnt].id = STANDBY_LOCK_TIMEOUT;
+            timeouts[cnt].type = TMPARAM_AFTER;
+            timeouts[cnt].delay_ms = DeadlockTimeout;
+            cnt++;
+        }

This doesn't consider spurious wakeup. I'm not sure it actually
happenes but we usually consider that.  That is since before this
patch, but ProcWaitForSignal()'s comment says that:

>  * As this uses the generic process latch the caller has to be robust against
>  * unrelated wakeups: Always check that the desired state has occurred, and
>  * wait again if not.

If we don't care of spurious wakeups, we should add such a comment.


+                bool        maybe_log_conflict;
+                bool        maybe_update_title;

Although it should be a matter of taste and I understand that the
"maybe" means that "that logging and changing of ps display may not
happen in this iteration" , that variables seem expressing
respectively "we should write log if the timeout for recovery conflict
expires" and "we should update title if 500ms elapsed". So they seem
*to me* better be just "log_conflict" and "update_title".

I feel the same with  "maybe_log_conflict" in ProcSleep().


+        for recovery conflicts.  This is useful in determining if recovery
+        conflicts prevent the recovery from applying WAL.

(I'm not confident on this) Isn't the sentence better be in past or
present continuous tense?


> BTW, attached is the POC patch that implements the idea discussed
> upthread;
> if log_recovery_conflict_waits is enabled, the startup process reports
> the log also after the recovery conflict was resolved and the startup
> process
> finished waiting for it. This patch needs to be applied after
> v11-0002-Log-the-standby-recovery-conflict-waits.patch is applied.

Ah. I was just writing a comment about that. I haven't looked it
closer but it looks good to me.  By the way doesn't it contains a
simple fix of a comment for the base patch?

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



pgsql-hackers by date:

Previous
From: Zhihong Yu
Date:
Subject: Re: postgres_fdw - cached connection leaks if the associated user mapping/foreign server is dropped
Next
From: Bharath Rupireddy
Date:
Subject: Re: postgres_fdw - cached connection leaks if the associated user mapping/foreign server is dropped