Re: Add Information during standby recovery conflicts - Mailing list pgsql-hackers

From Fujii Masao
Subject Re: Add Information during standby recovery conflicts
Date
Msg-id 0c8a89ae-27bf-2c31-e74a-cf04033d7daf@oss.nttdata.com
Whole thread Raw
In response to Re: Add Information during standby recovery conflicts  ("Drouvot, Bertrand" <bdrouvot@amazon.com>)
Responses Re: Add Information during standby recovery conflicts  ("Drouvot, Bertrand" <bdrouvot@amazon.com>)
List pgsql-hackers

On 2020/11/17 17:23, Drouvot, Bertrand wrote:
> 
> On 11/17/20 2:09 AM, Masahiko Sawada wrote:
>> CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you
canconfirm the sender and know the content is safe.
 
>>
>>
>>
>> On Mon, Nov 16, 2020 at 4:55 PM Drouvot, Bertrand <bdrouvot@amazon.com> wrote:
>>> Hi,
>>>
>>> On 11/16/20 6:44 AM, Masahiko Sawada wrote:
>>>> Thank you for updating the patch.
>>>>
>>>> Here are review comments.
>>>>
>>>> +           if (report_waiting && (!logged_recovery_conflict ||
>>>> new_status == NULL))
>>>> +               ts = GetCurrentTimestamp();
>>>>
>>>> The condition will always be true if log_recovery_conflict_wait is
>>>> false and report_waiting is true, leading to unnecessary calling of
>>>> GetCurrentTimestamp().
>>>>
>>>> ---
>>>> +   <para>
>>>> +    You can control whether a log message is produced when the startup process
>>>> +    is waiting longer than <varname>deadlock_timeout</varname> for recovery
>>>> +    conflicts. This is controled by the <xref
>>>> linkend="guc-log-recovery-conflict-waits"/>
>>>> +    parameter.
>>>> +   </para>
>>>>
>>>> s/controled/controlled/
>>>>
>>>> ---
>>>>       if (report_waiting)
>>>>           waitStart = GetCurrentTimestamp();
>>>>
>>>> Similarly, we have the above code but we don't need to call
>>>> GetCurrentTimestamp() if update_process_title is false, even if
>>>> report_waiting is true.
>>>>
>>>> I've attached the patch that fixes the above comments. It can be
>>>> applied on top of your v8 patch.
>>> Thanks for the review and the associated fixes!
>>>
>>> I've attached a new version that contains your fixes.
>>>
>> Thank you for updating the patch.
>>
>> I have other comments:
>>
>> +   <para>
>> +    You can control whether a log message is produced when the startup process
>> +    is waiting longer than <varname>deadlock_timeout</varname> for recovery
>> +    conflicts. This is controlled by the
>> +    <xref linkend="guc-log-recovery-conflict-waits"/> parameter.
>> +   </para>
>>
>> It would be better to use 'WAL replay' instead of 'the startup
>> process' for consistency with circumjacent descriptions. What do you
>> think?
> 
> Agree that the wording is more appropriate.
> 
>>
>> ---
>> @@ -1260,6 +1262,8 @@ ProcSleep(LOCALLOCK *locallock, LockMethod
>> lockMethodTable)
>>    else
>>    enable_timeout_after(DEADLOCK_TIMEOUT, DeadlockTimeout);
>>    }
>> + else
>> +   standbyWaitStart = GetCurrentTimestamp();
>>
>> I think we can add a check of log_recovery_conflict_waits to avoid
>> unnecessary calling of GetCurrentTimestamp().
>>
>> I've attached the updated version patch including the above comments
>> as well as adding some assertions. Please review it.
>>
> That looks all good to me.
> 
> Thanks a lot for your precious help!

Thanks for updating the patch! Here are review comments.

+        Controls whether a log message is produced when the startup process
+        is waiting longer than <varname>deadlock_timeout</varname>
+        for recovery conflicts.

But a log message can be produced also when the backend is waiting
for recovery conflict. Right? If yes, this description needs to be corrected.


+        for recovery conflicts.  This is useful in determining if recovery
+        conflicts prevents the recovery from applying WAL.

"prevents" should be "prevent"?


+    TimestampDifference(waitStart, GetCurrentTimestamp(), &secs, &usecs);
+    msecs = secs * 1000 + usecs / 1000;

GetCurrentTimestamp() is basically called before LogRecoveryConflict()
is called. So isn't it better to avoid calling GetCurrentTimestamp() newly in
LogRecoveryConflict() and to reuse the timestamp that we got?
It's helpful to avoid the waste of cycles.


+        while (VirtualTransactionIdIsValid(*vxids))
+        {
+            PGPROC *proc = BackendIdGetProc(vxids->backendId);

BackendIdGetProc() can return NULL if the backend is not active
at that moment. This case should be handled.


+        case PROCSIG_RECOVERY_CONFLICT_BUFFERPIN:
+            reasonDesc = gettext_noop("recovery is still waiting recovery conflict on buffer pin");

It's natural to use "waiting for recovery" rather than "waiting recovery"?


+        /* Also, set deadlock timeout for logging purpose if necessary */
+        if (log_recovery_conflict_waits)
+        {
+            timeouts[cnt].id = STANDBY_TIMEOUT;
+            timeouts[cnt].type = TMPARAM_AFTER;
+            timeouts[cnt].delay_ms = DeadlockTimeout;
+            cnt++;
+        }

This needs to be executed only when the message has not been logged yet.
Right?

Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION



pgsql-hackers by date:

Previous
From: 方徳輝
Date:
Subject: Is postgres ready for 2038?
Next
From: Heikki Linnakangas
Date:
Subject: Re: Protect syscache from bloating with negative cache entries