Re: Add Information during standby recovery conflicts - Mailing list pgsql-hackers

From Masahiko Sawada
Subject Re: Add Information during standby recovery conflicts
Date
Msg-id CAD21AoDM-KW8aKDUBjF6H7W8_rddMVP24wMxzDX6LzJ1SNyP5Q@mail.gmail.com
Whole thread Raw
In response to Re: Add Information during standby recovery conflicts  ("Drouvot, Bertrand" <bdrouvot@amazon.com>)
Responses Re: Add Information during standby recovery conflicts  ("Drouvot, Bertrand" <bdrouvot@amazon.com>)
List pgsql-hackers
On Fri, Nov 20, 2020 at 6:18 PM Drouvot, Bertrand <bdrouvot@amazon.com> wrote:
>
> Hi,
>
> On 11/17/20 4:44 PM, Fujii Masao wrote:
> >
> > Thanks for updating the patch! Here are review comments.
> >
> > +        Controls whether a log message is produced when the startup
> > process
> > +        is waiting longer than <varname>deadlock_timeout</varname>
> > +        for recovery conflicts.
> >
> > But a log message can be produced also when the backend is waiting
> > for recovery conflict. Right? If yes, this description needs to be
> > corrected.
>
> Thanks for looking at it!
>
> I don't think so, only the startup process should write those new log
> messages.
>
> What makes you think that would not be the case?
>
> >
> >
> > +        for recovery conflicts.  This is useful in determining if
> > recovery
> > +        conflicts prevents the recovery from applying WAL.
> >
> > "prevents" should be "prevent"?
>
> Indeed: fixed in the new attached patch.
>
> >
> >
> > +       TimestampDifference(waitStart, GetCurrentTimestamp(), &secs,
> > &usecs);
> > +       msecs = secs * 1000 + usecs / 1000;
> >
> > GetCurrentTimestamp() is basically called before LogRecoveryConflict()
> > is called. So isn't it better to avoid calling GetCurrentTimestamp()
> > newly in
> > LogRecoveryConflict() and to reuse the timestamp that we got?
> > It's helpful to avoid the waste of cycles.
> >
> good catch! fixed in the new attached patch.
>
> >
> > +               while (VirtualTransactionIdIsValid(*vxids))
> > +               {
> > +                       PGPROC *proc =
> > BackendIdGetProc(vxids->backendId);
> >
> > BackendIdGetProc() can return NULL if the backend is not active
> > at that moment. This case should be handled.
> >
> handled in the new attached patch.
> >
> > +               case PROCSIG_RECOVERY_CONFLICT_BUFFERPIN:
> > +                       reasonDesc = gettext_noop("recovery is still
> > waiting recovery conflict on buffer pin");
> >
> > It's natural to use "waiting for recovery" rather than "waiting
> > recovery"?
> >
> I would be tempted to say so, the new patch makes use of "waiting for".
> >
> > +               /* Also, set deadlock timeout for logging purpose if
> > necessary */
> > +               if (log_recovery_conflict_waits)
> > +               {
> > +                       timeouts[cnt].id = STANDBY_TIMEOUT;
> > +                       timeouts[cnt].type = TMPARAM_AFTER;
> > +                       timeouts[cnt].delay_ms = DeadlockTimeout;
> > +                       cnt++;
> > +               }
> >
> > This needs to be executed only when the message has not been logged yet.
> > Right?
> >
> good catch: fixed in the new attached patch.
>

Thank you for updating the patch! Here are review comments on the
latest version patch.

+       while (VirtualTransactionIdIsValid(*vxids))
+       {
+           PGPROC *proc = BackendIdGetProc(vxids->backendId);
+
+           if (proc)
+           {
+               if (nprocs == 0)
+                   appendStringInfo(&buf, "%d", proc->pid);
+               else
+                   appendStringInfo(&buf, ", %d", proc->pid);
+
+               nprocs++;
+               vxids++;
+           }
+       }

We need to increment vxids even if *proc is null. Otherwise, the loop won't end.

---
+               TimestampTz cur_ts = GetCurrentTimestamp();;

There is an extra semi-colon.

---
 int            max_standby_streaming_delay = 30 * 1000;
+bool       log_recovery_conflict_waits = false;
+bool       logged_lock_conflict = false;


+       if (log_recovery_conflict_waits && !logged_lock_conflict)
+       {
+           timeouts[cnt].id = STANDBY_TIMEOUT;
+           timeouts[cnt].type = TMPARAM_AFTER;
+           timeouts[cnt].delay_ms = DeadlockTimeout;
+           cnt++;
+       }

Can we pass a bool indicating if a timeout may be needed for recovery
conflict logging from ProcSleep() to ResolveRecoveryConflictWithLock()
instead of using a static variable?

Regards,

-- 
Masahiko Sawada
EnterpriseDB:  https://www.enterprisedb.com/



pgsql-hackers by date:

Previous
From: 曾文旌
Date:
Subject: Re: [Proposal] Global temporary tables
Next
From: Daniel Gustafsson
Date:
Subject: Re: Online checksums patch - once again