Re: Add Information during standby recovery conflicts - Mailing list pgsql-hackers
From | Masahiko Sawada |
---|---|
Subject | Re: Add Information during standby recovery conflicts |
Date | |
Msg-id | CAD21AoC2K8=Du2hLhWGXwq31CH6K7pWxDYAiz4Tmzs5yyfMohA@mail.gmail.com Whole thread Raw |
In response to | Re: Add Information during standby recovery conflicts ("Drouvot, Bertrand" <bdrouvot@amazon.com>) |
Responses |
Re: Add Information during standby recovery conflicts
|
List | pgsql-hackers |
On Fri, Oct 30, 2020 at 6:02 PM Drouvot, Bertrand <bdrouvot@amazon.com> wrote: > > Hi, > > On 10/30/20 4:25 AM, Fujii Masao wrote: > > CAUTION: This email originated from outside of the organization. Do > > not click links or open attachments unless you can confirm the sender > > and know the content is safe. > > > > > > > > On 2020/10/30 10:29, Masahiko Sawada wrote: > >> , > >> > >> On Thu, 29 Oct 2020 at 00:16, Fujii Masao > >> <masao.fujii@oss.nttdata.com> wrote: > >>> > >>> I read v8 patch. Here are review comments. > >> > >> Thank you for your review. > >> Thank you for updating the patch! Looking at the latest version patch (v8-0002-Log-the-standby-recovery-conflict-waits.patch), I think it doesn't address some comments from Fujii-san. > >>> > >>> When recovery conflict with buffer pin happens, log message is output > >>> every deadlock_timeout. Is this intentional behavior? If yes, IMO > >>> that's > >>> not good because lots of messages can be output. > >> > >> Agreed. I think the latest patch doesn't fix the above comment. Log message for recovery conflict on buffer pin is logged every deadlock_timeout. > >> > >> if we were to log the recovery conflict only once in bufferpin > >> conflict case, we would log it multiple times only in lock conflict > >> case. So I guess it's better to do that in all conflict cases. > > > > Yes, I agree that this behavior basically should be consistent between > > all cases. The latest patch seems not to address this comment as well. > > > >> > >>> > >>> + if (log_recovery_conflict_waits) > >>> + waitStart = GetCurrentTimestamp(); > >>> > >>> What happens if log_recovery_conflict_waits is off at the beginning and > >>> then it's changed during waiting for the conflict? In this case, > >>> waitStart is > >>> zero, but log_recovery_conflict_waits is on. This may cause some > >>> problems? > >> > >> Hmm, I didn't see a path that happens to reload the config file during > >> waiting for buffer cleanup lock. Even if the startup process receives > >> SIGHUP during that, it calls HandleStartupProcInterrupts() at the next > >> convenient time. It could be the beginning of main apply loop or > >> inside WaitForWALToBecomeAvailable() and so on but I didn’t see it in > >> the wait loop for taking a buffer cleanup. > > > > Yes, you're right. I seem to have read the code wrongly. > > > >> However, I think it’s > >> better to use (waitStart > 0) for safety when checking if we log the > >> recovery conflict instead of log_recovery_conflict_waits. > >> > >>> > >>> + if (report_waiting) > >>> + ts = GetCurrentTimestamp(); > >>> > >>> GetCurrentTimestamp() doesn't need to be called every cycle > >>> in the loop after "logged" is true and "new_status" is not NULL. > >> > >> Agreed > >> > >>> > >>> +extern const char *get_procsignal_reason_desc(ProcSignalReason > >>> reason); > >>> > >>> Is this garbage? > >> > >> Yes. > >> > >>> > >>> When log_lock_waits is enabled, both "still waiting for ..." and > >>> "acquired ..." > >>> messages are output. Like this, when log_recovery_conflict_waits is > >>> enabled, > >>> not only "still waiting ..." but also "resolved ..." message should > >>> be output? > >>> The latter message might not need to be output if the conflict is > >>> canceled > >>> due to max_standby_xxx_delay parameter. The latter message would be > >>> useful to know how long the recovery was waiting for the conflict. > >>> Thought? > >>> It's ok to implement this as a separate patch later, though. > >> > >> There was a discussion that the latter message without waiting time is > >> not necessarily needed because the canceled process will log > >> "canceling statement due to conflict with XXX" as you mentioned. I > >> agreed with that. But I agree that the latter message with waiting > >> time would help users, for instance when the startup process is > >> waiting for multiple processes and it takes a time to cancel all of > >> them. > > > > I agree that it's useful to output the wait time. > > > > But as I told in previous email, it's ok to focus on the current patch > > for now and then implement this as a separate patch later. Agreed. Regards, -- Masahiko Sawada EnterpriseDB: https://www.enterprisedb.com/
pgsql-hackers by date: