Re: Some problems of recovery conflict wait events - Mailing list pgsql-hackers

From Fujii Masao
Subject Re: Some problems of recovery conflict wait events
Date
Msg-id daa8896c-abba-95ad-65e5-269b52e619ed@oss.nttdata.com
Whole thread Raw
In response to Re: Some problems of recovery conflict wait events  (Masahiko Sawada <masahiko.sawada@2ndquadrant.com>)
Responses Re: Some problems of recovery conflict wait events  (Fujii Masao <masao.fujii@oss.nttdata.com>)
List pgsql-hackers

On 2020/04/02 15:54, Masahiko Sawada wrote:
> On Thu, 2 Apr 2020 at 15:34, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
>>
>>
>>
>> On 2020/04/02 14:25, Masahiko Sawada wrote:
>>> On Wed, 1 Apr 2020 at 22:32, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
>>>>
>>>>
>>>>
>>>> On 2020/03/30 20:10, Masahiko Sawada wrote:
>>>>> On Fri, 27 Mar 2020 at 17:54, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 2020/03/04 14:31, Masahiko Sawada wrote:
>>>>>>> On Wed, 4 Mar 2020 at 13:48, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On 2020/03/04 13:27, Michael Paquier wrote:
>>>>>>>>> On Wed, Mar 04, 2020 at 01:13:19PM +0900, Masahiko Sawada wrote:
>>>>>>>>>> Yeah, so 0001 patch sets existing wait events to recovery conflict
>>>>>>>>>> resolution. For instance, it sets (PG_WAIT_LOCK | LOCKTAG_TRANSACTION)
>>>>>>>>>> to the recovery conflict on a snapshot. 0003 patch improves these wait
>>>>>>>>>> events by adding the new type of wait event such as
>>>>>>>>>> WAIT_EVENT_RECOVERY_CONFLICT_SNAPSHOT. Therefore 0001 (and 0002) patch
>>>>>>>>>> is the fix for existing versions and 0003 patch is an improvement for
>>>>>>>>>> only PG13. Did you mean even 0001 patch doesn't fit for back-patching?
>>>>>>>>
>>>>>>>> Yes, it looks like a improvement rather than bug fix.
>>>>>>>>
>>>>>>>
>>>>>>> Okay, understand.
>>>>>>>
>>>>>>>>> I got my eyes on this patch set.  The full patch set is in my opinion
>>>>>>>>> just a set of improvements, and not bug fixes, so I would refrain from
>>>>>>>>> back-backpatching.
>>>>>>>>
>>>>>>>> I think that the issue (i.e., "waiting" is reported twice needlessly
>>>>>>>> in PS display) that 0002 patch tries to fix is a bug. So it should be
>>>>>>>> fixed even in the back branches.
>>>>>>>
>>>>>>> So we need only two patches: one fixes process title issue and another
>>>>>>> improve wait event. I've attached updated patches.
>>>>>>
>>>>>> I started reading v2-0002-Improve-wait-events-for-recovery-conflict-resolut.patch.
>>>>>>
>>>>>> -       ProcWaitForSignal(PG_WAIT_BUFFER_PIN);
>>>>>> +       ProcWaitForSignal(WAIT_EVENT_RECOVERY_CONFLICT_BUFFER_PIN);
>>>>>>
>>>>>> Currently the wait event indicating the wait for buffer pin has already
>>>>>> been reported. But the above change in the patch changes the name of
>>>>>> wait event for buffer pin only in the startup process. Is this really useful?
>>>>>> Isn't the existing wait event for buffer pin enough?
>>>>>>
>>>>>> -       /* Wait to be signaled by the release of the Relation Lock */
>>>>>> -       ProcWaitForSignal(PG_WAIT_LOCK | locktag.locktag_type);
>>>>>> +               /* Wait to be signaled by the release of the Relation Lock */
>>>>>> +               ProcWaitForSignal(WAIT_EVENT_RECOVERY_CONFLICT_LOCK);
>>>>>>
>>>>>> Same as above. Isn't the existing wait event enough?
>>>>>
>>>>> Yeah, we can use the existing wait events for buffer pin and lock.
>>>>>
>>>>>>
>>>>>> -       /*
>>>>>> -        * Progressively increase the sleep times, but not to more than 1s, since
>>>>>> -        * pg_usleep isn't interruptible on some platforms.
>>>>>> -        */
>>>>>> -       standbyWait_us *= 2;
>>>>>> -       if (standbyWait_us > 1000000)
>>>>>> -               standbyWait_us = 1000000;
>>>>>> +       WaitLatch(MyLatch,
>>>>>> +                         WL_LATCH_SET | WL_POSTMASTER_DEATH | WL_TIMEOUT,
>>>>>> +                         STANDBY_WAIT_MS,
>>>>>> +                         wait_event_info);
>>>>>> +       ResetLatch(MyLatch);
>>>>>>
>>>>>> ResetLatch() should be called before WaitLatch()?
>>>>>
>>>>> Fixed.
>>>>>
>>>>>>
>>>>>> Could you tell me why you dropped the "increase-sleep-times" logic?
>>>>>
>>>>> I thought we can remove it because WaitLatch is interruptible but my
>>>>> observation was not correct. The waiting startup process is not
>>>>> necessarily woken up by signal. I think it's still better to not wait
>>>>> more than 1 sec even if it's an interruptible wait.
>>>>
>>>> So we don't need to use WaitLatch() there, i.e., it's ok to keep using
>>>> pg_usleep()?
>>>>
>>>>> Attached patch fixes the above and introduces only two wait events of
>>>>> conflict resolution: snapshot and tablespace.
>>>>
>>>> Many thanks for updating the patch!
>>>>
>>>> -       /* Wait to be signaled by the release of the Relation Lock */
>>>> -       ProcWaitForSignal(PG_WAIT_LOCK | locktag.locktag_type);
>>>> +               /* Wait to be signaled by the release of the Relation Lock */
>>>> +               ProcWaitForSignal(PG_WAIT_LOCK | locktag.locktag_type);
>>>> +       }
>>>>
>>>> Is this change really valid? What happens if the latch is set during
>>>> ResolveRecoveryConflictWithVirtualXIDs()?
>>>> ResolveRecoveryConflictWithVirtualXIDs() can return after the latch
>>>> is set but before WaitLatch() in WaitExceedsMaxStandbyDelay() is reached.
>>>
>>> Thank you for reviewing the patch!
>>>
>>> You're right. It's better to keep using pg_usleep() and set the wait
>>> event by pgstat_report_wait_start().
>>>
>>>>
>>>> +               default:
>>>> +                       event_name = "unknown wait event";
>>>> +                       break;
>>>>
>>>> Seems this default case should be removed. Please see other
>>>> pgstat_get_wait_xxx() function, so there is no such code.
>>>>
>>>>> I also removed the wait
>>>>> event of conflict resolution of database since it's unlikely to become
>>>>> a user-visible and a long sleep as we discussed before.
>>>>
>>>> Is it worth defining new wait event type RecoveryConflict only for
>>>> those two events? ISTM that it's ok to use IPC event type here.
>>>>
>>>
>>> I dropped a new wait even type and added them to IPC wait event type.
>>>
>>> I've attached the new version patch.
>>
>> Thanks for updating the patch! The patch looks good to me except
>> the following mior things.
>>
>> +        <row>
>> +         <entry><literal>RecoveryConflictSnapshot</literal></entry>
>> +         <entry>Waiting for recovery conflict resolution on a physical cleanup.</entry>
>> +        </row>
>> +        <row>
>> +         <entry><literal>RecoveryConflictTablespace</literal></entry>
>> +         <entry>Waiting for recovery conflict resolution on dropping tablespace.</entry>
>> +        </row>
>>
>> You need to increment the value of "morerows" in
>> "<entry morerows="38"><literal>IPC</literal></entry>".
>>
>> The descriptions of those two events should be placed in alphabetical order
>> for event name. That is, they should be placed above RecoveryPause.
>>
>> "vacuum cleanup" is better than "physical cleanup"?
> 
> Agreed.
> 
> I've attached the updated version patch.

Thanks! Looks good to me. Barring any objection, I will commit this patch.

Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION



pgsql-hackers by date:

Previous
From: Alvaro Herrera
Date:
Subject: Re: pgbench - add pseudo-random permutation function
Next
From: Michael Paquier
Date:
Subject: Re: color by default