Re: RecoveryWalAll and RecoveryWalStream wait events - Mailing list pgsql-hackers

From Fujii Masao
Subject Re: RecoveryWalAll and RecoveryWalStream wait events
Date
Msg-id 2ea0c6f1-adcd-f2a2-fe5a-92376ad38432@oss.nttdata.com
Whole thread Raw
In response to Re: RecoveryWalAll and RecoveryWalStream wait events  (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
Responses Re: RecoveryWalAll and RecoveryWalStream wait events
List pgsql-hackers

On 2020/02/18 14:20, Kyotaro Horiguchi wrote:
> Hello.
> 
> At Tue, 18 Feb 2020 12:25:51 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in
>> Hi,
>>
>> RecoveryWalAll and RecoveryWalStream wait events are documented as
>> follows.
>>
>>      RecoveryWalAll
>>      Waiting for WAL from any kind of source (local, archive or stream) at
>>      recovery.
>>
>>      RecoveryWalStream
>>      Waiting for WAL from a stream at recovery.
>>
>> But as far as I read the code, RecoveryWalAll is reported only when
>> waiting
>> for WAL from a stream. So the current description looks
>> incorrect. What's
>> described now for RecoveryWalStream seems rather fit to
>> RecoveryWalAll.
>> I'd like to change the description of RecoveryWalAll to "Waiting for
>> WAL
>>   from a stream at recovery".
> 
> Good catch!
> 
>> Regarding RecoveryWalStream, as far as I read the code, while this
>> event is
>> being reported, the startup process is waiting for next trial to
>> retrieve
>> WAL data when WAL data is not available from any sources, based on
>> wal_retrieve_retry_interval. So this current description looks also
>> incorrect. I'd like to change it to "Waiting when WAL data is not
>> available
>>   from any kind of sources (local, archive or stream) before trying
>>   again
>>   to retrieve WAL data".
>>
>> Thought?
> 
> I agree that the corrected description sound correct in meaning. The
> latter seems a bit lengthy, though.

Yeah, so better idea?

Anyway, attached is the patch (fix_recovery_wait_event_doc_v1.patch)
that fixes the descriptions of those wait events. This should be
back-patched to v9.5.

>> Also the current names of these wait events sound confusing. I think
>> that RecoveryWalAll should be changed to RecoveryWalStream.
>> RecoveryWalStream should be RecoveryRetrieveRetryInterval or
>> something.
> 
> I agree to the former, I think RecoveryWalInterval works well enough.

RecoveryWalInterval sounds confusing to me...

Attached is the patch (improve_recovery_wait_event_for_master_v1.patch) that
changes the names and types of wait events. This patch uses
RecoveryRetrieveRetryInterval, but if there is better name,
I will adopt that.

Note that this patch needs to be applied after
fix_recovery_wait_event_doc_v1.patch is applied.

>> Another problem is that the current wait event types of them also look
>> strange. Currently the type of them is Activity, but IMO it's better
>> to
>> use IPC for RecoveryWalAll because it's waiting for walreceiver to
>> receive new WAL. Also it's better to use Timeout for RecoveryWalStream
>> because it's waiting depending on wal_retrieve_retry_interval.
> 
> Do you mean condition variable by the "IPC"? But the WaitLatch waits
> not only for new WAL but also for trigger, SIGHUP, shutdown and
> walreceiver events other than new WAL. I'm not sure that condition
> variable fits for the purpose.

OK, I didn't change the type of RecoveryWalStream to IPC, in the patch.
Its type is still Activity.

Regards,

-- 
Fujii Masao
NTT DATA CORPORATION
Advanced Platform Technology Group
Research and Development Headquarters

Attachment

pgsql-hackers by date:

Previous
From: Fujii Masao
Date:
Subject: Re: small improvement of the elapsed time for truncating heap invacuum
Next
From: Fujii Masao
Date:
Subject: Re: pg_stat_progress_basebackup - progress reporting forpg_basebackup, in the server side