On 2020/02/18 14:20, Kyotaro Horiguchi wrote:
> Hello.
>
> At Tue, 18 Feb 2020 12:25:51 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in
>> Hi,
>>
>> RecoveryWalAll and RecoveryWalStream wait events are documented as
>> follows.
>>
>> RecoveryWalAll
>> Waiting for WAL from any kind of source (local, archive or stream) at
>> recovery.
>>
>> RecoveryWalStream
>> Waiting for WAL from a stream at recovery.
>>
>> But as far as I read the code, RecoveryWalAll is reported only when
>> waiting
>> for WAL from a stream. So the current description looks
>> incorrect. What's
>> described now for RecoveryWalStream seems rather fit to
>> RecoveryWalAll.
>> I'd like to change the description of RecoveryWalAll to "Waiting for
>> WAL
>> from a stream at recovery".
>
> Good catch!
>
>> Regarding RecoveryWalStream, as far as I read the code, while this
>> event is
>> being reported, the startup process is waiting for next trial to
>> retrieve
>> WAL data when WAL data is not available from any sources, based on
>> wal_retrieve_retry_interval. So this current description looks also
>> incorrect. I'd like to change it to "Waiting when WAL data is not
>> available
>> from any kind of sources (local, archive or stream) before trying
>> again
>> to retrieve WAL data".
>>
>> Thought?
>
> I agree that the corrected description sound correct in meaning. The
> latter seems a bit lengthy, though.
Yeah, so better idea?
Anyway, attached is the patch (fix_recovery_wait_event_doc_v1.patch)
that fixes the descriptions of those wait events. This should be
back-patched to v9.5.
>> Also the current names of these wait events sound confusing. I think
>> that RecoveryWalAll should be changed to RecoveryWalStream.
>> RecoveryWalStream should be RecoveryRetrieveRetryInterval or
>> something.
>
> I agree to the former, I think RecoveryWalInterval works well enough.
RecoveryWalInterval sounds confusing to me...
Attached is the patch (improve_recovery_wait_event_for_master_v1.patch) that
changes the names and types of wait events. This patch uses
RecoveryRetrieveRetryInterval, but if there is better name,
I will adopt that.
Note that this patch needs to be applied after
fix_recovery_wait_event_doc_v1.patch is applied.
>> Another problem is that the current wait event types of them also look
>> strange. Currently the type of them is Activity, but IMO it's better
>> to
>> use IPC for RecoveryWalAll because it's waiting for walreceiver to
>> receive new WAL. Also it's better to use Timeout for RecoveryWalStream
>> because it's waiting depending on wal_retrieve_retry_interval.
>
> Do you mean condition variable by the "IPC"? But the WaitLatch waits
> not only for new WAL but also for trigger, SIGHUP, shutdown and
> walreceiver events other than new WAL. I'm not sure that condition
> variable fits for the purpose.
OK, I didn't change the type of RecoveryWalStream to IPC, in the patch.
Its type is still Activity.
Regards,
--
Fujii Masao
NTT DATA CORPORATION
Advanced Platform Technology Group
Research and Development Headquarters