Re: wal stats questions - Mailing list pgsql-hackers

From Masahiro Ikeda
Subject Re: wal stats questions
Date
Msg-id cc0481fc-e6c2-1d25-17d6-1492ac4cfe08@oss.nttdata.com
Whole thread Raw
In response to Re: wal stats questions  (Fujii Masao <masao.fujii@oss.nttdata.com>)
Responses Re: wal stats questions
List pgsql-hackers

On 2021/05/17 22:34, Fujii Masao wrote:
> 
> 
> On 2021/05/17 16:39, Masahiro Ikeda wrote:
>>
>> Thanks for your comments!
>>
>>>> +     * is executed, wal records aren't nomally generated (although HOT makes
>>>
>>> nomally -> normally?
>>
>> Yes, fixed.
>>
>>>> +     * It's not enough to check the number of generated wal records, for
>>>> +     * example the walwriter may write/sync the WAL although it doesn't
>>>
>>> You use both 'wal' and 'WAL' in the comments, but are there any intension?
>>
>> No, I changed to use 'WAL' only. Although some other comments have 'wal' and
>> 'WAL', it seems that 'WAL' is often used.
> 
> Thanks for updating the patch!

Thanks a lot of comments!

> + * Buffer and generated WAL usage counters.
> + *
> + * The counters are accumulated values. There are infrastructures
> + * to add the values and calculate the difference within a specific period.
> 
> Is it really worth adding these comments here?

BufferUsage has almost same comments. So, I removed it.

> +     * Note: regarding to WAL statistics counters, it's not enough to check
> +     * only whether the WAL record is generated or not. If a read transaction
> +     * is executed, WAL records aren't normally generated (although HOT makes
> +     * WAL records). But, just writes and syncs the WAL data may happen when
> +     * to flush buffers.
> 
> Aren't the following comments better?
> 
> ------------------------------
> To determine whether any WAL activity has occurred since last time, not only
> the number of generated WAL records but also the numbers of WAL writes and
> syncs need to be checked. Because even transaction that generates no WAL
> records can write or sync WAL data when flushing the data pages.
> ------------------------------

Thanks. Yes, I fixed it.

> -     * This function can be called even if nothing at all has happened. In
> -     * this case, avoid sending a completely empty message to the stats
> -     * collector.
> 
> I think that it's better to leave this comment as it is.

OK. I leave it.

> +     * First, to check the WAL stats counters were updated.
> +     *
> +     * Even if the 'force' is true, we don't need to send the stats if the
> +     * counters were not updated.
> +     *
> +     * We can know whether the counters were updated or not to check only
> +     * three counters. So, for performance, we don't allocate another memory
> +     * spaces and check the all stats like pgstat_send_slru().
> 
> Is it really worth adding these comments here?

I removed them because the following comments are enough.

* This function can be called even if nothing at all has happened. In
* this case, avoid sending a completely empty message to the stats
* collector.

> +     * The current 'wal_records' is the same as the previous one means that
> +     * WAL records weren't generated. So, the counters of 'wal_fpi',
> +     * 'wal_bytes', 'm_wal_buffers_full' are not updated neither.
> +     *
> +     * It's not enough to check the number of generated WAL records, for
> +     * example the walwriter may write/sync the WAL although it doesn't
> +     * generate WAL records. 'm_wal_write' and 'm_wal_sync' are zero means the
> +     * counters of time spent are zero too.
> 
> Aren't the following comments better?
> 
> ------------------------
> Check wal_records counter to determine whether any WAL activity has happened
> since last time. Note that other WalUsage counters don't need to be checked
> because they are incremented always together with wal_records counter.
> 
> m_wal_buffers_full also doesn't need to be checked because it's incremented
> only when at least one WAL record is generated (i.e., wal_records counter is
> incremented). But for safely, we assert that m_wal_buffers_full is always zero
> when no WAL record is generated
> 
> This function can be called by a process like walwriter that normally
> generates no WAL records. To determine whether any WAL activity has happened
> at that process since the last time, the numbers of WAL writes and syncs are
> also checked.
> ------------------------

Yes, I modified them.

> + * The accumulated counters for buffer usage.
> + *
> + * The reason the counters are accumulated values is to avoid unexpected
> + * reset because many callers may use them.
> 
> Aren't the following comments better?
> 
> ------------------------
> These counters keep being incremented infinitely, i.e., must never be reset to
> zero, so that we can calculate how much the counters are incremented in an
> arbitrary period.
> ------------------------

Yes, thanks.
I fixed it.

Regards,
-- 
Masahiro Ikeda
NTT DATA CORPORATION

Attachment

pgsql-hackers by date:

Previous
From: Peter Smith
Date:
Subject: Re: "ERROR: deadlock detected" when replicating TRUNCATE
Next
From: Kyotaro Horiguchi
Date:
Subject: Re: pg_get_wal_replay_pause_state() should not return 'paused' while a promotion is ongoing.