Re: New statistics for tuning WAL buffer size - Mailing list pgsql-hackers

From Fujii Masao
Subject Re: New statistics for tuning WAL buffer size
Date
Msg-id b99afbcc-3665-2205-9ffd-79230125eb8c@oss.nttdata.com
Whole thread Raw
In response to Re: New statistics for tuning WAL buffer size  (Masahiro Ikeda <ikedamsh@oss.nttdata.com>)
Responses Re: New statistics for tuning WAL buffer size  (Masahiro Ikeda <ikedamsh@oss.nttdata.com>)
List pgsql-hackers

On 2020/09/07 9:58, Masahiro Ikeda wrote:
> Thanks for the review and advice!
> 
> On 2020-09-03 16:05, Fujii Masao wrote:
>> On 2020/09/02 18:56, Masahiro Ikeda wrote:
>>>> +/* ----------
>>>> + * Backend types
>>>> + * ----------
>>>>
>>>> You seem to forget to add "*/" into the above comment.
>>>> This issue could cause the following compiler warning.
>>>> ../../src/include/pgstat.h:761:1: warning: '/*' within block comment [-Wcomment]
>>>
>>> Thanks for the comment. I fixed.
>>
>> Thanks for the fix! But why are those comments necessary?
> 
> Sorry about that. This comment is not necessary.
> I removed it.
> 
>>> The pg_stat_walwriter is not security restricted now, so ordinary users can access it.
>>> It has the same security level as pg_stat_archiver. If you have any comments, please let me know.
>>
>> +       <structfield>dirty_writes</structfield> <type>bigint</type>
>>
>> I guess that the column name "dirty_writes" derived from
>> the DTrace probe name. Isn't this name confusing? We should
>> rename it to "wal_buffers_full" or something?
> 
> I agree and rename it to "wal_buffers_full".
> 
>> +/* ----------
>> + * PgStat_MsgWalWriter            Sent by the walwriter to update statistics.
>>
>> This comment seems not accurate because backends also send it.
>>
>> +/*
>> + * WAL writes statistics counter is updated in XLogWrite function
>> + */
>> +extern PgStat_MsgWalWriter WalWriterStats;
>>
>> This comment seems not right because the counter is not updated in XLogWrite().
> 
> Right. I fixed it to "Sent by each backend and background workers to update WAL statistics."
> In the future, other statistics will be included so I remove the function's name.
> 
> 
>> +-- There will surely and maximum one record
>> +select count(*) = 1 as ok from pg_stat_walwriter;
>>
>> What about changing this comment to "There must be only one record"?
> 
> Thanks, I fixed.
> 
>> +                    WalWriterStats.m_xlog_dirty_writes++;
>>                      LWLockRelease(WALWriteLock);
>>
>> Since WalWriterStats.m_xlog_dirty_writes doesn't need to be protected
>> with WALWriteLock, isn't it better to increment that after releasing the lock?
> 
> Thanks, I fixed.
> 
>> +CREATE VIEW pg_stat_walwriter AS
>> +    SELECT
>> +        pg_stat_get_xlog_dirty_writes() AS dirty_writes,
>> +        pg_stat_get_walwriter_stat_reset_time() AS stats_reset;
>> +
>>  CREATE VIEW pg_stat_progress_vacuum AS
>>
>> In system_views.sql, the definition of pg_stat_walwriter should be
>> placed just after that of pg_stat_bgwriter not pg_stat_progress_analyze.
> 
> OK, I fixed it.
> 
>>      }
>> -
>>      /*
>>       * We found an existing collector stats file. Read it and put all the
>>
>> You seem to accidentally have removed the empty line here.
> 
> Sorry about that. I fixed it.
> 
>> -                 errhint("Target must be \"archiver\" or \"bgwriter\".")));
>> +                 errhint("Target must be \"archiver\" or \"bgwriter\" or
>> \"walwriter\".")));
>>
>> There are two "or" in the message, but the former should be replaced with ","?
> 
> Thanks, I fixed.
> 
> 
> On 2020-09-05 18:40, Magnus Hagander wrote:
>> On Fri, Sep 4, 2020 at 5:42 AM Fujii Masao
>> <masao.fujii@oss.nttdata.com> wrote:
>>
>>> On 2020/09/04 11:50, tsunakawa.takay@fujitsu.com wrote:
>>>> From: Fujii Masao <masao.fujii@oss.nttdata.com>
>>>>>> I changed the view name from pg_stat_walwrites to
>>> pg_stat_walwriter.
>>>>>> I think it is better to match naming scheme with other views
>>> like
>>>>> pg_stat_bgwriter,
>>>>>> which is for bgwriter statistics but it has the statistics
>>> related to backend.
>>>>>
>>>>> I prefer the view name pg_stat_walwriter for the consistency with
>>>>> other view names. But we also have pg_stat_wal_receiver. Which
>>>>> makes me think that maybe pg_stat_wal_writer is better for
>>>>> the consistency. Thought? IMO either of them works for me.
>>>>> I'd like to hear more opinons about this.
>>>>
>>>> I think pg_stat_bgwriter is now a misnomer, because it contains
>>> the backends' activity.  Likewise, pg_stat_walwriter leads to
>>> misunderstanding because its information is not limited to WAL
>>> writer.
>>>>
>>>> How about simply pg_stat_wal?  In the future, we may want to
>>> include WAL reads in this view, e.g. reading undo logs in zheap.
>>>
>>> Sounds reasonable.
>>
>> +1.
>>
>> pg_stat_bgwriter has had the "wrong name" for quite some time now --
>> it became even more apparent when the checkpointer was split out to
>> it's own process, and that's not exactly a recent change. And it had
>> allocs in it from day one...
>>
>> I think naming it for what the data in it is ("wal") rather than which
>> process deals with it ("walwriter") is correct, unless the statistics
>> can be known to only *ever* affect one type of process. (And then
>> different processes can affect different columns in the view). As a
>> general rule -- and that's from what I can tell exactly what's being
>> proposed.
> 
> Thanks for your comments. I agree with your opinions.
> I changed the view name to "pg_stat_wal".
> 
> 
> I fixed the code to send the WAL statistics from not only backend and walwriter
> but also checkpointer, walsender and autovacuum worker.

Good point! Thanks for updating the patch!


@@ -604,6 +604,7 @@ heap_vacuum_rel(Relation onerel, VacuumParams *params,
                           onerel->rd_rel->relisshared,
                           Max(new_live_tuples, 0),
                           vacrelstats->new_dead_tuples);
+    pgstat_send_wal();

I guess that you changed heap_vacuum_rel() as above so that autovacuum
workers can send WAL stats. But heap_vacuum_rel() can be called by
the processes (e.g., backends) other than autovacuum workers? Also
what happens if autovacuum workers just do ANALYZE only? In that case,
heap_vacuum_rel() may not be called.

Currently autovacuum worker reports the stats at the exit via
pgstat_beshutdown_hook(). Unlike other processes, autovacuum worker
is not the process that basically keeps running during the service. It exits
after it does vacuum or analyze. So ISTM that it's not bad to report the stats
only at the exit, in autovacuum worker case. There is no need to add extra
code for WAL stats report by autovacuum worker. Thought?


@@ -1430,6 +1430,9 @@ WalSndWaitForWal(XLogRecPtr loc)
          else
              RecentFlushPtr = GetXLogReplayRecPtr(NULL);
  
+        /* Send wal statistics */
+        pgstat_send_wal();

AFAIR logical walsender uses three loops in WalSndLoop(), WalSndWriteData()
and WalSndWaitForWal(). But could you tell me why added pgstat_send_wal()
into WalSndWaitForWal()? I'd like to know why WalSndWaitForWal() is the best
for that purpose.

Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION



pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: Online checksums patch - once again
Next
From: Michael Paquier
Date:
Subject: Re: [PATCH] Keeps tracking the uniqueness with UniqueKey