Re: About to add WAL write/fsync statistics to pg_stat_wal view - Mailing list pgsql-hackers

From Fujii Masao
Subject Re: About to add WAL write/fsync statistics to pg_stat_wal view
Date
Msg-id 6d375c1c-57a5-bd0e-f85b-0747b81ffd5d@oss.nttdata.com
Whole thread Raw
In response to Re: About to add WAL write/fsync statistics to pg_stat_wal view  (Fujii Masao <masao.fujii@oss.nttdata.com>)
Responses Re: About to add WAL write/fsync statistics to pg_stat_wal view
List pgsql-hackers

On 2021/02/08 13:01, Fujii Masao wrote:
> 
> 
> On 2021/02/05 8:45, Masahiro Ikeda wrote:
>> I pgindented the patches.
> 
> Thanks for updating the patches!
> 
> +       <function>XLogWrite</function>, which nomally called by an
> +       <function>issue_xlog_fsync</function>, which nomally called by an
> 
> Typo: "nomally" should be "normally"?
> 
> +       <function>XLogFlush</function> request(see <xref linkend="wal-configuration"/>)
> +       <function>XLogFlush</function> request(see <xref linkend="wal-configuration"/>),
> 
> Isn't it better to add a space character just after "request"?
> 
> +                    INSTR_TIME_SET_CURRENT(duration);
> +                    INSTR_TIME_SUBTRACT(duration, start);
> +                    WalStats.m_wal_write_time = INSTR_TIME_GET_MICROSEC(duration);
> 
> If several cycles happen in the do-while loop, m_wal_write_time should be
> updated with the sum of "duration" in those cycles instead of "duration"
> in the last cycle? If yes, "+=" should be used instead of "=" when updating
> m_wal_write_time?
> 
> +            INSTR_TIME_SET_CURRENT(duration);
> +            INSTR_TIME_SUBTRACT(duration, start);
> +            WalStats.m_wal_sync_time = INSTR_TIME_GET_MICROSEC(duration);
> 
> Also "=" should be "+=" in the above?

+        /* Send WAL statistics */
+        pgstat_send_wal();

This may cause overhead in WAL-writing by walwriter because it's called
every cycles even when walwriter needs to write more WAL next cycle
(don't need to sleep on WaitLatch)? If this is right, pgstat_send_wal()
should be called only when WaitLatch() returns with WL_TIMEOUT?

-       <function>XLogFlush</function> request(see <xref linkend="wal-configuration"/>)
+       <function>XLogFlush</function> request(see <xref linkend="wal-configuration"/>),
+       or WAL data written out to disk by WAL receiver.

So regarding walreceiver, only wal_write, wal_write_time, wal_sync, and
wal_sync_time are updated even while the other values are not. Isn't this
confusing to users? If so, what about reporting those walreceiver stats in
pg_stat_wal_receiver?

                  if (endofwal)
+                {
+                    /* Send WAL statistics to the stats collector */
+                    pgstat_send_wal();
                      break;

You added pgstat_send_wal() so that it's called in some cases where
walreceiver exits. But ISTM that there are other walreceiver-exit cases.
For example, in the case where SIGTERM is received. Instead,
pgstat_send_wal() should be called in WalRcvDie() for those all cases?

Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION



pgsql-hackers by date:

Previous
From: Yugo NAGATA
Date:
Subject: Re: Is Recovery actually paused?
Next
From: Amit Kapila
Date:
Subject: Re: pg_replication_origin_drop API potential race condition