Re: About to add WAL write/fsync statistics to pg_stat_wal view - Mailing list pgsql-hackers

From Masahiro Ikeda
Subject Re: About to add WAL write/fsync statistics to pg_stat_wal view
Date
Msg-id 8555ab589d74ec2df61237209f4f21f2@oss.nttdata.com
Whole thread Raw
In response to Re: About to add WAL write/fsync statistics to pg_stat_wal view  (Masahiko Sawada <sawada.mshk@gmail.com>)
Responses Re: About to add WAL write/fsync statistics to pg_stat_wal view  (japin <japinli@hotmail.com>)
Re: About to add WAL write/fsync statistics to pg_stat_wal view  (Masahiko Sawada <sawada.mshk@gmail.com>)
List pgsql-hackers
On 2021-01-22 14:50, Masahiko Sawada wrote:
> On Fri, Dec 25, 2020 at 6:46 PM Masahiro Ikeda 
> <ikedamsh@oss.nttdata.com> wrote:
>> 
>> Hi,
>> 
>> I rebased the patch to the master branch.
> 
> Thank you for working on this. I've read the latest patch. Here are 
> comments:
> 
> ---
> +               if (track_wal_io_timing)
> +               {
> +                   INSTR_TIME_SET_CURRENT(duration);
> +                   INSTR_TIME_SUBTRACT(duration, start);
> +                   WalStats.m_wal_write_time +=
> INSTR_TIME_GET_MILLISEC(duration);
> +               }
> 
> * I think it should add the time in micro sec.
> After running pgbench with track_wal_io_timing = on for 30 sec,
> pg_stat_wal showed the following on my environment:
> 
> postgres(1:61569)=# select * from pg_stat_wal;
> -[ RECORD 1 ]----+-----------------------------
> wal_records      | 285947
> wal_fpi          | 53285
> wal_bytes        | 442008213
> wal_buffers_full | 0
> wal_write        | 25516
> wal_write_time   | 0
> wal_sync         | 25437
> wal_sync_time    | 14490
> stats_reset      | 2021-01-22 10:56:13.29464+09
> 
> Since writes can complete less than a millisecond, wal_write_time
> didn't increase. I think sync_time could also have the same problem.

Thanks for your comments. I didn't notice that.
I changed the unit from milliseconds to microseconds.

> ---
> +   /*
> +    * Measure i/o timing to fsync WAL data.
> +    *
> +    * The wal receiver skip to collect it to avoid performance
> degradation of standy servers.
> +    * If sync_method doesn't have its fsync method, to skip too.
> +    */
> +   if (!AmWalReceiverProcess() && track_wal_io_timing && 
> fsyncMethodCalled())
> +       INSTR_TIME_SET_CURRENT(start);
> 
> * Why does only the wal receiver skip it even if track_wal_io_timinig
> is true? I think the performance degradation is also true for backend
> processes. If there is another reason for that, I think it's better to
> mention in both the doc and comment.
> * How about checking track_wal_io_timing first?
> * s/standy/standby/

I fixed it.
As kuroda-san mentioned too, the skip is no need to be considered.

> ---
> +   /* increment the i/o timing and the number of times to fsync WAL 
> data */
> +   if (fsyncMethodCalled())
> +   {
> +       if (!AmWalReceiverProcess() && track_wal_io_timing)
> +       {
> +           INSTR_TIME_SET_CURRENT(duration);
> +           INSTR_TIME_SUBTRACT(duration, start);
> +           WalStats.m_wal_sync_time += 
> INSTR_TIME_GET_MILLISEC(duration);
> +       }
> +
> +       WalStats.m_wal_sync++;
> +   }
> 
> * I'd avoid always calling fsyncMethodCalled() in this path. How about
> incrementing m_wal_sync after each sync operation?

I think if syncing the disk does not occur, m_wal_sync should not be 
incremented.
It depends enableFsync and sync_method.

enableFsync is checked in each fsync method like 
pg_fsync_no_writethrough(),
so if incrementing m_wal_sync after each sync operation, it should be 
implemented
in each fsync method. It leads to many duplicated codes.

So, why don't you change the function to a flag whether to
sync data to the disk will be occurred or not in issue_xlog_fsync()?


> ---
> +/*
> + * Check if fsync mothod is called.
> + */
> +static bool
> +fsyncMethodCalled()
> +{
> +   if (!enableFsync)
> +       return false;
> +
> +   switch (sync_method)
> +   {
> +       case SYNC_METHOD_FSYNC:
> +       case SYNC_METHOD_FSYNC_WRITETHROUGH:
> +       case SYNC_METHOD_FDATASYNC:
> +           return true;
> +       default:
> +           /* others don't have a specific fsync method */
> +           return false;
> +   }
> +}
> 
> * I'm concerned that the function name could confuse the reader
> because it's called even before the fsync method is called. As I
> commented above, calling to fsyncMethodCalled() can be eliminated.
> That way, this function is called at only once. So do we really need
> this function?

Thanks to your comments, I removed them.


> * As far as I read the code, issue_xlog_fsync() seems to do fsync even
> if enableFsync is false. Why does the function return false in that
> case? I might be missing something.

IIUC, the reason is that I thought that each fsync functions like 
pg_fsync_no_writethrough() check enableFsync.

If this code doesn't check, m_wal_sync_time may be incremented
even though some sync methods like SYNC_METHOD_OPEN don't call to sync 
some data to the disk at the time.

> * void is missing as argument?
> 
> * s/mothod/method/

I removed them.


Regards,
-- 
Masahiro Ikeda
NTT DATA CORPORATION
Attachment

pgsql-hackers by date:

Previous
From: Masahiko Sawada
Date:
Subject: Re: doc review for v14
Next
From: Tomas Vondra
Date:
Subject: Re: PoC/WIP: Extended statistics on expressions