Re: About to add WAL write/fsync statistics to pg_stat_wal view - Mailing list pgsql-hackers

From David G. Johnston
Subject Re: About to add WAL write/fsync statistics to pg_stat_wal view
Date
Msg-id CAKFQuwa7xDBXwk2nr2=+_f+=2YcwUaXC2eU7eCpokipCEBJe7A@mail.gmail.com
In response to Re: About to add WAL write/fsync statistics to pg_stat_wal view  (Masahiko Sawada <sawada.mshk@gmail.com>)
Responses Re: About to add WAL write/fsync statistics to pg_stat_wal view  (Masahiro Ikeda <ikedamsh@oss.nttdata.com>)
List pgsql-hackers
On Mon, Jan 25, 2021 at 8:03 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Mon, Jan 25, 2021 at 4:51 PM Masahiro Ikeda <ikedamsh@oss.nttdata.com> wrote:
>
> Hi, thanks for the reviews.
>
> I updated the attached patch.

Thank you for updating the patch!
 
Your original email's wording, "total number of times", is more correct; removing "of times" and writing just "total number of WAL" is not good wording.

Specifically, this change is strictly worse than the original.

-       Number of times WAL data was written to disk because WAL buffers became full
+       Total number of WAL data written to disk because WAL buffers became full

Both versions share a flaw: they leave implicit exactly what it means to "write WAL to disk".  It is also unclear whether a count of writes, a count of bytes, or both would be more useful here.  I've incorporated this into my documentation suggestions below:

(wal_buffers_full)
-- Revert - the original was better, though maybe add more detail similar to the below.  I didn't research exactly how this works.

(wal_write)
The number of times WAL buffers were written out to disk via XLogWrite

-- Seems like this should have a bytes version too
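
To illustrate why a bytes counter could complement the call counter, here is a toy Python model. It is not PostgreSQL code: `WalWriteStats`, `record_write`, and the `wal_write_bytes` field are hypothetical names invented for this sketch, and the 8192-byte page size is only an example value. Two workloads with the same number of writes can move very different amounts of data:

```python
from dataclasses import dataclass


@dataclass
class WalWriteStats:
    """Toy stand-in for the proposed counters; not PostgreSQL's actual struct."""

    wal_write: int = 0        # number of write calls
    wal_write_bytes: int = 0  # hypothetical bytes counter

    def record_write(self, nbytes: int) -> None:
        # Each call bumps the call counter once and the byte counter by its size.
        self.wal_write += 1
        self.wal_write_bytes += nbytes


small = WalWriteStats()
large = WalWriteStats()
for _ in range(3):
    small.record_write(8192)       # three single-page writes
    large.record_write(16 * 8192)  # three 16-page writes

# Same call count, very different volume.
print(small.wal_write, small.wal_write_bytes)
print(large.wal_write, large.wal_write_bytes)
```

With only `wal_write` exposed, these two workloads would be indistinguishable.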

(wal_write_time)
The amount of time spent writing WAL buffers to disk, excluding sync time unless wal_sync_method is either open_datasync or open_sync.
Units are milliseconds with microsecond resolution.  This is zero when track_wal_io_timing is disabled.
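
As a rough sketch of what "milliseconds with microsecond resolution" and "zero when track_wal_io_timing is disabled" would mean for a reader of the view, consider this toy Python model. It is not how the patch is implemented: `WalWriteTimer` and `add_write` are hypothetical, and accumulating whole microseconds internally is an assumption made only for the example.

```python
class WalWriteTimer:
    """Toy model of the wal_write_time description; not PostgreSQL code."""

    def __init__(self, track_wal_io_timing: bool) -> None:
        self.track_wal_io_timing = track_wal_io_timing
        self._elapsed_us = 0  # assumption: accumulate whole microseconds

    def add_write(self, elapsed_us: int) -> None:
        # Nothing is accumulated unless timing is enabled.
        if self.track_wal_io_timing:
            self._elapsed_us += elapsed_us

    @property
    def wal_write_time(self) -> float:
        # Reported in milliseconds; the fraction carries the microseconds.
        return self._elapsed_us / 1000


enabled = WalWriteTimer(track_wal_io_timing=True)
disabled = WalWriteTimer(track_wal_io_timing=False)
for timer in (enabled, disabled):
    timer.add_write(250)   # 0.250 ms
    timer.add_write(1300)  # 1.300 ms

print(enabled.wal_write_time)   # fractional milliseconds
print(disabled.wal_write_time)  # stays at zero
```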

(wal_sync)
The number of times WAL files were synced to disk while wal_sync_method was set to one of the "sync at commit" options (i.e., fdatasync, fsync, or fsync_writethrough).

-- This will not necessarily be zero just because none of those settings is currently in effect, since one of them could have been in effect at some point since these statistics were last reset.
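
The point about the counter surviving a settings change can be sketched with a toy Python model. This is an illustration of cumulative-counter behavior only, not PostgreSQL code: `WalSyncCounter`, `flush`, and `reset` are hypothetical names, and the set of "sync at commit" methods is taken from the wording above.

```python
SYNC_AT_COMMIT = {"fdatasync", "fsync", "fsync_writethrough"}


class WalSyncCounter:
    """Toy cumulative counter keyed on the current wal_sync_method."""

    def __init__(self, wal_sync_method: str) -> None:
        self.wal_sync_method = wal_sync_method
        self.wal_sync = 0

    def flush(self) -> None:
        # Only the "sync at commit" methods count as a separate sync.
        if self.wal_sync_method in SYNC_AT_COMMIT:
            self.wal_sync += 1

    def reset(self) -> None:
        self.wal_sync = 0


stats = WalSyncCounter("fdatasync")
stats.flush()
stats.flush()

# Switch to a method that syncs as part of the write itself.
stats.wal_sync_method = "open_sync"
stats.flush()
stats.flush()
after_switch = stats.wal_sync  # earlier syncs are still counted

stats.reset()
stats.flush()
after_reset = stats.wal_sync   # zero only after a reset

print(after_switch, after_reset)
```

So documentation that says the value "is zero" for the open_* methods would be wrong; it only stops increasing.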

(wal_sync_time)
The amount of time spent syncing WAL files to disk, in milliseconds with microsecond resolution.  This requires setting wal_sync_method to one of the "sync at commit" options (i.e., fdatasync, fsync, or fsync_writethrough).


Also,

I would suggest extracting the changes to postmaster/pgstat.c and replication/walreceiver.c into a separate patch, as you've fundamentally changed how it behaves with regard to that function and how it interacts with the WAL receiver.  That seems an entirely separate topic warranting its own patch and discussion.

David J.
