Re: Add statistics to pg_stat_wal view for wal related parameter tuning - Mailing list pgsql-hackers

From Masahiro Ikeda
Subject Re: Add statistics to pg_stat_wal view for wal related parameter tuning
Date
Msg-id 06bc2d405b086a72fb14657de9befae8@oss.nttdata.com
Whole thread Raw
In response to Re: Add statistics to pg_stat_wal view for wal related parameter tuning  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: Add statistics to pg_stat_wal view for wal related parameter tuning
Re: Add statistics to pg_stat_wal view for wal related parameter tuning
List pgsql-hackers
On 2020-10-20 12:46, Amit Kapila wrote:
> On Tue, Oct 20, 2020 at 8:01 AM Masahiro Ikeda 
> <ikedamsh@oss.nttdata.com> wrote:
>> 
>> Hi,
>> 
>> I think we need to add some statistics to pg_stat_wal view.
>> 
>> Although there are some parameter related WAL,
>> there are few statistics for tuning them.
>> 
>> I think it's better to provide the following statistics.
>> Please let me know your comments.
>> 
>> ```
>> postgres=# SELECT * from pg_stat_wal;
>> -[ RECORD 1 ]-------+------------------------------
>> wal_records         | 2000224
>> wal_fpi             | 47
>> wal_bytes           | 248216337
>> wal_buffers_full    | 20954
>> wal_init_file       | 8
>> wal_write_backend   | 20960
>> wal_write_walwriter | 46
>> wal_write_time      | 51
>> wal_sync_backend    | 7
>> wal_sync_walwriter  | 8
>> wal_sync_time       | 0
>> stats_reset         | 2020-10-20 11:04:51.307771+09
>> ```
>> 
>> 1. Basic statistics of WAL activity
>> 
>> - wal_records: Total number of WAL records generated
>> - wal_fpi: Total number of WAL full page images generated
>> - wal_bytes: Total amount of WAL bytes generated
>> 
>> To understand DB's performance, first, we will check the performance
>> trends for the entire database instance.
>> For example, if the number of wal_fpi becomes higher, users may tune
>> "wal_compression", "checkpoint_timeout" and so on.
>> 
>> Although users can check the above statistics via EXPLAIN, 
>> auto_explain,
>> autovacuum and pg_stat_statements now,
>> if users want to see the performance trends  for the entire database,
>> they must recalculate the statistics.
>> 
> 
> Here, do you mean to say 'entire cluster' instead of 'entire database'
> because it seems these stats are getting collected for the entire
> cluster?

Thanks for your comments.
Yes, I wanted to say 'entire cluster'.

>> I think it is useful to add the sum of the basic statistics.
>> 
> 
> There is an argument that it is better to view these stats at the
> statement-level so that one can know which statements are causing most
> WAL and then try to rate-limit them if required in the application and
> anyway they can get the aggregate of all the WAL if they want. We have
> added these stats in PG-13, so do we have any evidence that the
> already added stats don't provide enough information? I understand
> that you are trying to display the accumulated stats here which if
> required users/DBA need to compute with the currently provided stats.
> OTOH, sometimes adding more ways to do some things causes difficulty
> for users to understand and learn.

I agreed that the statement-level stat is important and I understood 
that we can
know the aggregated WAL stats of pg_stat_statement view and autovacuum's 
log.
But now, WAL stats generated by autovacuum can be output to logs and it 
is not
easy to aggregate them. Since WAL writes impacts for the entire cluster, 
I thought
it's natural to provide accumulated value.

> I see that we also need to add extra code to capture these stats (some
> of which is in performance-critical path especially in
> XLogInsertRecord) which again makes me a bit uncomfortable. It might
> be that it is all fine as it is very important to collect these stats
> at cluster-level in spite that the same information can be gathered at
> statement-level to help customers but I don't see a very strong case
> for that in your proposal.

Also about performance, I thought there are few impacts because it
increments stats in memory. If I can implement to reuse pgWalUsage's
value which already collects these stats, there is no impact in 
XLogInsertRecord.
For example, how about pg_stat_wal() calculates the accumulated
value of wal_records, wal_fpi, and wal_bytes to use pgWalUsage's value?

Regards
-- 
Masahiro Ikeda
NTT DATA CORPORATION



pgsql-hackers by date:

Previous
From: "tsunakawa.takay@fujitsu.com"
Date:
Subject: RE: [Patch] Optimize dropping of relation buffers using dlist
Next
From: Kyotaro Horiguchi
Date:
Subject: Re: Transactions involving multiple postgres foreign servers, take 2