Re: Add statistics to pg_stat_wal view for wal related parameter tuning - Mailing list pgsql-hackers
From | Masahiro Ikeda |
---|---|
Subject | Re: Add statistics to pg_stat_wal view for wal related parameter tuning |
Date | |
Msg-id | a75afbc238f501244855aebba728e243@oss.nttdata.com Whole thread Raw |
In response to | Re: Add statistics to pg_stat_wal view for wal related parameter tuning (Masahiro Ikeda <ikedamsh@oss.nttdata.com>) |
List | pgsql-hackers |
On 2020-12-22 11:16, Masahiro Ikeda wrote: > Thanks for your comments. > > On 2020-12-22 09:39, Andres Freund wrote: >> Hi, >> >> On 2020-12-21 13:16:50 -0800, Andres Freund wrote: >>> On 2020-12-02 13:52:43 +0900, Fujii Masao wrote: >>> > Pushed. Thanks! >>> >>> Why are wal_records/fpi long, instead of uint64? >>> long wal_records; /* # of WAL records produced */ >>> long wal_fpi; /* # of WAL full page images produced */ >>> uint64 wal_bytes; /* size of WAL records produced */ >>> >>> long is only 4 byte e.g. on windows, and it is entirely possible to >>> wrap >>> a 4 byte record counter. It's also somewhat weird that wal_bytes is >>> unsigned, but the others are signed? >>> >>> This is made doubly weird because on the SQL level you chose to make >>> wal_records, wal_fpi bigint. And wal_bytes numeric? > > I'm sorry I don't know the reason. > > The following thread is related to the patch and the type of wal_bytes > is changed from long to uint64 because XLogRecPtr is uint64. > https://www.postgresql.org/message-id/flat/20200402144438.GF64485%40nol#1f0127c98df430104c63426fdc285c20 > > I assumed that the reason why the type of wal_records/fpi is long > is BufferUsage have the members (i.e, shared_blks_hit) of the same > types. > > So, I think it's better if to change the type of wal_records/fpi from > long to uint64, > to change the types of BufferUsage's members too. I've done a little more research so I'll share the results. IUCC, theoretically this leads to caliculate the statistics less, but actually, it's not happened. The above "wal_records", "wal_fpi" are accumulation values and when WalUsageAccumDiff() is called, we can know how many wals are generated for specific executions, for example, planning/executing a query, processing a utility command, and vacuuming one relation. The following variable has accumulated "wal_records" and "wal_fpi" per process. ``` typedef struct WalUsage { long wal_records; /* # of WAL records produced */ long wal_fpi; /* # of WAL full page images produced */ uint64 wal_bytes; /* size of WAL records produced */ } WalUsage; WalUsage pgWalUsage; ``` Although this may be overflow, it doesn't affect to caliculate the difference of wal usage between some execution points. If to generate over 2 billion wal records per executions, 4 bytes is not enough and collected statistics will be lost, but I think it's not happened. In addition, "wal_records" and "wal_fpi" values sent by processes are accumulated in the statistic collector and their types are PgStat_Counter(int64). ``` typedef struct PgStat_WalStats { PgStat_Counter wal_records; PgStat_Counter wal_fpi; uint64 wal_bytes; PgStat_Counter wal_buffers_full; TimestampTz stat_reset_timestamp; } PgStat_WalStats; ``` >> Some more things: >> - There's both PgStat_MsgWal WalStats; and static PgStat_WalStats >> walStats; >> that seems *WAY* too confusing. And the former imo shouldn't be >> global. > > Sorry for the confusing name. > I referenced the following variable name. > > static PgStat_MsgSLRU SLRUStats[SLRU_NUM_ELEMENTS]; > static PgStat_SLRUStats slruStats[SLRU_NUM_ELEMENTS]; > > How about to change from "PgStat_MsgWal WalStats" > to "PgStat_MsgWal MsgWalStats"? > > Is it better to change the following name too? > "PgStat_MsgBgWriter BgWriterStats;" > "static PgStat_MsgSLRU SLRUStats[SLRU_NUM_ELEMENTS];" > > Since PgStat_MsgWal is called in xlog.c and pgstat.c, > I thought it's should be global. I made an attached patch to rename the above variable names. What do you think? Regards, -- Masahiro Ikeda NTT DATA CORPORATION
Attachment
pgsql-hackers by date: