Re: wal stats questions - Mailing list pgsql-hackers

From Fujii Masao
Subject Re: wal stats questions
Date
Msg-id af0964ac-7080-1984-dc23-513754987716@oss.nttdata.com
Whole thread Raw
In response to Re: wal stats questions  (Masahiro Ikeda <ikedamsh@oss.nttdata.com>)
Responses Re: wal stats questions  (Masahiro Ikeda <ikedamsh@oss.nttdata.com>)
List pgsql-hackers

On 2021/03/30 20:37, Masahiro Ikeda wrote:
> OK, I added the condition to the fast-return check. I noticed that I
> misunderstood that the purpose is to avoid expanding a clock check using WAL
> stats counters. But, the purpose is to make the conditions stricter, right?

Yes. Currently if the following condition is false even when the WAL counters
are updated, nothing is sent to the stats collector. But with your patch,
in this case the WAL stats are sent.

    if ((pgStatTabList == NULL || pgStatTabList->tsa_used == 0) &&
        pgStatXactCommit == 0 && pgStatXactRollback == 0 &&
        !have_function_stats && !disconnect)

Thanks for the patch! It now fails to be applied to the master cleanly.
So could you rebase the patch?

pgstat_initialize() should initialize pgWalUsage with zero?

+    /*
+     * set the counters related to generated WAL data.
+     */
+    WalStats.m_wal_records = pgWalUsage.wal_records;
+    WalStats.m_wal_fpi = pgWalUsage.wal_fpi;
+    WalStats.m_wal_bytes = pgWalUsage.wal_bytes;

This should be skipped if pgWalUsage.wal_records is zero?

+ * Be careful that the counters are cleared after reporting them to
+ * the stats collector although you can use WalUsageAccumDiff()
+ * to computing differences to previous values. For backends,
+ * the counters may be reset after a transaction is finished and
+ * pgstat_send_wal() is invoked, so you can compute the difference
+ * in the same transaction only.

On the second thought, I'm afraid that this can be likely to be a foot-gun
in the future. So I'm now wondering if the current approach (i.e., calculate
the diff between the current and previous pgWalUsage and don't reset it
to zero) is better. Thought? Since the similar data structure pgBufferUsage
doesn't have this kind of restriction, I'm afraid that the developer can
easily miss that only pgWalUsage has the restriction.

But currently the diff is calculated (i.e., WalUsageAccumDiff() is called)
even when the WAL counters are not updated. Then if that calculated diff is
zero, we skip sending the WAL stats. This logic should be improved. That is,
probably we should be able to check whether the WAL counters are updated
or not, without calculating the diff, because the calculation is not free.
We can implement this by introducing new flag variable that we discussed
upthread. This flag is set to true whenever the WAL counters are incremented
and used to determine whether the WAL stats need to be sent.

If we do this, another issue is that the data types for wal_records and wal_fpi
in pgWalUsage are long. Which may lead to overflow? If yes, it's should be
replaced with uint64.

Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION



pgsql-hackers by date:

Previous
From: Bruce Momjian
Date:
Subject: Re: Have I found an interval arithmetic bug?
Next
From: Andres Freund
Date:
Subject: Re: Teaching users how they can get the most out of HOT in Postgres 14