Re: wal stats questions - Mailing list pgsql-hackers

From Masahiro Ikeda
Subject Re: wal stats questions
Date
Msg-id f1aa77f1-0fa4-3737-98c9-d7afe51babe6@oss.nttdata.com
Whole thread Raw
In response to wal stats questions  (Andres Freund <andres@anarazel.de>)
Responses Re: wal stats questions
List pgsql-hackers

On 2021/03/25 8:22, Andres Freund wrote:
> Hi,
> 
> I got a few questions about the wal stats while working on the shmem
> stats patch:

Thanks for your reviews.


> 1) What is the motivation to have both prevWalUsage and pgWalUsage,
>    instead of just accumulating the stats since the last submission?
>    There doesn't seem to be any comment explaining it? Computing
>    differences to previous values, and copying to prev*, isn't free. I
>    assume this is out of a fear that the stats could get reset before
>    they're used for instrument.c purposes - but who knows?

Is your point that it's better to call pgWalUsage = 0; after sending the
stats? My understanding is as same as your assumption. For example,
pg_stat_statements.c use pgWalUsage and calculate the diff.

But, because the stats may be sent after the transaction is finished, it
doesn't seem to lead wrong stats if pgWalUsage = 0 is called. So, I agree your
suggestion.

If the extension wants to know the walusage diff across two transactions,
it may lead to wrong stats, but I think it won't happen.


> 2) Why is there both pgstat_send_wal() and pgstat_report_wal()? With the
>    former being used by wal writer, the latter by most other processes?
>    There again don't seem to be comments explaining this.

To control the transmission interval for the wal writer because it may send
the stats too frequency, and to omit to calculate the generated wal stats
because it doesn't generate wal records. But, now I think it's better to merge
them.



> 3) Doing if (memcmp(&WalStats, &all_zeroes, sizeof(PgStat_MsgWal)) == 0)
>    just to figure out if there's been any changes isn't all that
>    cheap. This is regularly exercised in read-only workloads too. Seems
>    adding a boolean WalStats.have_pending = true or such would be
>    better.

I understood that for backends, this may leads performance degradation and
this problem is not only for the WalStats but also SLRUStats.

I wondered the degradation is so big because pgstat_report_stat() checks if at
least PGSTAT_STAT_INTERVAL msec is passed since it last sent before check the
diff. If my understanding is correct, to get timestamp is more expensive.



> 4) For plain backends pgstat_report_wal() is called by
>    pgstat_report_stat() - but it is not checked as part of the "Don't
>    expend a clock check if nothing to do" check at the top.  It's
>    probably rare to have pending wal stats without also passing one of
>    the other conditions, but ...

(I'm not confidence my understanding of your comment is right.)
You mean that we need to expend a clock check in pgstat_report_wal()?
I think it's unnecessary because pgstat_report_stat() is already checked it.



> Generally the various patches seems to to have a lot of the boilerplate
> style comments (like "Prepare and send the message"), but very little in
> the way of explaining the design.

Sorry for that. I'll be careful.


Regards,
-- 
Masahiro Ikeda
NTT DATA CORPORATION



pgsql-hackers by date:

Previous
From: Fujii Masao
Date:
Subject: Re: Nicer error when connecting to standby with hot_standby=off
Next
From: Amit Langote
Date:
Subject: Re: making update/delete of inheritance trees scale better