Thread: pgsql: pgstat: add/extend tests for resetting various kinds of stats.

pgsql: pgstat: add/extend tests for resetting various kinds of stats.

From
Andres Freund
Date:
pgstat: add/extend tests for resetting various kinds of stats.

- subscriber stats reset path was untested
- slot stat sreset path for all slots was untested
- pg_stat_database.sessions etc was untested
- pg_stat_reset_shared() was untested, for any kind of shared stats
- pg_stat_reset() was untested

Author: Melanie Plageman <melanieplageman@gmail.com>
Author: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/20220303021600.hs34ghqcw6zcokdh@alap3.anarazel.de

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/5264add7847871d61d36a5770dac2139d6a7bc80

Modified Files
--------------
contrib/test_decoding/expected/stats.out    | 117 +++++++---
contrib/test_decoding/sql/stats.sql         |  32 ++-
src/test/recovery/t/006_logical_decoding.pl |  63 ++++++
src/test/regress/expected/stats.out         | 164 ++++++++++++++
src/test/regress/parallel_schedule          |   3 +
src/test/regress/sql/stats.sql              |  78 +++++++
src/test/subscription/t/026_stats.pl        | 318 +++++++++++++++++++++-------
7 files changed, 664 insertions(+), 111 deletions(-)


Re: pgsql: pgstat: add/extend tests for resetting various kinds of stats.

From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> pgstat: add/extend tests for resetting various kinds of stats.

crake just failed [1] with what appears to be a race condition in
a test case added by this commit:

diff -U3 /home/andrew/bf/root/HEAD/pgsql/src/test/regress/expected/stats.out
/home/andrew/bf/root/HEAD/pgsql.build/src/test/recovery/tmp_check/results/stats.out
--- /home/andrew/bf/root/HEAD/pgsql/src/test/regress/expected/stats.out    2022-07-18 13:32:25.155847949 -0400
+++ /home/andrew/bf/root/HEAD/pgsql.build/src/test/recovery/tmp_check/results/stats.out    2022-09-15
12:44:10.022975164-0400 
@@ -563,7 +563,7 @@
 SELECT sessions > :db_stat_sessions FROM pg_stat_database WHERE datname = (SELECT current_database());
  ?column?
 ----------
- t
+ f
 (1 row)

 -- Test pg_stat_bgwriter checkpointer-related stats, together with pg_stat_wal

It seems likely to me that the n_sessions increment hasn't made it to
shared memory because some background process happened to have a lock on
the stats entry when we tried.  Don't we need a pg_stat_force_next_flush()
call before trying to inspect the sessions count?

BTW, the header comment for pgstat_report_stat is badly in need of
copy-editing.

            regards, tom lane

[1] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=crake&dt=2022-09-15%2016%3A18%3A30



Re: pgsql: pgstat: add/extend tests for resetting various kinds of stats.

From
Andres Freund
Date:
Hi,

On 2022-09-15 15:59:52 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > pgstat: add/extend tests for resetting various kinds of stats.
> 
> crake just failed [1] with what appears to be a race condition in
> a test case added by this commit:
> 
> diff -U3 /home/andrew/bf/root/HEAD/pgsql/src/test/regress/expected/stats.out
/home/andrew/bf/root/HEAD/pgsql.build/src/test/recovery/tmp_check/results/stats.out
> --- /home/andrew/bf/root/HEAD/pgsql/src/test/regress/expected/stats.out    2022-07-18 13:32:25.155847949 -0400
> +++ /home/andrew/bf/root/HEAD/pgsql.build/src/test/recovery/tmp_check/results/stats.out    2022-09-15
12:44:10.022975164-0400
 
> @@ -563,7 +563,7 @@
>  SELECT sessions > :db_stat_sessions FROM pg_stat_database WHERE datname = (SELECT current_database());
>   ?column? 
>  ----------
> - t
> + f
>  (1 row)
>  
>  -- Test pg_stat_bgwriter checkpointer-related stats, together with pg_stat_wal
> 
> It seems likely to me that the n_sessions increment hasn't made it to shared
> memory because some background process happened to have a lock on the stats
> entry when we tried.  Don't we need a pg_stat_force_next_flush() call before
> trying to inspect the sessions count?

Indeed. Pushed a fix. To see if there are other omissions, I made
pgstat_report_stat() never flush stats unless force is passed in. No other
failures.


> BTW, the header comment for pgstat_report_stat is badly in need of
> copy-editing.

Hm - you're talking about this sentence?

   ... Callers can use the returned time to set up
 * a timeout after which to call pgstat_report_stat(true), but are not
 * required to to do so.

Reads awkward. I'm struggling with the english language trying to improve it
:(.

"Callers can, but are not required to, use the returned time to set up a
timer, whose expiration should trigger a call to pgstat_report_stat(true)."?

Greetings,

Andres Freund



Re: pgsql: pgstat: add/extend tests for resetting various kinds of stats.

From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> On 2022-09-15 15:59:52 -0400, Tom Lane wrote:
>> It seems likely to me that the n_sessions increment hasn't made it to shared
>> memory because some background process happened to have a lock on the stats
>> entry when we tried.  Don't we need a pg_stat_force_next_flush() call before
>> trying to inspect the sessions count?

> Indeed. Pushed a fix. To see if there are other omissions, I made
> pgstat_report_stat() never flush stats unless force is passed in. No other
> failures.

Cool, thanks for dealing with that.

>> BTW, the header comment for pgstat_report_stat is badly in need of
>> copy-editing.

> Hm - you're talking about this sentence?

/*
 * Must be called by processes that performs DML: tcop/postgres.c, logical
                     ^^^^^^^^^^^^^^^^^^^^^^^

Plural agreement problem.  "all processes that perform" or "each process
that performs" would be OK.

 * receiver processes, SPI worker, etc. to flush pending statistics updates to
 * shared memory.
 *
 * Unless called with 'force', pending stats updates are flushed happen once
                                                     ^^^^^^^^^^^^^^^^^^

One verb too many.

 * per PGSTAT_MIN_INTERVAL (1000ms). When not forced, stats flushes do not
 * block on lock acquisition, except if stats updates have been pending for
 * longer than PGSTAT_MAX_INTERVAL (60000ms).
 *
 * Whenever pending stats updates remain at the end of pgstat_report_stat() a
 * suggested idle timeout is returned. Currently this is always
 * PGSTAT_IDLE_INTERVAL (10000ms). Callers can use the returned time to set up
 * a timeout after which to call pgstat_report_stat(true), but are not
 * required to to do so.

This isn't grammatically bad, but it fails to explain how callers are to
know whether they need to do that.  It looks like the convention is to
return 0 if no updates remain, but you have to read the code and guess.

 *
 * Note that this is called only when not within a transaction, so it is fair
 * to use transaction stop time as an approximation of current time.
 */

This should be inside the function, it's not part of the API spec nor of
any concern to callers.  Or maybe the API spec should say "This must be
called immediately after finishing a transaction", and then inside the
function say "We assume GetCurrentTransactionStopTimestamp gives a
close-enough approximation of current time."

            regards, tom lane



Re: pgsql: pgstat: add/extend tests for resetting various kinds of stats.

From
Tom Lane
Date:
I wrote:
>>  * required to to do so.

> This isn't grammatically bad,

... um, other than the two "to"s.  But the semantic omission is a bigger
deal.

            regards, tom lane