Re: shared-memory based stats collector - v68 - Mailing list pgsql-hackers

From David G. Johnston
Subject Re: shared-memory based stats collector - v68
Date
Msg-id CAKFQuwajw6Za+j8tj9Kzz1NR+wUw=tO3G=8xLkGD79Mh6VuNMQ@mail.gmail.com
Whole thread Raw
In response to Re: shared-memory based stats collector - v66  (Andres Freund <andres@anarazel.de>)
Responses Re: shared-memory based stats collector - v68
List pgsql-hackers

On Sun, Apr 3, 2022 at 9:16 PM Andres Freund <andres@anarazel.de> wrote:

Please take a look!


I didn't take the time to fixup all the various odd typos in the general code comments; none of them reduced comprehension appreciably.  I may do so when/if I do another pass.

I did skim over the entire patch set and, FWIW, found it to be quite understandable.  I don't have the experience to comment on the lower-level details like locking and such but the medium picture stuff makes sense to me both as a user and a developer.  I did leave a couple of comments about parts that at least piqued my interest (reset single stats) or seemed like an undesirable restriction that was under addressed (before server shutdown called exactly once).

I agree with Thomas's observation regarding PGSTAT_KIND_LAST.  I also think that leaving it starting at 1 makes sense - maybe just fix the name and comment to better reflect its actual usage in core.

I concur also with changing usages of " / " to ", or"

My first encounter with pg_stat_exists_stat() didn't draw my attention as being problematic so I'd say we just stick with it.  As a SQL user reading: WHERE exists (...) is somewhat natural; using "have" or back-to-back stat_stat is less appealing.

I would suggest we do away with stats_fetch_consistency "snapshot" mode and instead add a function that can be called that would accomplish the same thing but in "cache" mode.  Future iterations of that function could accept patterns, allowing for something between "one" and "everything".

I'm also not an immediate fan of "fetch_consistency"; with the function suggestion it is basically "cache" and "no-cache" so maybe: stats_use_transaction_cache ? (haven't thought hard or long on this one...)


 diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 22d0a1e491..e889c11d9e 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -2123,7 +2123,7 @@ postgres   27093  0.0  0.0  30096  2752 ?        Ss   11:34   0:00 postgres: ser
      </row>
      <row>
       <entry><literal>PgStatsData</literal></entry>
-      <entry>Waiting fo shared memory stats data access</entry>
+      <entry>Waiting for shared memory stats data access</entry>
      </row>
      <row>
       <entry><literal>SerializableXactHash</literal></entry>
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 2689d0962c..bc7bdf8064 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -4469,7 +4469,7 @@ PostgresMain(const char *dbname, const char *username)
 
  /*
  * (4) turn off the idle-in-transaction, idle-session and
- * idle-state-update timeouts if active.  We do this before step (5) so
+ * idle-stats-update timeouts if active.  We do this before step (5) so
  * that any last-moment timeout is certain to be detected in step (5).
  *
  * At most one of these timeouts will be active, so there's no need to
diff --git a/src/backend/utils/activity/pgstat.c b/src/backend/utils/activity/pgstat.c
index dbd55a065d..370638b33b 100644
--- a/src/backend/utils/activity/pgstat.c
+++ b/src/backend/utils/activity/pgstat.c
@@ -5,7 +5,7 @@
  * Provides the infrastructure to collect and access cumulative statistics,
  * e.g. per-table access statistics, of all backends in shared memory.
  *
- * Most statistics updates are first first accumulated locally in each process
+ * Most statistics updates are first accumulated locally in each process
  * as pending entries, then later flushed to shared memory (just after commit,
  * or by idle-timeout).
  *
@@ -371,7 +371,9 @@ pgstat_discard_stats(void)
 /*
  * pgstat_before_server_shutdown() needs to be called by exactly one process
  * during regular server shutdowns. Otherwise all stats will be lost.
- *
+ * XXX: What bad things happen if this is invoked by more than one process?
+ *   I'd presume stats are not actually lost in that case.  Can we just 'no-op'
+ *   subsequent calls and say "at least once at shutdown, as late as possible"
  * We currently only write out stats for proc_exit(0). We might want to change
  * that at some point... But right now pgstat_discard_stats() would be called
  * during the start after a disorderly shutdown, anyway.
@@ -654,6 +656,14 @@ pgstat_reset_single_counter(PgStat_Kind kind, Oid objoid)
 
  Assert(!pgstat_kind_info_for(kind)->fixed_amount);
 
+ /*
+ * More of a conceptual observation here - the fact that something is fixed does not imply
+ * that it is not fixed at a value greater than zero and thus could have single subentries
+ * that could be addressed.
+ * I also am unsure, off the top of my head, whether both replication slots and subscriptions,
+ * which are fixed, can be reset singly (today, and/or whether this patch enables that capability)
+ */
+
  /* Set the reset timestamp for the whole database */
  pgstat_reset_database_timestamp(MyDatabaseId, ts);
 

David J.

pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: Mingw task for Cirrus CI
Next
From: Tom Lane
Date:
Subject: Re: New Object Access Type hooks