Re: shared-memory based stats collector - v69 - Mailing list pgsql-hackers

From David G. Johnston
Subject Re: shared-memory based stats collector - v69
Date
Msg-id CAKFQuwa8RZ13Sw=GMWgNysccZs3efm1t5zU-ONcGU45OkemRcQ@mail.gmail.com
Whole thread Raw
In response to Re: shared-memory based stats collector - v69  (Andres Freund <andres@anarazel.de>)
Responses Re: shared-memory based stats collector - v69  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
On Tue, Apr 5, 2022 at 2:23 PM Andres Freund <andres@anarazel.de> wrote:

On 2022-04-05 13:51:12 -0700, David G. Johnston wrote:

>, but rather add to the shared queue

Queue? Maybe you mean the hashtable?

Queue implemented by a list...?  Anyway, I think I mean this:

/*
 * List of PgStat_EntryRefs with unflushed pending stats.
 *
 * Newly pending entries should only ever be added to the end of the list,
 * otherwise pgstat_flush_pending_entries() might not see them immediately.
 */
static dlist_head pgStatPending = DLIST_STATIC_INIT(pgStatPending);
 


>, and so the cache is effectively read-only.  It is also transaction-scoped
>based upon the GUC and the nature of stats vis-a-vis transactions.

No, that's not right. I think you might be thinking of
pgStatLocal.snapshot.stats?

 
Probably...
 
I guess I should add a paragraph about snapshots / fetch consistency.

I apparently confused/combined the two concepts just now so that would help.

> Even before I added the read-only and transaction-scoped I got a bit hung
> up on reading:
> "The shared hashtable only needs to be accessed when no prior reference to
> the shared hashtable exists."

> Thinking in terms of key seems to make more sense than value in this
> sentence - even if there is a one-to-one correspondence.

Maybe "prior reference to the shared hashtable exists for the key"?

I specifically dislike having two mentions of the "shared hashtable" in the same sentence, so I tried to phrase the second half in terms of the local hashtable.


> I am wondering why there are no mentions to the header files in this
> architecture, only the .c files.

Hm, I guess, but I'm not sure it'd add a lot? It's really just intended to
give a starting point (and it can't be worse than explanation of the current
system).

No need to try to come up with something.  More curious if there was a general reason to avoid it before I looked to see if I felt anything in them seemed worth including from my perspective.


> diff --git a/src/backend/utils/activity/pgstat.c b/src/backend/utils/activity/pgstat.c
> index bfbfe53deb..504f952c0e 100644
> --- a/src/backend/utils/activity/pgstat.c
> +++ b/src/backend/utils/activity/pgstat.c
> @@ -4,9 +4,9 @@
>   *
>   *
>   * PgStat_KindInfo describes the different types of statistics handled. Some
> - * kinds of statistics are collected for fixed number of objects
> - * (e.g. checkpointer statistics). Other kinds are statistics are collected
> - * for variable-numbered objects (e.g. relations).
> + * kinds of statistics are collected for a fixed number of objects
> + * (e.g., checkpointer statistics). Other kinds of statistics are collected

Was that comma after e.g. intentional?

It is.  That is the style I was taught, and that we seem to adhere to in user-facing documentation.  Source code is a mixed bag with no enforcement, but while we are here...


> + * for a varying number of objects (e.g., relations).
>   * Fixed-numbered stats are stored in plain (non-dynamic) shared memory.

status-quo works for me too, and matches up with the desired labelling we are using here.

 
>   *
> @@ -19,19 +19,21 @@
>   *
>   * All variable-numbered stats are addressed by PgStat_HashKey while running.
>   * It is not possible to have statistics for an object that cannot be
> - * addressed that way at runtime. A wider identifier can be used when
> + * addressed that way at runtime. A alternate identifier can be used when
>   * serializing to disk (used for replication slot stats).

Not sure this improves things.


It just seems odd that width is being mentioned when the actual struct is a combination of three subcomponents.  I do feel I'd need to understand exactly what replication slot stats are doing uniquely here, though, to make any point beyond that.



> - * Each statistics kind is handled in a dedicated file:
> + * Each statistics kind is handled in a dedicated file, though their structs
> + * are defined here for lack of better ideas.

-0.5


Status-quo works for me.  Food for thought for other reviewers though.

David J.

pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: shared-memory based stats collector - v69
Next
From: Tom Lane
Date:
Subject: Re: Granting SET and ALTER SYSTE privileges for GUCs