Re: shared-memory based stats collector - v69 - Mailing list pgsql-hackers

From David G. Johnston
Subject Re: shared-memory based stats collector - v69
Date
Msg-id CAKFQuwbhE8nG5mxz96tzc-waR5NWMPppnG92au+kU4wESQuXNw@mail.gmail.com
In response to Re: shared-memory based stats collector - v69  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
On Tue, Apr 5, 2022 at 8:14 PM Andres Freund <andres@anarazel.de> wrote:

On 2022-04-05 20:00:50 -0700, David G. Johnston wrote:

 * Statistics are loaded from the filesystem during startup (by the startup
 * process), unless preceded by a crash, in which case all stats are
 * discarded. They are written out by the checkpointer process just before
 * shutting down, except when shutting down in immediate mode.


Cool.  I was on the fence about the level of detail here, but mostly excluded mentioning the checkpointer 'cause I didn't want to research the correct answer tonight.
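
For what it's worth, the rule in that paragraph reduces to two yes/no decisions. Here is a minimal sketch of them in C; the type and function names are made up for illustration and are not the names used in the patch.

```c
#include <stdbool.h>

/* Hypothetical names, for illustration only; not the patch's actual API. */
typedef enum
{
    SHUTDOWN_SMART,
    SHUTDOWN_FAST,
    SHUTDOWN_IMMEDIATE
} ShutdownMode;

/*
 * Startup process: load the on-disk stats only if the previous run did not
 * end in a crash; after a crash all stats are discarded instead.
 */
bool
should_load_stats_at_startup(bool preceded_by_crash)
{
    return !preceded_by_crash;
}

/*
 * Checkpointer: write the stats out just before shutting down, except when
 * shutting down in immediate mode.
 */
bool
should_write_stats_at_shutdown(ShutdownMode mode)
{
    return mode != SHUTDOWN_IMMEDIATE;
}
```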

>  * Each cumulative statistic producing system must construct a PgStat_Kind
>  * datum in this file. The details are described elsewhere, but of
>  * particular importance is that each kind is classified as having either a
>  * fixed number of objects that it tracks, or a variable number.
>  *
>  * During normal operations, the different consumers of the API will have
>  * their access managed by the API; the protocol used is determined based
>  * upon whether the statistical kind is fixed-numbered or variable-numbered.
>  * Readers of variable-numbered statistics will have the option to locally
>  * cache the data, while writers may have their updates locally queued
>  * and applied in a batch. Thus favoring speed over freshness.
>  * The fixed-numbered statistics are faster to process and thus forgo
>  * these mechanisms in favor of a light-weight lock.

This feels a bit jumbled.

I had that inkling as well.  First draft and I needed to stop at some point.  It didn't seem bad or wrong at least.

Of course something using an API will be managed by
the API. I don't know what protocol really means?


Procedure, process, and algorithm are synonyms.  Procedure probably makes more sense here since we are using a procedural language.  I thought of algorithm while writing this, but it carried too much technical baggage for me (compression, encryption, etc.) and didn't seem to fit.
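
To make the two procedures concrete, here is a rough single-process sketch of what I had in mind. Everything in it is hypothetical (the struct names, the queue size, and the integer flag standing in for a light-weight lock); it only illustrates the draft comment's distinction and is not how the patch implements it.

```c
#include <stddef.h>

#define MAX_PENDING 8           /* arbitrary queue size for this sketch */

/* Shared entry for one variable-numbered object (say, one table). */
typedef struct
{
    long        shared_count;
} SharedEntry;

/* Backend-local queue of increments not yet applied to shared memory. */
typedef struct
{
    SharedEntry *target[MAX_PENDING];
    long        delta[MAX_PENDING];
    size_t      n;
} PendingQueue;

/* Apply every queued increment to its shared entry, then empty the queue. */
void
pending_flush(PendingQueue *q)
{
    for (size_t i = 0; i < q->n; i++)
        q->target[i]->shared_count += q->delta[i];
    q->n = 0;
}

/* Variable-numbered writer: queue locally, flushing first if the queue is full. */
void
pending_add(PendingQueue *q, SharedEntry *e, long delta)
{
    if (q->n == MAX_PENDING)
        pending_flush(q);
    q->target[q->n] = e;
    q->delta[q->n] = delta;
    q->n++;
}

/* Fixed-numbered kind: one shared struct, updated directly under a lock. */
typedef struct
{
    int         locked;         /* stand-in for a real light-weight lock */
    long        count;
} FixedStats;

void
fixed_add(FixedStats *s, long delta)
{
    s->locked = 1;              /* placeholder for acquiring the lock */
    s->count += delta;
    s->locked = 0;              /* placeholder for releasing the lock */
}
```

The queued path trades freshness for speed: shared values lag until the next flush, whereas the fixed-numbered path is visible immediately.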

> Additionally, either due to unclean shutdown or user request,
>  * statistics can be reset - meaning that their stored numeric values are
>  * returned to zero, and any non-numeric data that may be tracked (say a
>  * timestamp) is cleared.

I think this is basically covered in the above?


Yes and no.  The first paragraph says they are forced to reset due to system error.  This paragraph basically says that resetting this kind of statistic is an acceptable, and even expected, thing to do.  And in fact can also be done intentionally and not only due to system error.  I am pondering whether to mention this dynamic first and/or better blend it in - but the minor repetition in the different contexts seems ok.
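
As a concrete reading of "reset", here is a small sketch in C. The struct and its fields are invented for the example (they are not an actual stats struct from the patch); the point is only that counters go back to zero and a tracked timestamp becomes "not set".

```c
#include <stdbool.h>
#include <time.h>

/* Hypothetical stats entry, for illustration only. */
typedef struct
{
    long        calls;          /* numeric counter */
    long        rows;           /* numeric counter */
    bool        has_last_seen;  /* whether last_seen holds a value */
    time_t      last_seen;      /* non-numeric data: a timestamp */
} ExampleStats;

/*
 * Reset an entry: numeric values return to zero and the timestamp is
 * cleared.  The same routine serves both a user-requested reset and the
 * discard after an unclean shutdown.
 */
void
example_stats_reset(ExampleStats *s)
{
    s->calls = 0;
    s->rows = 0;
    s->has_last_seen = false;
    s->last_seen = (time_t) 0;
}
```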

David J.
