Re: shared-memory based stats collector - v69 - Mailing list pgsql-hackers

From Andres Freund
Subject Re: shared-memory based stats collector - v69
Msg-id 20220406031416.gfeowv5up22tlwzm@alap3.anarazel.de
In response to Re: shared-memory based stats collector - v69  ("David G. Johnston" <david.g.johnston@gmail.com>)
Responses Re: shared-memory based stats collector - v69
List pgsql-hackers
Hi,

On 2022-04-05 20:00:50 -0700, David G. Johnston wrote:
> On Tue, Apr 5, 2022 at 4:16 PM Andres Freund <andres@anarazel.de> wrote:
> > On 2022-04-05 14:43:49 -0700, David G. Johnston wrote:
> > > On Tue, Apr 5, 2022 at 2:23 PM Andres Freund <andres@anarazel.de> wrote:
> > > > I guess I should add a paragraph about snapshots / fetch consistency.
> > > >
> > >
> > > I apparently confused/combined the two concepts just now so that would
> > > help.
> >
> > Will add.

I at least tried...


> On a slightly different track, I took the time to write-up a "Purpose"
> section for pgstat.c :
> 
> It may possibly be duplicating some things written elsewhere as I didn't go
> looking for similar prior art yet, I just wanted to get thoughts down.

There's very very little prior documentation in this area.


> This is the kind of preliminary framing I've been constructing in my own
> mind as I try to absorb this patch.  I haven't formed an opinion whether
> the actual user-facing documentation should cover some or all of this
> instead of the preamble to pgstat.c (which could just point to the docs for
> prerequisite reading).

>  * The PgStat namespace defines an API that facilitates concurrent access
>  * to a shared memory region where cumulative statistical data is saved.
>  * At shutdown, one of the running system workers will initiate the writing
>  * of the data to file. Then, during startup (following a clean shutdown) the
>  * Postmaster process will early on ensure that the file is loaded into memory.

I added something roughly along those lines in the version I just sent, based
on a suggestion by Melanie over IM:

 * Statistics are loaded from the filesystem during startup (by the startup
 * process), unless preceded by a crash, in which case all stats are
 * discarded. They are written out by the checkpointer process just before
 * shutting down, except when shutting down in immediate mode.
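
To make those two rules concrete, here is a minimal sketch in C. Every name
in it is made up for illustration and none of them are identifiers from
pgstat.c; it only encodes the conditions stated in the comment above.

```c
#include <stdbool.h>

/* Hypothetical names for illustration; not the ones used in pgstat.c. */
typedef enum
{
    SHUTDOWN_SMART,
    SHUTDOWN_FAST,
    SHUTDOWN_IMMEDIATE
} ExampleShutdownMode;

typedef enum
{
    STATS_RESTORED,     /* stats file is read back into shared memory */
    STATS_DISCARDED     /* stats start out empty */
} ExampleStartupAction;

/* Startup process: stats are only restored if there was no crash. */
ExampleStartupAction
example_stats_action_at_startup(bool preceded_by_crash)
{
    return preceded_by_crash ? STATS_DISCARDED : STATS_RESTORED;
}

/* Checkpointer: stats are written out unless the shutdown is immediate. */
bool
example_stats_written_at_shutdown(ExampleShutdownMode mode)
{
    return mode != SHUTDOWN_IMMEDIATE;
}
```

The sketch treats "preceded by a crash" as a plain boolean; how that is
actually detected is outside what the comment describes.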



>  * Each cumulative statistic producing system must construct a PgStat_Kind
>  * datum in this file. The details are described elsewhere, but of
>  * particular importance is that each kind is classified as having either a
>  * fixed number of objects that it tracks, or a variable number.
>  *
>  * During normal operations, the different consumers of the API will have their
>  * accessed managed by the API, the protocol used is determined based upon whether
>  * the statistical kind is fixed-numbered or variable-numbered.
>  * Readers of variable-numbered statistics will have the option to locally
>  * cache the data, while writers may have their updates locally queued
>  * and applied in a batch. Thus favoring speed over freshness.
>  * The fixed-numbered statistics are faster to process and thus forgo
>  * these mechanisms in favor of a light-weight lock.

This feels a bit jumbled. Of course something using an API will be managed by
the API. I don't know what "protocol" really means here.
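
For what it's worth, the part of the quoted paragraph about writers queueing
updates locally and applying them in a batch could be sketched roughly as
below. This is only an illustration of that idea: the struct and function
names are hypothetical, the "shared" struct is ordinary memory here rather
than shared memory, and the locking a real implementation would need around
the flush is only indicated by a comment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical types for illustration; not taken from pgstat.c. */
typedef struct ExampleSharedStats
{
    uint64_t    tuples_inserted;    /* would live in shared memory */
} ExampleSharedStats;

typedef struct ExamplePendingStats
{
    uint64_t    tuples_inserted;    /* local to one writer, no lock needed */
} ExamplePendingStats;

/* Cheap path: accumulate locally, without touching the shared entry. */
void
example_count_insert(ExamplePendingStats *pending, uint64_t ntuples)
{
    assert(pending != NULL);
    pending->tuples_inserted += ntuples;
}

/*
 * Batch path: fold the locally queued counts into the shared entry and
 * clear the local state. Returns how many tuples were flushed.
 */
uint64_t
example_flush_pending(ExamplePendingStats *pending, ExampleSharedStats *shared)
{
    uint64_t    flushed;

    assert(pending != NULL && shared != NULL);

    flushed = pending->tuples_inserted;
    /* A real implementation would hold a lock on the shared entry here. */
    shared->tuples_inserted += flushed;
    pending->tuples_inserted = 0;

    return flushed;
}
```

Until example_flush_pending() runs, readers of the shared entry don't see the
queued counts, which is the speed-over-freshness tradeoff the quoted text
mentions.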


> Additionally, both due to unclean shutdown or user request,
>  * statistics can be reset - meaning that their stored numeric values are returned
>  * to zero, and any non-numeric data that may be tracked (say a timestamp) is cleared.

I think this is basically covered in the above?
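
As a tiny illustration of what "reset" means in the quoted text (counters
back to zero, a tracked timestamp cleared), assuming a hypothetical stats
struct that is not taken from pgstat.c and assuming that zero stands for
"no timestamp recorded":

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stats entry for illustration only. */
typedef struct ExampleFuncStats
{
    uint64_t    calls;              /* numeric counter */
    uint64_t    total_time_us;      /* numeric counter */
    int64_t     last_call_ts;       /* timestamp; 0 is assumed to mean "unset" */
} ExampleFuncStats;

/* Reset: zero all counters and clear the timestamp. */
void
example_stats_reset(ExampleFuncStats *stats)
{
    assert(stats != NULL);
    memset(stats, 0, sizeof(*stats));
}
```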

Greetings,

Andres Freund


