Re: shared-memory based stats collector - v68 - Mailing list pgsql-hackers

From Andres Freund
Subject Re: shared-memory based stats collector - v68
Date
Msg-id 20220404215414.ygxtq3p5ds2tedga@alap3.anarazel.de
Whole thread Raw
In response to Re: shared-memory based stats collector - v68  ("David G. Johnston" <david.g.johnston@gmail.com>)
Responses Re: shared-memory based stats collector - v68  ("David G. Johnston" <david.g.johnston@gmail.com>)
List pgsql-hackers
Hi,

On 2022-04-04 14:25:57 -0700, David G. Johnston wrote:
> > You mentioned this as a restriction above - I'm not seeing it as such?  I'd
> > like to write out stats more often in the future (e.g. in the
> > checkpointer),
> > but then it'd not be written out with this function...
> >
> >
> Yeah, the idea only really works if you can implement "last one out, shut
> off the lights".  I think I was subconsciously wanting this to work that
> way, but the existing process is good.

Preserving stats more than we do today (the patch doesn't really affect that)
will require a good chunk more work. My idea for it is that we'd write the
file out as part of a checkpoint / restartpoint, with a name including the
redo-lsn. Then when recovery starts, it can use the stats file associated with
that to start from.  Then we'd loose at most 1 checkpoint's worth of stats
during a crash, not more.

There's a few non-trivial corner cases to solve, around stats objects getting
dropped concurrently with creating that serialized snapshot. Solvable, but not
trivial.


> > > + * I also am unsure, off the top of my head, whether both replication
> > > slots and subscriptions,
> > > + * which are fixed, can be reset singly (today, and/or whether this
> > patch
> > > enables that capability)
> > > + */
> >
> > FWIW, neither are implemented as fixed amount stats.
> 
> 
> That was a typo, I meant to write variable.  My point was that of these 5
> kinds that will pass the assertion test only 2 of them are actually handled
> by the function today.
> 
> + PGSTAT_KIND_DATABASE = 1, /* database-wide statistics */
> + PGSTAT_KIND_RELATION, /* per-table statistics */
> + PGSTAT_KIND_FUNCTION, /* per-function statistics */
> + PGSTAT_KIND_REPLSLOT, /* per-slot statistics */
> + PGSTAT_KIND_SUBSCRIPTION, /* per-subscription statistics */

> As the existing function only handles functions and relations why not just
> perform a specific Kind check for them?  Generalizing to assert on whether
> or not the function works on fixed or variable Kinds seems beyond its
> present state.  Or could it be used, as-is, for databases, replication
> slots, and subscriptions today, and we just haven't migrated those areas to
> use the now generalized function?

It couldn't quite be used for those, because it really only makes sense for
objects "within a database", because it wants to reset the timestamp of the
pg_stat_database row too (I don't like that behaviour as-is, but that's the
topic of another thread as you know...).

It will work for other per-database stats though, once we have them.


> Even then, unless we do expand the
> definition of the this publicly facing function is seems better to
> precisely define what it requires as an input Kind by checking for RELATION
> or FUNCTION specifically.

I don't see a benefit in adding a restriction on it that we'd just have to
lift again?

How about adding a
Assert(!pgstat_kind_info_for(kind)->accessed_across_databases)

and extending the function comment to say that it's used for per-database
stats and that it resets both the passed-in stats object as well as
pg_stat_database?

Greetings,

Andres Freund



pgsql-hackers by date:

Previous
From: Peter Eisentraut
Date:
Subject: Re: psql - add SHOW_ALL_RESULTS option
Next
From: Andrew Dunstan
Date:
Subject: Re: Granting SET and ALTER SYSTE privileges for GUCs