pgsql: Introduce pluggable APIs for Cumulative Statistics - Mailing list pgsql-committers

From Michael Paquier
Subject pgsql: Introduce pluggable APIs for Cumulative Statistics
Date
Msg-id E1saYih-002cyk-8Y@gemulon.postgresql.org
List pgsql-committers
Introduce pluggable APIs for Cumulative Statistics

This commit adds support in the backend for $subject, allowing
out-of-core extensions to plug their own custom kinds of cumulative
statistics.  This feature has come up a few times on the lists.  The
original suggestion came from Andres Freund, who proposed that
pg_stat_statements use the cumulative statistics APIs in shared memory
rather than its own less efficient internals.  The advantage of this
implementation is that it can be extended to any kind of statistics.

The stats kinds are divided into two parts:
- The in-core "builtin" stats kinds, with designated initializers, able
to use IDs below 128.
- The "custom" stats kinds, able to use a range of IDs from 128 to 256
(128 slots available as of this patch), with information saved in
TopMemoryContext.  This can be made larger, if necessary.
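
As a rough illustration of the split described above, the following
self-contained C sketch classifies a kind ID as builtin or custom.  All
names are hypothetical and are not the backend's own macros.  The bounds
come from this message; treating 0 as an invalid ID and 256 as an
inclusive upper bound are assumptions of the sketch.

```c
#include <stdbool.h>

/* Hypothetical names; only the numeric bounds come from the commit message. */
#define SKETCH_KIND_INVALID     0	/* assumption: 0 is never a valid kind */
#define SKETCH_KIND_CUSTOM_MIN  128
#define SKETCH_KIND_CUSTOM_MAX  256	/* assumption: inclusive upper bound */

typedef unsigned int SketchKind;

/* Builtin kinds are assumed to occupy the valid IDs below the custom range. */
static bool
sketch_kind_is_builtin(SketchKind kind)
{
	return kind > SKETCH_KIND_INVALID && kind < SKETCH_KIND_CUSTOM_MIN;
}

/* Custom kinds are the IDs inside the range reserved for extensions. */
static bool
sketch_kind_is_custom(SketchKind kind)
{
	return kind >= SKETCH_KIND_CUSTOM_MIN && kind <= SKETCH_KIND_CUSTOM_MAX;
}
```

A caller could use checks of this shape to decide which code path (or
which storage array) applies to a given ID.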

There are two types of cumulative statistics in the backend:
- For fixed-numbered objects (like WAL, archiver, etc.).  These are
attached to the snapshot and pgstats shmem control structures for
efficiency, and built-in stats kinds still do that to avoid any
redirection penalty.  The data of custom kinds is stored in one array
in the snapshot structure and another array in the shmem control
structure, both indexed by kind ID, mirroring what is done for the
builtin stats.
- For variable-numbered objects (like tables, functions, etc.).  These
are stored in a dshash using the stats kind ID in the hash lookup key.
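
The two storage schemes above can be sketched in standalone C as follows.
This is an illustration only: the type and function names are
hypothetical, the key fields (kind, database OID, object OID) are an
assumption about what such a lookup key would contain, and offsetting
the ID by the start of the custom range is one plausible way to index
the arrays, not necessarily what the backend does.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_KIND_CUSTOM_MIN  128
#define SKETCH_KIND_CUSTOM_MAX  256	/* assumption: inclusive upper bound */

/*
 * Fixed-numbered custom kinds: map a kind ID to a slot in a per-kind
 * array.  Returns -1 when the ID is outside the custom range.
 */
static int
sketch_custom_slot(uint32_t kind)
{
	if (kind < SKETCH_KIND_CUSTOM_MIN || kind > SKETCH_KIND_CUSTOM_MAX)
		return -1;
	return (int) (kind - SKETCH_KIND_CUSTOM_MIN);
}

/*
 * Variable-numbered kinds: the kind ID is part of the hash key, so two
 * kinds can track objects with the same identifiers without colliding.
 */
typedef struct SketchHashKey
{
	uint32_t	kind;		/* stats kind ID */
	uint32_t	dboid;		/* hypothetical: database identifier */
	uint32_t	objoid;		/* hypothetical: object identifier */
} SketchHashKey;

static bool
sketch_key_equal(const SketchHashKey *a, const SketchHashKey *b)
{
	return a->kind == b->kind &&
		a->dboid == b->dboid &&
		a->objoid == b->objoid;
}
```

With this layout, entries for the same object under two different kinds
compare as distinct keys, which is the property the hash lookup relies
on.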

Internally, the handling of the builtin stats is unchanged, and both
fixed-numbered and variable-numbered objects are supported.  Structure
definitions for builtin stats kinds are renamed to better reflect the
differences with custom kinds.

Like custom RMGRs, custom cumulative statistics can only be loaded with
shared_preload_libraries at startup, and must reserve an ID that is
unique across the whole PostgreSQL extension ecosystem, using the
following wiki page to avoid conflicts:
https://wiki.postgresql.org/wiki/CustomCumulativeStats

This makes the detection of the stats kinds and their handling when
reading and writing stats much easier than, say, allocating IDs for
stats kinds from a shared memory counter, which could change the ID
used by a stats kind across restarts.  When under development,
extensions can use PGSTAT_KIND_EXPERIMENTAL.

Two examples that can be used as templates for fixed-numbered and
variable-numbered stats kinds will be added in some follow-up commits,
with tests to provide coverage.

Some documentation is added to explain how to use this plugin facility.

Author: Michael Paquier
Reviewed-by: Dmitry Dolgov, Bertrand Drouvot
Discussion: https://postgr.es/m/Zmqm9j5EO0I4W8dx@paquier.xyz

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/7949d9594582ab49dee221e1db1aa5401ace49d4

Modified Files
--------------
doc/src/sgml/xfunc.sgml                   |  57 ++++++++
src/backend/utils/activity/pgstat.c       | 233 +++++++++++++++++++++++++++---
src/backend/utils/activity/pgstat_shmem.c |  31 +++-
src/backend/utils/adt/pgstatfuncs.c       |   2 +-
src/include/pgstat.h                      |  36 ++++-
src/include/utils/pgstat_internal.h       |  22 ++-
6 files changed, 348 insertions(+), 33 deletions(-)

