Re: [Proposal] Adding callback support for custom statistics kinds - Mailing list pgsql-hackers
| From | Michael Paquier |
|---|---|
| Subject | Re: [Proposal] Adding callback support for custom statistics kinds |
| Date | |
| Msg-id | aS99yqkJF6oq4q9K@paquier.xyz Whole thread Raw |
| In response to | Re: [Proposal] Adding callback support for custom statistics kinds (Sami Imseih <samimseih@gmail.com>) |
| List | pgsql-hackers |
On Tue, Dec 02, 2025 at 02:58:32PM -0600, Sami Imseih wrote:
> By "template" code, do you mean Something like?
>
> include/utils/custom_statkinds.h
> backend/utils/misc/custom_statkinds.c
>
> Where the template code here is PgStat_kind definition, callbacks, etc. for
> injection_points or the new test module that is using a custom kind.
I am not completely sure that we need a separate C file for a portion
of the code related to custom stats kinds. At least, I am not sure to
see which part of pgstat.c, pgstat_internal.h and pgstat_shmem.c could
be extracted so as a custom_statkinds.c could have value when taken
independently. A test module makes the most sense for such templates
IMO, as they can be compiled and checked directly.
> v2-0001 are much simplified changes to pgstat.c that simply invoke the callbacks
> and all the work is on the extension to implement what it needs to do.
> This includes
> a callback at the end of WRITE, READ, DISCARD with a flag passed to the caller
> so they can perform the necessary clean-up actions.
+ void (*to_serialized_extra_stats) (const PgStat_HashKey *key,
+ const PgStatShared_Common *header, FILE *statfile);
+ void (*from_serialized_extra_stats) (const PgStat_HashKey *key,
+ const PgStatShared_Common *header, FILE *statfile);
+ void (*end_extra_stats) (PgStat_StatsFileOp status);
[...]
+typedef struct PgStatShared_CustomEntry
+{
+ PgStatShared_Common header;
+ PgStat_StatCustomEntry stats;
+ char name[NAMEDATALEN];
+ dsa_pointer description;
+} PgStatShared_CustomEntry;
I'm cool with this design, including your point about using a DSA
pointer in a stats entry, manipulating this data through the
serialization callback. Your module does not use the FILE* which
points to the main stats file for the to/from extra serialized
callbacks, it seems important to document in pgstat_internal.h that
this points to the "main" pgstats file manipulated by the startup
process when loading or by the checkpointer when flushing.
Perhaps the callback in the module for end_extra_stats should use a
switch based on PgStat_StatsFileOp. Minor point.
+/* File handle for statistics serialization */
+static FILE *fd = NULL;
Using a fd tracked directly by the module sounds good to me, yes.
That gives to the modules the flexibility to decide what should be the
extra files to know about, some file name patterns being possible to
decide based on the stats entry keys that need to be written, with
files opened when actually required.
> v2-0002 implements a new test module that tests mainly that the recovery,
> clean and crash, are working as expected.
That looks like a good direction to me. The only differences I can
see with the stats module in injection_points for variable-sized stats
is that this new module does not check pgstat_drop_entry() and
pgstat_fetch_entry() when working on a custom stats kind. If we had
SQL interfaces calling these two, we could just remove
injection_stats.c entirely, moving everything to this new test module.
I should have invented a new module from the start, perhaps, but well,
that was good enough to check the basic APIs when working on the
custom APIs. Removing this duplication would be my own business with
your module in the tree, no need for you to worry about that. That
would also remove the tweak you have used regarding the duplicated
kind ID.
Perhaps we should do the same for the fixed-sized kind at the end, and
instead of using one .so for both of them, we could just create a
separate .so with multiple entries in MODULES? What do you think?
What you have here is better than what's in the tree in terms of
module separation for HEAD.
> I created a new tap test for this which performs a test similar to what is
> done in recovery/029_stats_restart.pl. I could merge the new test there, but
> I am reluctant to add a dependency on a new module to recovery. What
> do you think?
Adding an extra item to recovery's EXTRA_INSTALL would be OK for me,
but it seems cleaner to me to keep the tests related to custom stats
in their own area like your patch 0002 is doing with its new test
module test_custom_statkind. And 029_stats_restart.pl is already
covering a lot of ground.
+ if (pgstat_is_kind_custom(key.kind) && kind_info->from_serialized_extra_stats)
+ kind_info->from_serialized_extra_stats(&key, header, fpin);
[...]
+ if (pgstat_is_kind_custom(ps->key.kind) && kind_info->to_serialized_extra_stats)
+ kind_info->to_serialized_extra_stats(&ps->key, shstats, fpout);
These restrictions based on custom kinds do not seem mandatory.
Why not allowing built-in kinds the same set of operations?
+ /* Read and verify the hash key */
+ if (!pgstat_read_chunk(fd, (void *) key, sizeof(PgStat_HashKey)))
+ return;
[...]
+ /* Write the hash key to identify this entry */
+ pgstat_write_chunk(fd, (void *) key, sizeof(PgStat_HashKey));
I am puzzled by this part of 0002. Why are you overwriting the key
once after loading it from the main pgstats file? Writing the key to
cross-check that the data matches with what is in the main file is OK,
and this should be ensured because of the ordering of the data. I
would have done it in a slightly different way, I guess, with the data
stored on disk in the main pgstats file including an offset to know
where to search in the secondary file. That's what we would do for
PGSS as well, I guess, with the secondary file including data
structured as a set of:
- Entry key, cross-checked with the data read from the main file,
based on the offset stored in the main file.
- Length of extra data.
- The extra data contents.
As a whole, I find this patch pretty cool, particularly the point
about extending stats entries with DSAs, something that would be
essential for PGSS and move it to use pgstats because we don't want
the query strings in the main pgstats file and bloat it. Nice.
--
Michael
Attachment
pgsql-hackers by date: