Re: Pluggable cumulative statistics - Mailing list pgsql-hackers

From Michael Paquier
Subject Re: Pluggable cumulative statistics
Date
Msg-id Zoc_x9JiKPkGkJoL@paquier.xyz
Whole thread Raw
In response to Re: Pluggable cumulative statistics  (Bertrand Drouvot <bertranddrouvot.pg@gmail.com>)
Responses Re: Pluggable cumulative statistics
Re: Pluggable cumulative statistics
List pgsql-hackers
On Thu, Jul 04, 2024 at 11:30:17AM +0000, Bertrand Drouvot wrote:
> On Wed, Jul 03, 2024 at 06:47:15PM +0900, Michael Paquier wrote:
>> among the following lines:
>> - PgStat_ShmemControl holds an array of void* indexed by
>> PGSTAT_NUM_KINDS, pointing to shared memory areas allocated for each
>> fixed-numbered stats.  Each entry is allocated a size corresponding to
>> PgStat_KindInfo->shared_size.
>
> That makes sense to me, and that's just a 96 bytes overhead (8 * PGSTAT_NUM_KINDS)
> as compared to now.

pgstat_io.c is by far the largest chunk.

>> - PgStat_Snapshot holds an array of void* also indexed by
>> PGSTAT_NUM_KINDS, pointing to the fixed stats stored in the
>> snapshots.
>
> Same, that's just a 96 bytes overhead (8 * PGSTAT_NUM_KINDS) as compared to now.

Still Andres does not seem to like that much, well ;)

> Looking at 0001:
>
> 1 ===
>
> In the commit message:
>
>     - Fixed numbered stats now set shared_size, so as
>
> Is something missing in that sentence?

Right.  This is missing a piece.

> -       PgStatShared_Archiver archiver;
> -       PgStatShared_BgWriter bgwriter;
> -       PgStatShared_Checkpointer checkpointer;
> -       PgStatShared_IO io;
> -       PgStatShared_SLRU slru;
> -       PgStatShared_Wal wal;
> +       void       *fixed_data[PGSTAT_NUM_KINDS];
>
> Can we move from PGSTAT_NUM_KINDS to the exact number of fixed stats? (add
> a new define PGSTAT_NUM_FIXED_KINDS for example). That's not a big deal but we
> are allocating some space for pointers that we won't use. Would need to change
> the "indexing" logic though.
>
> 3 ===
>
> Same as 2 === but for PgStat_Snapshot.
>

True for both.  Based on the first inputs I got from Andres, the
built-in fixed stats structures would be kept as they are now, and we
could just add an extra member here for the custom fixed stats.  That
still results in a few bytes wasted as not all custom stats want fixed
stats, but that's much cheaper.

> 4 ===
>
> +static void pgstat_init_snapshot(void);
>
> what about pgstat_init_snapshot_fixed? (as it is for fixed-numbered statistics
> only).

Sure.

> 5 ===
>
> +       /* Write various stats structs with fixed number of objects */
>
> s/Write various stats/Write the stats/? (not coming from your patch but they
> all were listed before though).

Yes, there are a few more things about that.

> 6 ===
>
> +       for (int kind = PGSTAT_KIND_FIRST_VALID; kind <= PGSTAT_KIND_LAST; kind++)
> +       {
> +               char       *ptr;
> +               const PgStat_KindInfo *info = pgstat_get_kind_info(kind);
> +
> +               if (!info->fixed_amount)
> +                       continue;
>
> Nit: Move the "ptr" declaration into an extra else? (useless to declare it
> if it's not a fixed number stat)

Comes down to one's taste.  I think that this is OK as-is, but that's
my taste.

> 7 ===
>
> +               /* prepare snapshot data and write it */
> +               pgstat_build_snapshot_fixed(kind);
>
> What about changing pgstat_build_snapshot_fixed() to accept a PgStat_KindInfo
> parameter (instead of the current PgStat_Kind one)? Reason is that
> pgstat_get_kind_info() is already called/known in pgstat_snapshot_fixed(),
> pgstat_build_snapshot() and pgstat_write_statsfile(). That would avoid
> pgstat_build_snapshot_fixed() to retrieve (again) the kind_info.

pgstat_snapshot_fixed() only calls pgstat_get_kind_info() with
assertions enabled.  Perhaps we could do that, just that it does not
seem that critical to me.

> 8 ===
>
> /*
>  * Reads in existing statistics file into the shared stats hash.
>
> This comment above pgstat_read_statsfile() is not correct, fixed stats
> are not going to the hash (was there before your patch though).

Good catch.  Let's adjust that separately.

> 9 ===
>
> +pgstat_archiver_init_shmem_cb(void *stats)
> +{
> +       PgStatShared_Archiver *stats_shmem = (PgStatShared_Archiver *) stats;
> +
> +       LWLockInitialize(&stats_shmem->lock, LWTRANCHE_PGSTATS_DATA);
>
> Nit: Almost all the pgstat_XXX_init_shmem_cb() look very similar, I wonder if we
> could use a macro to avoid code duplication.

They are very similar, still can do different things like pgstat_io.
I am not sure that the macro would bring more readability.

> Remark not related to this patch: I think we could get rid of the shared_data_off
> for the fixed stats (by moving the "stats" part at the header of their dedicated
> struct). That would mean having things like:
>
> "
> typedef struct PgStatShared_Archiver
> {
>     PgStat_ArchiverStats stats;
>     /* lock protects ->reset_offset as well as stats->stat_reset_timestamp */
>     LWLock      lock;
>     uint32      changecount;
>     PgStat_ArchiverStats reset_offset;
> } PgStatShared_Archiver;
> "

I'm not really convinced that it is a good idea to force the ordering
of the members in the shared structures for the fixed-numbered stats,
requiring these "stats" fields to always be first.
--
Michael

Attachment

pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: PostgreSQL does not compile on macOS SDK 15.0
Next
From: David Rowley
Date:
Subject: Re: Parallel Bitmap Heap Scan reports per-worker stats in EXPLAIN ANALYZE