Re: per backend I/O statistics - Mailing list pgsql-hackers

From Bertrand Drouvot
Subject Re: per backend I/O statistics
Msg-id ZyMRJIbUpNPoCXUe@ip-10-97-1-34.eu-west-3.compute.internal
In response to Re: per backend I/O statistics  (Bertrand Drouvot <bertranddrouvot.pg@gmail.com>)
List pgsql-hackers
Hi,

On Tue, Oct 08, 2024 at 04:28:39PM +0000, Bertrand Drouvot wrote:
> > > On Fri, Sep 20, 2024 at 01:26:49PM +0900, Michael Paquier wrote:
> > 
> > Okay, per the above and the persistency of the stats.
> 
> Great, I'll work on an updated patch version then.
> 

I spent some time on this over the last 2 days and I think we have 3 design
options.

=== GOALS ===

But first let's sum up the goals that I think we agreed on:

- Keep pg_stat_io as it is today: give the whole server picture and serialize
the stats to disk.

- Introduce per-backend IO stats and 2 new APIs to:

   1. Provide the IO stats for "my backend" (through say pg_my_stat_io), this
      would take care of the stats_fetch_consistency.

   2. Retrieve the IO stats for another backend (through say pg_stat_get_backend_io(pid))
      that would _not_ take care of stats_fetch_consistency, as:

      2.1/ I think that there is no use case (there is no need to get other
           backends' I/O statistics while taking care of stats_fetch_consistency)

      2.2/ It could be memory expensive to store a snapshot for all the backends
           (depending on the number of backends created)

- There is no need to serialize the per-backend IO stats to disk (there is no
point in seeing stats for backends that no longer exist after a restart).

- The per-backend IO stats should be variable-numbered (not fixed), as per 
up-thread discussion.

=== OPTIONS ===

So, based on this, I think that we could:

Option 1: "move" the existing PGSTAT_KIND_IO to variable-numbered and let this
KIND take care of the aggregated view (pg_stat_io) and the per-backend stats.

Option 2: leave PGSTAT_KIND_IO as it is and introduce a new PGSTAT_KIND_BACKEND_IO
that would be variable-numbered.

Option 3: Remove PGSTAT_KIND_IO, introduce a new PGSTAT_KIND_BACKEND_IO that
would be variable-numbered and store the "aggregated stats aka pg_stat_io" in
shared memory (not part of the variable-numbered hash). Per-backend stats
could be aggregated into "pg_stat_io" during the flush_pending_cb call for example.
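
To make the Option 3 idea a bit more concrete, here is a rough standalone
sketch of "flush the pending counts into the per-backend entry and into the
aggregate at the same time". This is not PostgreSQL code: the struct, the
array sizes and the function names (count_io, flush_pending) are made up for
illustration, and locking is ignored entirely.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_BACKENDS 8			/* hypothetical limit, for this sketch only */

/* Simplified stand-in for a set of I/O counters (not the real structs). */
typedef struct IOCounters
{
	uint64_t	reads;
	uint64_t	writes;
} IOCounters;

/* Aggregated stats: what pg_stat_io would report. */
static IOCounters aggregated_io;

/* Per-backend stats, indexed by a hypothetical backend slot number. */
static IOCounters backend_io[MAX_BACKENDS];

/* Backend-local counts that have not been flushed yet. */
static IOCounters pending_io[MAX_BACKENDS];

/* Record some I/O in the backend-local pending counters. */
static void
count_io(int slot, uint64_t reads, uint64_t writes)
{
	assert(slot >= 0 && slot < MAX_BACKENDS);
	pending_io[slot].reads += reads;
	pending_io[slot].writes += writes;
}

/*
 * Plays the role of a flush callback: add the pending counts to both the
 * per-backend entry and the aggregate, then reset the pending counts.
 */
static void
flush_pending(int slot)
{
	IOCounters *p;

	assert(slot >= 0 && slot < MAX_BACKENDS);
	p = &pending_io[slot];

	backend_io[slot].reads += p->reads;
	backend_io[slot].writes += p->writes;

	aggregated_io.reads += p->reads;
	aggregated_io.writes += p->writes;

	memset(p, 0, sizeof(*p));
}
```

With this shape, flushing twice in a row does not double count, because the
pending counters are zeroed after each flush.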

=== BEST OPTION? ===

I would opt for Option 2 as:

- The stats system is currently not designed for Option 1 and our goals. For
example, shared_data_len is used to serialize but also to fetch the entries
(see pgstat_fetch_entry()), so that would need some hack to serialize only a
part of them and still be able to fetch them all.

- Mixing "fixed" and "variable" in the same KIND does not sound like a good idea
(though that might be possible with some hacks, I don't think that would be 
easy to maintain).

- Having the per-backend as "variable" in its dedicated kind looks more reasonable
and less error-prone.

- I don't think there is a stats design similar to Option 3 currently, so I'm
not sure there is a need to develop something new while Option 2 could be done.

- Option 3 would need some hack for (at least) the "pg_stat_io" [de]serialization
part.

- Option 2 seems to offer more flexibility (as compared to Options 1 and 3).
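
To illustrate the shared_data_len point from the first item, here is a toy
model (again not the real pgstat code: ToyIOEntry, ToyKindInfo, toy_fetch and
toy_serialize are hypothetical names). It only assumes what is stated above,
i.e. that one length per kind drives both the fetch and the serialization.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy entry mixing both kinds of data in one struct, as Option 1 would. */
typedef struct ToyIOEntry
{
	long		aggregated[2];	/* part we would want written to disk */
	long		per_backend[4]; /* part we would want kept in memory only */
} ToyIOEntry;

/* Toy kind description: a single length used for every copy. */
typedef struct ToyKindInfo
{
	size_t		shared_data_len;
} ToyKindInfo;

/* Copy an entry out for a reader, using the kind's single length. */
static void
toy_fetch(const ToyKindInfo *kind, const ToyIOEntry *src, void *dst)
{
	memcpy(dst, src, kind->shared_data_len);
}

/* Copy an entry into a buffer for writing, using the same length. */
static size_t
toy_serialize(const ToyKindInfo *kind, const ToyIOEntry *src, void *buf)
{
	memcpy(buf, src, kind->shared_data_len);
	return kind->shared_data_len;
}
```

If the length covers the whole struct, the per-backend part gets serialized
too. If it only covers the aggregated part, the per-backend part can no longer
be fetched. Getting both behaviors from one kind needs some extra mechanism,
which is the hack I'd like to avoid.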

Thoughts?

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
