Re: per backend I/O statistics - Mailing list pgsql-hackers

From Bertrand Drouvot
Subject Re: per backend I/O statistics
Msg-id Zta482TuN7ri1Bis@ip-10-97-1-34.eu-west-3.compute.internal
In response to Re: per backend I/O statistics  (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
Responses Re: per backend I/O statistics
List pgsql-hackers
Hi,

On Tue, Sep 03, 2024 at 03:37:49PM +0900, Kyotaro Horiguchi wrote:
> At Mon, 2 Sep 2024 14:55:52 +0000, Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote in 
> > Hi hackers,
> > 
> > Please find attached a patch to implement $SUBJECT.
> > 
> > While pg_stat_io provides cluster-wide I/O statistics, this patch adds a new
> > pg_my_stat_io view to display "my" backend I/O statistics and a new
> > pg_stat_get_backend_io() function to retrieve the I/O statistics for a given
> > backend pid.
> > 
> 
> I'm not sure about the usefulness of having the stats only available
> from the current session. Since they are stored in shared memory,
> shouldn't we make them accessible to all backends?

Thanks for the feedback!

The stats are accessible to all backends thanks to 0002 and the introduction
of the pg_stat_get_backend_io() function.

> However, this would
> introduce permission considerations and could become complex.

Not sure that the data exposed here is sensitive enough to warrant permission
restrictions.

> When I first looked at this patch, my initial thought was whether we
> should let these stats stay "fixed." The reason why the current
> PGSTAT_KIND_IO is fixed is that there is only one global statistics
> storage for the entire database. If we have stats for a flexible
> number of backends, it would need to be non-fixed, perhaps with the
> entry for INVALID_PROC_NUMBER storing the global I/O stats, I
> suppose. However, one concern with that approach would be the impact
> on performance due to the frequent creation and deletion of stats
> entries caused by high turnover of backends.
>

The pros of using the fixed amount are:

- less code change (I think, as I did not write the non-fixed counterpart)
- probably better performance and less scalability impact (in case of a high rate
of backend creation/deletion)

The con is probably allocating shared memory space that might not be used
(sizeof(PgStat_IO) is 16392, so that could be a concern for a high number of
unused connections). OTOH, if a high number of connections is not used, it might
be worth reducing the max_connections setting.
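
To put rough numbers on that cost, here is a minimal sketch of the arithmetic.
It assumes one PgStat_IO-sized slot per possible backend and uses the 16392
bytes quoted above; the helper names are hypothetical and not part of the
patch, and the real slot count (max_connections plus auxiliary and worker
processes) is left as an input.

```c
#include <stddef.h>

/* sizeof(PgStat_IO) as reported in this message; it may differ between builds. */
#define PGSTAT_IO_BYTES ((size_t) 16392)

/*
 * Hypothetical helper: bytes of shared memory reserved when one
 * PgStat_IO-sized slot is allocated for each possible backend.
 */
size_t
per_backend_io_shmem_bytes(size_t max_backends)
{
    return max_backends * PGSTAT_IO_BYTES;
}

/*
 * Hypothetical helper: bytes reserved for slots that no backend is
 * currently using.
 */
size_t
unused_io_shmem_bytes(size_t max_backends, size_t active_backends)
{
    if (active_backends >= max_backends)
        return 0;
    return (max_backends - active_backends) * PGSTAT_IO_BYTES;
}
```

Under these assumptions, 100 slots reserve 1,639,200 bytes (about 1.6 MiB) and
5000 slots reserve 81,960,000 bytes (about 78 MiB), whether or not the
connections are ever used.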

"Conceptually" speaking, we know what the maximum number of backends is, so I
think that using the fixed amount approach makes sense (somehow I think it can
be compared to PGSTAT_KIND_SLRU, which relies on SLRU_NUM_ELEMENTS).
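
As an illustration of the idea only, the fixed amount approach can be sketched
as an array sized by a known upper bound and indexed by a backend slot number.
This is a simplified stand-in, not the patch's code: DEMO_MAX_BACKENDS,
DemoBackendIO and the demo_* functions are all hypothetical names, and locking
is omitted.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical upper bound on the number of backends, known at startup. */
#define DEMO_MAX_BACKENDS 8

/* Hypothetical, heavily simplified stand-in for per-backend I/O counters. */
typedef struct DemoBackendIO
{
    uint64_t reads;
    uint64_t writes;
} DemoBackendIO;

/* One slot per possible backend, allocated once; never created or dropped. */
static DemoBackendIO demo_backend_io[DEMO_MAX_BACKENDS];

static int
demo_slot_is_valid(int proc_number)
{
    return proc_number >= 0 && proc_number < DEMO_MAX_BACKENDS;
}

/* Backend start: reuse the existing slot and zero it. Returns 0, or -1 if out of range. */
int
demo_backend_io_init(int proc_number)
{
    if (!demo_slot_is_valid(proc_number))
        return -1;
    memset(&demo_backend_io[proc_number], 0, sizeof(DemoBackendIO));
    return 0;
}

/* Count one read for the given slot. Returns 0, or -1 if out of range. */
int
demo_count_read(int proc_number)
{
    if (!demo_slot_is_valid(proc_number))
        return -1;
    demo_backend_io[proc_number].reads++;
    return 0;
}

/* Count one write for the given slot. Returns 0, or -1 if out of range. */
int
demo_count_write(int proc_number)
{
    if (!demo_slot_is_valid(proc_number))
        return -1;
    demo_backend_io[proc_number].writes++;
    return 0;
}

/* Read-only access to a slot; NULL if the slot number is out of range. */
const DemoBackendIO *
demo_backend_io_get(int proc_number)
{
    if (!demo_slot_is_valid(proc_number))
        return NULL;
    return &demo_backend_io[proc_number];
}
```

The point of the sketch is that backend start and exit only reset a slot that
already exists, so a high rate of backend creation/deletion does not translate
into creating and dropping stats entries.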

> Just to be clear, the above comments are not meant to oppose the
> current implementation approach. They are purely for the sake of
> discussing comparisons with other possible approaches.

No problem at all, thanks for your feedback and sharing your thoughts!

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com


