Re: per backend I/O statistics - Mailing list pgsql-hackers

From Bertrand Drouvot
Subject Re: per backend I/O statistics
Msg-id ZwOvVnmVt0PjWDjh@ip-10-97-1-34.eu-west-3.compute.internal
In response to Re: per backend I/O statistics  (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
List pgsql-hackers
Hi,

On Fri, Sep 20, 2024 at 12:53:43PM +0900, Michael Paquier wrote:
> On Wed, Sep 04, 2024 at 04:45:24AM +0000, Bertrand Drouvot wrote:
> > On Tue, Sep 03, 2024 at 04:07:58PM +0900, Kyotaro Horiguchi wrote:
> >> As an additional benefit of this approach, the client can set a
> >> connection variable, for example, no_backend_iostats to true, or set
> >> its inverse variable to false, to restrict memory usage to only the
> >> required backends.
> > 
> > Thanks for the feedback!
> > 
> > If we were to add an on/off switch button, I think I'd vote for a global one
> > instead. Indeed, I see this feature more like an "Administrator" one, where
> > the administrator wants to be able to find out which session is responsible for
> > what (from an I/O point of view): like being able to answer "which session is
> > generating this massive amount of reads"?
> > 
> > If we allow each session to disable the feature then the administrator
> > would lose this ability.
> 
> Hmm, I've been studying this patch,

Thanks for looking at it!

> and I am not completely sure to
> agree with this feeling of using fixed-numbered stats, actually, after
> reading the whole and seeing the structure of the patch
> (FLEXIBLE_ARRAY_MEMBER is a new way to handle the fact that we don't
> know exactly the number of slots we need to know for the
> fixed-numbered stats as MaxBackends may change).

Right, that's a new way of dealing with an "unknown" number of slots (and it has
drawbacks, as you mentioned in [1]).
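
For readers following along, here is a minimal sketch of the general technique being discussed: a header struct followed by an array whose length is only known at run time. It is not the patch's code. The struct and function names are hypothetical, and it uses the plain C99 flexible array member syntax with calloc() instead of PostgreSQL's FLEXIBLE_ARRAY_MEMBER macro and shared memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical per-backend counters; the patch's real struct differs. */
typedef struct BackendIOCounters
{
    unsigned long reads;
    unsigned long writes;
} BackendIOCounters;

/* Fixed header followed by a run-time-sized array of slots. */
typedef struct BackendIOStatsArray
{
    int               nslots;
    BackendIOCounters slots[];  /* C99 flexible array member */
} BackendIOStatsArray;

/* Bytes needed for the header plus "nslots" slots. */
static size_t
backend_io_stats_size(int nslots)
{
    assert(nslots >= 0);
    return offsetof(BackendIOStatsArray, slots)
        + (size_t) nslots * sizeof(BackendIOCounters);
}

/* Allocate a zeroed array; returns NULL on allocation failure. */
static BackendIOStatsArray *
backend_io_stats_alloc(int nslots)
{
    BackendIOStatsArray *a = calloc(1, backend_io_stats_size(nslots));

    if (a != NULL)
        a->nslots = nslots;
    return a;
}
```

The point of the sketch is only that the allocation size has to be computed from a value such as MaxBackends at run time, which is what makes this different from the usual fixed-numbered stats layout.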

> If we make these
> kind of stats variable-numbered, does it have to actually involve many
> creations or removals of the stats entries, though?  One point is that
> the number of entries to know about is capped by max_connections,
> which is a PGC_POSTMASTER.  That's the same kind of control as
> replication slots.  So one approach would be to reuse entries in the
> dshash and use in the hashing key the number in the procarrays.  If a
> new connection spawns and reuses a slot that was used in the past,
> then reset all the existing fields and assign its PID.

Yeah, like it's done currently with the "fixed-numbered" stats proposal. That
sounds reasonable to me. I'll look at this proposed approach and come back with
a new patch version, thanks!
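
As a rough illustration of the "reuse the slot, reset the fields, assign the PID" idea quoted above, the sketch below keys entries by a slot number. It deliberately swaps the dshash for a plain static array so that it stays self-contained. MAX_SLOTS, BackendIOEntry and backend_io_attach() are made-up names, not anything from the patch or from PostgreSQL.

```c
#include <assert.h>
#include <string.h>

#define MAX_SLOTS 8     /* stand-in for a cap derived from max_connections */

/* Hypothetical entry, keyed by the backend's slot number. */
typedef struct BackendIOEntry
{
    int           pid;      /* 0 means the slot was never used */
    unsigned long reads;
    unsigned long writes;
} BackendIOEntry;

static BackendIOEntry entries[MAX_SLOTS];

/*
 * Called when a new connection takes over slot "procno": wipe whatever
 * the previous owner accumulated, then record the new owner's PID.
 */
static BackendIOEntry *
backend_io_attach(int procno, int pid)
{
    BackendIOEntry *e;

    assert(procno >= 0 && procno < MAX_SLOTS);
    e = &entries[procno];
    memset(e, 0, sizeof(*e));
    e->pid = pid;
    return e;
}
```

Because the number of slots is capped up front, no entry is ever created or dropped after startup in this sketch. A reconnect only costs a reset of one existing entry.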

> Another thing is the consistency of the data that we'd like to keep at
> shutdown.  If the connections have a balanced amount of stats shared
> among them, doing decision-making based on them is kind of easy.  But
> that may cause confusion if the activity is unbalanced across the
> sessions.  We could also not flush them to disk as an option, but it
> still seems more useful to me to save this data across restarts if one
> takes frequent snapshots of the new system view reporting everything,
> so as it is possible to get an idea of the deltas across the snapshots
> for each connection slot.

The idea that has been implemented so far in this patch is that we still maintain
an aggregated version of the stats (visible through pg_stat_io) and that only the
aggregated stats are flushed/read to/from disk (meaning we don't flush the
per-backend stats).

I think that it makes sense that way. The way I see it is that the per-backend
I/O stats are more for current activity instrumentation. So it's not clear to me
what the benefits of restoring the per-backend stats at startup would be, knowing
that: 1) we restored the aggregated stats and 2) the sessions that were responsible
for the restored stats are gone.
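
To spell out that behavior, here is a toy model of it: every I/O is counted both in a per-backend slot and in an aggregate, only the aggregate is saved, and after a restore the per-backend slots start from zero. It is a sketch under simplifying assumptions (an in-memory copy stands in for the stats file, and all names are hypothetical), not the patch's implementation.

```c
#include <assert.h>
#include <string.h>

#define NUM_BACKENDS 4  /* hypothetical number of backend slots */

typedef struct IOCounters
{
    unsigned long reads;
    unsigned long writes;
} IOCounters;

static IOCounters aggregate;                  /* the pg_stat_io-like totals */
static IOCounters per_backend[NUM_BACKENDS];  /* live, per-session view */

/* Count "n" reads for one backend and in the aggregate. */
static void
count_read(int slot, unsigned long n)
{
    assert(slot >= 0 && slot < NUM_BACKENDS);
    per_backend[slot].reads += n;
    aggregate.reads += n;
}

/* "Shutdown": only the aggregate goes into the saved image. */
static void
stats_save(IOCounters *saved)
{
    *saved = aggregate;
}

/* "Startup": restore the aggregate; per-backend slots start empty. */
static void
stats_restore(const IOCounters *saved)
{
    aggregate = *saved;
    memset(per_backend, 0, sizeof(per_backend));
}
```

After a save/restore cycle in this model the totals survive, while the per-backend numbers describe only sessions that connected since the restart.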

[1]: https://www.postgresql.org/message-id/Zuz5iQ4AjcuOMx_w%40paquier.xyz

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com


