Re: per backend I/O statistics - Mailing list pgsql-hackers

From Bertrand Drouvot
Subject Re: per backend I/O statistics
Date
Msg-id ZwOvzX219kDhEEMu@ip-10-97-1-34.eu-west-3.compute.internal
Whole thread Raw
In response to Re: per backend I/O statistics  (Alvaro Herrera <alvherre@alvh.no-ip.org>)
List pgsql-hackers
Hi,

On Fri, Sep 20, 2024 at 01:26:49PM +0900, Michael Paquier wrote:
> On Tue, Sep 17, 2024 at 01:56:34PM +0000, Bertrand Drouvot wrote:
> > No problem at all! (I re-explained because I'm not always 100% sure that my
> > explanations are crystal clear ;-) )
> 
> We've discussed a bit this patch offline, but after studying the patch
> I doubt that this part is a good idea:
> 
> +    /* has to be at the end due to FLEXIBLE_ARRAY_MEMBER */
> +    PgStatShared_IO io;
>  } PgStat_ShmemControl;
> 
> We are going to be in trouble if we introduce a second member in this
> routine that has a FLEXIBLE_ARRAY_MEMBER, because PgStat_ShmemControl
> relies on the fact that all its members after deterministic offset
> positions in this structure.

Agree that it would be an issue should we have to add a new FLEXIBLE_ARRAY_MEMBER.

> So this lacks flexibility.  This choice
> is caused by the fact that we don't exactly know the number of
> backends because that's controlled by the PGC_POSTMASTER GUC
> max_connections so the size of the structure would be undefined.

Right.

> There is a parallel with replication slot statistics here, where we
> save the replication slot data in the dshash based on their index
> number in shmem.  Hence, wouldn't it be better to do like replication
> slot stats, where we use the dshash and a key based on the procnum of
> each backend or auxiliary process (ProcNumber in procnumber.h)?  If at
> restart max_connections is lower than what was previously used, we
> could discard entries that would not fit anymore into the charts.
> This is probably not something that happens often, so I'm not really
> worried about forcing the removal of these stats depending on how the
> upper-bound of ProcNumber evolves.

Yeah, I'll look at implementing the dshash based on their procnum and see
where it goes.

> So, using a new kind of ID and making this kind variable-numbered may
> ease the implementation quite a bit, while avoiding any extensibility
> issues with the shmem portion of the patch if these are
> fixed-numbered.

Agree.

> This would rely on the fact that we would use the ProcNumber for the
> dshash key, and this information is not provided in pg_stat_activity.
> Perhaps we should add this information in pg_stat_activity so as it
> would be easily possible to do joins with a SQL function that returns
> a SRF with all the stats associated with a given connection slot
> (auxiliary or backend process)?

I'm not sure that's needed. What has been done in the previous versions is
to get the stats based on the pid (see pg_stat_get_backend_io()) where the
procnumber is retrieved with something like GetNumberFromPGProc(BackendPidGetProc(pid)).

Do you see an issue with that approach?

> The active PIDs of the live sessions are not stored in the active
> stats, why not? 

That was not needed. We can still retrieve the stats based on the pid thanks
to something like GetNumberFromPGProc(BackendPidGetProc(pid)) without having
to actually store the pid in the stats. I think that's fine because the pid
only matters at "display" time (pg_stat_get_backend_io()).

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



pgsql-hackers by date:

Previous
From: Bertrand Drouvot
Date:
Subject: Re: per backend I/O statistics
Next
From: Daniel Gustafsson
Date:
Subject: Re: Commit fest 2024-09