Re: per backend I/O statistics - Mailing list pgsql-hackers
From | Bertrand Drouvot |
---|---|
Subject | Re: per backend I/O statistics |
Date | |
Msg-id | Z0QjeIkwC0HNI16K@ip-10-97-1-34.eu-west-3.compute.internal |
In response to | Re: per backend I/O statistics (Bertrand Drouvot <bertranddrouvot.pg@gmail.com>) |
List | pgsql-hackers |
Hi,

On Mon, Nov 25, 2024 at 10:06:44AM +0900, Michael Paquier wrote:
> On Fri, Nov 22, 2024 at 07:49:58AM +0000, Bertrand Drouvot wrote:
> > On Fri, Nov 22, 2024 at 10:36:29AM +0900, Michael Paquier wrote:
> >> Hmm. created_entry only matters for pgstat_init_function_usage().
> >> All the other callers of pgstat_prep_pending_entry() pass a NULL
> >> value.
> >
> > I meant to say all the calls that pass "create" as true in pgstat_get_entry_ref().
>
> Ah, OK, I think that I see your point here.
>
> I am wondering how much this would matter as well for custom stats,
> but we're not there yet without at least one release out and folks try
> new things with these APIs and variable-numbered kinds.

Not sure here, could custom stats start incrementing before the database system
is ready to accept connections?

> pgstat_prep_pending_entry() to return NULL even if "create" is true
> may be a good thing, at the end, because that's the only way I can see
> based on the current APIs where we could say "Sorry, but the stats
> have not been loaded yet, so you cannot try to do anything related to
> the dshash".

Yeah, same here.

> From my view having a kind of barrier would be cleaner in the long
> run, but it's true that it may not be mandatory, as well. pg_stat_io
> is currently OK to be called because the stats are loaded for
> auxiliary processes because it uses fixed-numbered stats in shmem.
> And it means we already have early calls that add stats getting
> overwritten once the stats are loaded from the on-disk file (Am I
> getting this part right?).

Yeah, we can already see that, for example, the background writer could enter
pgstat_io_flush_cb() before the stats are reset or restored.

> Anyway, do we really require that for the sake of this thread? We
> know that there's only one of each auxiliary process at a time, and
> they keep a footprint in pg_stat_io already. So we could just limit
> ourselves to live database backends, WAL senders and autovacuum
> workers, everything that's not auxiliary and spawned on request?

I think that's a fair starting point and that we will not lose any information
by doing so (as you said, there is only one of each auxiliary process at a time,
so one can already see their stats in pg_stat_io).

The only con that I can see is that we will not be able to merge the flush
callbacks, but I don't think that's a blocker (the flushes are done in shared
memory, so the impact on performance should not be much of an issue).

I'll come back with a new version implementing the above.

[1]: https://www.postgresql.org/message-id/Zz9sno%2BJJbWqdXhQ%40ip-10-97-1-34.eu-west-3.compute.internal

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
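
For readers following the thread, here is a minimal standalone sketch of the two ideas discussed above: restricting per-backend I/O stats to processes that are not auxiliary, and letting the "prepare pending entry" step return NULL while the stats are not yet loaded. It is not PostgreSQL code and not the patch itself. Every identifier prefixed with `sketch_` or `SKETCH_` is hypothetical, and the enum is a simplified stand-in for the real backend type list.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the server's backend type enum. */
typedef enum SketchBackendType
{
    SKETCH_B_BACKEND,           /* regular client backend */
    SKETCH_B_WAL_SENDER,
    SKETCH_B_AUTOVAC_WORKER,
    SKETCH_B_BG_WRITER,         /* auxiliary: one at a time */
    SKETCH_B_CHECKPOINTER,      /* auxiliary: one at a time */
    SKETCH_B_STARTUP            /* auxiliary: one at a time */
} SketchBackendType;

/* Hypothetical pending-stats entry; the real structure is richer. */
typedef struct SketchPendingEntry
{
    long        writes;
} SketchPendingEntry;

/* Assumption: a flag set once stats are restored from disk or reset. */
bool        sketch_stats_loaded = false;

static SketchPendingEntry sketch_entry;

/*
 * Per-backend I/O stats only for processes spawned on request. Auxiliary
 * processes are skipped because each already has its own row in pg_stat_io.
 */
bool
sketch_tracks_backend_io(SketchBackendType type)
{
    switch (type)
    {
        case SKETCH_B_BACKEND:
        case SKETCH_B_WAL_SENDER:
        case SKETCH_B_AUTOVAC_WORKER:
            return true;
        default:
            return false;
    }
}

/*
 * The alternative raised in the thread: return NULL, even when the caller
 * wants an entry created, if the stats have not been loaded yet.
 */
SketchPendingEntry *
sketch_prep_pending_entry(SketchBackendType type)
{
    if (!sketch_stats_loaded || !sketch_tracks_backend_io(type))
        return NULL;
    return &sketch_entry;
}
```

With the type filter alone, a background writer that reaches its flush path early never touches a per-backend entry, which is why the barrier may not be mandatory for this thread. The NULL return is shown only to illustrate the other option; callers would have to check for it before counting anything.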