Re: per backend WAL statistics - Mailing list pgsql-hackers
From | Bertrand Drouvot |
---|---|
Subject | Re: per backend WAL statistics |
Date | |
Msg-id | Z8VzqpBkj6ja5+EN@ip-10-97-1-34.eu-west-3.compute.internal Whole thread Raw |
In response to | Re: per backend WAL statistics (Michael Paquier <michael@paquier.xyz>) |
Responses |
Re: per backend WAL statistics
Re: per backend WAL statistics |
List | pgsql-hackers |
Hi, On Mon, Mar 03, 2025 at 10:48:23AM +0900, Michael Paquier wrote: > On Fri, Feb 28, 2025 at 09:26:08AM +0000, Bertrand Drouvot wrote: > > Also attaching the patch I mentioned up-thread to address some of Rahila's > > comments ([1]): It adds a AuxiliaryPidGetProc() call in pgstat_fetch_stat_backend_by_pid() > > and pg_stat_reset_backend_stats(). I think that fully makes sense since a051e71e28a > > modified pgstat_tracks_backend_bktype() for B_WAL_RECEIVER, B_WAL_SUMMARIZER > > and B_WAL_WRITER. > > Okay by me as it makes the code automatically more flexible if > pgstat_tracks_backend_bktype() gets tweaked, including the call of > pgstat_flush_backend() in pgstat_report_wal() so as the WAL writer is > able to report backend stats for its WAL I/O. Applied this part as of > 3f1db99bfabb. Thanks! > Something that's still not quite right is that the WAL receiver and > the WAL summarizer do not call pgstat_report_wal() at all, so we don't > report much data and we expect these processes to run continuously. > The location where to report stats for the WAL summarizer is simple, > even if the system is aggressive with WAL this is never called more > than a couple of times per seconds, like the WAL writer: > > @@ -1541,6 +1542,10 @@ summarizer_read_local_xlog_page(XLogReaderState *state, > * so we don't tight-loop. > */ > HandleWalSummarizerInterrupts(); > + > + /* report pending statistics to the cumulative stats system */ > + pgstat_report_wal(false); > + > summarizer_wait_for_wal(); > > At this location, the WAL summarizer would wait as there is no data to > read. The hot path is when we're reading a block. Did not look closely enough but that sounds right after a quick look. > The WAL receiver is a different story, because the WaitLatchOrSocket() > call in the main loop of WalReceiverMain() is *very* aggressive, and > it's easy to reach this code dozens of times each millisecond. In > short, we need to be careful, I think, based on how this is currently > written. My choice is then this path: > --- a/src/backend/replication/walreceiver.c > +++ b/src/backend/replication/walreceiver.c > @@ -583,6 +583,10 @@ WalReceiverMain(const void *startup_data, size_t startup_data_len) > */ > bool requestReply = false; > > + /* report pending statistics to the cumulative stats system */ > + pgstat_report_wal(false); > + > /* > * Check if time since last receive from prim > > This would update the stats only when the WAL receiver has nothing to > do or if wal_receiver_status_interval is reached, so we're not going > to storm pgstats with updates, still we get some data on a periodic > basis *because* wal_receiver_status_interval would make sure that the > path is taken even if we're under a lot of WAL pressure when sending > feedback messages back to the WAL sender. Of course this needs a > pretty good comment explaining the choice of this location. What do > you think? Same as above, that sounds right after a quick look. > > It looks like it does not need doc updates. Attached as 0002 as it's somehow > > un-related to this thread (but not sure it deserves it's dedicated thread though). > > I'm wondering if we should not lift more the list of processes listed > in pgstat_tracks_backend_bktype() and include B_AUTOVAC_LAUNCHER, > B_STARTUP, B_CHECKPOINTER, B_BG_WRITER at this stage, removing the > entire paragraph. Not sure if we really have to do that for this > release, but we could look at that separately. hm, do you mean update the comment on top of pgstat_tracks_backend_bktype() or update the documentation? > With 3f1db99bfabb in place, wouldn't it be simpler to update > pgstat_report_wal() in v12-0001 so as we use PGSTAT_BACKEND_FLUSH_ALL > with one call of pgstat_flush_backend()? This saves one call for each > stats flush. hmm, that would work as long as PGSTAT_BACKEND_FLUSH_ALL represents things that need to be called from pgstat_report_wal(). But I think that's open door for issue should be add a new PGSTAT_BACKEND_FLUSH_XXX where XXX is not related to pgstat_report_wal() at all. So, I'm tempted to keep it as it is. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
pgsql-hackers by date: