Re: Report checkpoint progress with pg_stat_progress_checkpoint (was: Report checkpoint progress in server logs) - Mailing list pgsql-hackers
From | Andres Freund |
---|---|
Subject | Re: Report checkpoint progress with pg_stat_progress_checkpoint (was: Report checkpoint progress in server logs) |
Date | |
Msg-id | 20221116195232.wynvnlng2ivne76b@awork3.anarazel.de Whole thread Raw |
In response to | Re: Report checkpoint progress with pg_stat_progress_checkpoint (was: Report checkpoint progress in server logs) (Robert Haas <robertmhaas@gmail.com>) |
Responses |
Re: Report checkpoint progress with pg_stat_progress_checkpoint (was: Report checkpoint progress in server logs)
|
List | pgsql-hackers |
Hi, On 2022-11-16 14:19:32 -0500, Robert Haas wrote: > I have never really understood why we drive background writer or > checkpointer statistics through the statistics collector. To some degree it is required for durability - the stats system needs to know how to write out those stats. But that wasn't ever a good reason to send messages to the stats collector - it could just read the stats from shared memory after all. There's also integration with snapshots of the stats, resetting them, etc. There's also the complexity that some of the stats e.g. for checkpointer aren't about work the checkpointer did, but just have ended up there for historical raisins. E.g. the number of fsyncs and writes done by backends. See below: > Here again, for things like table statistics, there is no choice, because we > could have an unbounded number of tables and need to keep statistics about > all of them. The statistics collector can handle that by allocating more > memory as required. But there is only one background writer and only one > checkpointer, so that is not needed in those cases. Why not just have them > expose anything they want to expose through shared memory directly? That's how it is in 15+. The memory for "fixed-numbered" or "global" statistics are maintained by the stats system, but in plain shared memory, allocated at server start. Not via the hash table. Right now stats updates for the checkpointer use the "changecount" approach to updates. For now that makes sense, because we update the stats only occasionally (after a checkpoint or when writing in CheckpointWriteDelay()) - a stats viewer seeing the checkpoint count go up, without yet seeing the corresponding buffers written would be misleading. I don't think we'd want every buffer write or whatnot go through the changecount mechanism, on some non-x86 platforms that could be noticable. But if we didn't stage the stats updates locally I think we could make most of the stats changes without that overhead. For updates that just increment a single counter there's simply no benefit in the changecount mechanism afaict. I didn't want to do that change during the initial shared memory stats work, it already was bigger than I could handle... It's not quite clear to me what the best path forward is for buf_written_backend / buf_fsync_backend, which currently are reported via the checkpointer stats. I think the best path might be to stop counting them via the CheckpointerShmem->num_backend_writes etc and just populate the fields in the view (for backward compat) via the proposed [1] pg_stat_io patch. Doing that accounting with CheckpointerCommLock held exclusively isn't free. > If the statistics collector provides services that we care about, like > persisting data across restarts or making snapshots for transactional > behavior, then those might be reasons to go through it even for the > background writer or checkpointer. But if so, we should be explicit > about what the reasons are, both in the mailing list discussion and in > code comments. Otherwise I fear that we'll just end up doing something > in a more complicated way than is really necessary. I tried to provide at least some of that in the comments at the start of pgstat.c in 15+. There's very likely more that should be added, but I think it's a decent start. Greetings, Andres Freund [1] https://www.postgresql.org/message-id/CAOtHd0ApHna7_p6mvHoO%2BgLZdxjaQPRemg3_o0a4ytCPijLytQ%40mail.gmail.com
pgsql-hackers by date: