Thread: bgwriter statistics
Now that we've got a nice amount of tuneability in the bgwriter, it would be nice if we had as much insight into how it's actually doing. I'd like to propose that the following info be added to the stats framework to assist in tuning it:

bgwriter_rounds - number of rounds that have run
bgwriter_lru_percent_scanned - total pages of the LRU end of the buffer pool scanned on each round
bgwriter_lru_pages_written - total LRU pages written
bgwriter_all_percent_scanned - total pages scanned for all of the buffer pool
bgwriter_all_pages_written - total pages written for all the buffer pool

To clarify: the 'all' statistics should correspond to the bgwriter_all_percent and bgwriter_all_maxpages GUCs.

Unfortunately, the above information doesn't tell you why (on average) the bgwriter is stopping each of its scans (i.e., did it hit the scan percentage limit, or did it hit the pages written limit). I think the next two would provide insight into that:

bgwriter_lru_scan_limit_hit - number of rounds where bgwriter_lru_percent was hit
bgwriter_all_scan_limit_hit - ditto for bgwriter_all_percent

Finally, the real reason for bgwriter's existence is to prevent the need to write many pages out during a checkpoint, so to monitor that:

checkpoint_timeouts - number of times we've hit checkpoint_timeout
checkpoint_timeout_pages - number of pages written during timeout checkpoints
checkpoint_segment_overflow, checkpoint_segment_pages - same thing, but for checkpoints that were forced because we ran out of WAL files

I suppose for completeness' sake we should add stats_bgwriter and stats_checkpoint GUCs, though I don't see any issue with just leaving these turned on unless there are some folks out there running very low bgwriter_delay settings.

Also, I'm wondering if it would be useful to have a way to reset just these statistics. If you're tuning things, you'll want to be able to see what effect your changes are having, which is a bit difficult without resetting the counters unless you've got them feeding into MRTG or something. But resetting all the counters would be rather bad if you're using autovacuum.

Comments?
--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

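The tuning workflow in the message above (seeing the effect of a change without resetting anything) can be approximated by sampling the cumulative counters and subtracting. A minimal sketch in Python, assuming the proposed counters exist and are available as a plain dict. The dict form and the helpers `counter_delta` and `scan_limit_ratio` are invented for illustration and are not part of any PostgreSQL interface:

```python
def counter_delta(before, after):
    """Return the per-counter increase between two snapshots of cumulative counters."""
    return {name: after[name] - before[name] for name in after if name in before}


def scan_limit_ratio(delta, which="lru"):
    """Fraction of rounds in the interval that stopped at the scan percentage limit.

    The remaining rounds ended for another reason, such as the pages-written limit.
    """
    rounds = delta["bgwriter_rounds"]
    if rounds == 0:
        return None  # no rounds ran in this interval
    return delta[f"bgwriter_{which}_scan_limit_hit"] / rounds


# Hypothetical snapshots taken before and after a tuning change.
before = {"bgwriter_rounds": 1000, "bgwriter_lru_scan_limit_hit": 900,
          "bgwriter_lru_pages_written": 5000}
after = {"bgwriter_rounds": 1500, "bgwriter_lru_scan_limit_hit": 1000,
         "bgwriter_lru_pages_written": 9000}

delta = counter_delta(before, after)
print(delta)
print(scan_limit_ratio(delta))
```

Because only differences are used, the underlying counters never need to be reset, which sidesteps the autovacuum concern.
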
"Jim Nasby" <jnasby@pervasive.com> wrote
> Now that we've got a nice amount of tuneability in the bgwriter, it
> would be nice if we had as much insight into how it's actually doing.
> I'd like to propose that the following info be added to the stats
> framework to assist in tuning it:
>

In general, I think it is a good idea to add more statistics to the backend. As for the stats you proposed, if we record enough information about each bgwriter BgBufferSync round, we can get everything you need. This information would be:

- round
- bgwriter_lru_percent_scanned/bgwriter_lru_percent (we need the limit too, because it can change on SIGHUP)
- bgwriter_lru_pages_written/bgwriter_lru_maxpages
- bgwriter_all_percent_written/bgwriter_all_percent
- bgwriter_all_maxpages_written/bgwriter_all_maxpages
- start time, end time

From the above items, you will know how many rounds you've done and why the bgwriter stopped in each one. For checkpoints, we can have similar numbers.

Besides the bgwriter and checkpoint stats you mentioned, we may also need some information on buffer pool write stats. This is because another use of the bgwriter is to ensure that buffers that will be recycled soon are clean when needed. I think we already have a counter for each relation, but not a total across all of them. There are two ways to do it: one is to do some maths over all the relations to calculate it; the other is to maintain a loose counter in shared memory. With the latter, we can get the stats at any time, even if stats collection is not enabled. So I prefer the second.

Regards,
Qingqing

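To make the per-round idea above concrete, here is a rough Python sketch of such a record and of how the stop reason could be read off by comparing each value with its limit. The class name, field names, and reason labels are assumptions made up for this example; nothing here reflects actual backend code:

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class RoundRecord:
    """One BgBufferSync round for one scan type (LRU or ALL); names are illustrative."""
    round_no: int
    percent_scanned: float  # share of the pool actually scanned this round
    percent_limit: float    # bgwriter_*_percent in effect (may change on SIGHUP)
    pages_written: int      # pages actually written this round
    maxpages_limit: int     # bgwriter_*_maxpages in effect
    start_time: float
    end_time: float

    def stop_reason(self):
        """Classify why the scan ended by comparing each value with its limit."""
        # If both limits were reached in the same round, this reports the write limit.
        if self.pages_written >= self.maxpages_limit:
            return "write_limit"
        if self.percent_scanned >= self.percent_limit:
            return "scan_limit"
        return "other"


# Hypothetical records for three rounds.
rounds = [
    RoundRecord(1, 1.0, 1.0, 3, 5, 0.00, 0.01),
    RoundRecord(2, 0.4, 1.0, 5, 5, 0.20, 0.22),
    RoundRecord(3, 1.0, 1.0, 0, 5, 0.40, 0.41),
]
summary = Counter(r.stop_reason() for r in rounds)
print(summary)
```

Storing the limit next to the measured value is what keeps old records interpretable after a configuration reload.
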
On 2006-06-02 21:26, Jim Nasby wrote:
> Now that we've got a nice amount of tuneability in the bgwriter, it
> would be nice if we had as much insight into how it's actually doing.
> I'd like to propose that the following info be added to the stats
> framework to assist in tuning it:

I'm interested in your idea. You want to know what bgwriter does. I think there is also another perspective: what bgwriter *should* do. I imagine that information about whether pages are dirty would be useful for that purpose.

- dirty_pages: The number of pages with BM_DIRTY in the buffer pool.
- replaced_dirty: Total replaced pages with BM_DIRTY. Backends have to write these pages themselves.
- replaced_clean: Same as above, but without BM_DIRTY. Backends can replace them freely.

Bgwriter should boost ALL activity if dirty_pages is high, and boost LRU activity if replaced_dirty is high. Ideally, the parameters of bgwriter could be tuned almost automatically:

- LRU scans = replaced_dirty + replaced_clean
- LRU writes = replaced_dirty
- ALL scans/writes = the value that can keep dirty_pages low

However, tracking the number of dirty pages is not free. I suppose the implementation should be carefully designed to avoid lock contention.

Comments are welcome.

---
ITAGAKI Takahiro
NTT OSS Center
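The LRU part of the rule of thumb above could be sketched roughly as follows. This is only an illustration in Python: the function name is invented, and dividing the interval totals by the number of bgwriter rounds to get per-round targets is an assumption of the sketch, not something stated in the message. Converting the page counts into bgwriter_lru_percent would additionally need the buffer pool size, which is left out here.

```python
import math


def suggest_lru_targets(replaced_dirty, replaced_clean, rounds):
    """Turn replacement counts over an interval into per-round LRU targets.

    Scans should cover every replaced page; writes should cover the dirty ones.
    Returns (pages to scan per round, pages to write per round).
    """
    if rounds <= 0:
        raise ValueError("rounds must be positive")
    scans_per_round = math.ceil((replaced_dirty + replaced_clean) / rounds)
    writes_per_round = math.ceil(replaced_dirty / rounds)
    return scans_per_round, writes_per_round


# Hypothetical interval: 300 dirty and 1700 clean replacements over 50 rounds.
print(suggest_lru_targets(300, 1700, 50))
```

Rounding up errs on the side of the bgwriter doing slightly more work, so that backends are less likely to find a dirty page at replacement time.
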