Thread: bgwriter statistics

bgwriter statistics

From
Jim Nasby
Date:
Now that we've got a nice amount of tuneability in the bgwriter, it  
would be nice if we had as much insight into how it's actually doing.  
I'd like to propose that the following info be added to the stats  
framework to assist in tuning it:

bgwriter_rounds - number of rounds that have run
bgwriter_lru_percent_scanned - total pages of the LRU end of the  
buffer pool scanned on each round
bgwriter_lru_pages_written - total LRU pages written
bgwriter_all_percent_scanned - total pages scanned for all of the  
buffer pool
bgwriter_all_pages_written - total pages written for all the buffer pool

To clarify: the 'all' statistics should correspond to the
bgwriter_all_percent and bgwriter_all_maxpages GUCs.

Unfortunately, the above information doesn't tell you why (on
average) the bgwriter is stopping each of its scans (i.e., did it hit
the scan percentage limit, or did it hit the pages written limit). I
think the next two would provide insight into that:

bgwriter_lru_scan_limit_hit - number of rounds where  
bgwriter_lru_percent was hit
bgwriter_all_scan_limit_hit - ditto for bgwriter_all_percent
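
As a rough illustration of how these counters could be read together, here is a minimal sketch in Python (not PostgreSQL code). The class BgwriterStats, its methods, and the sample rounds are hypothetical; only the counter names come from the list above.

```python
from dataclasses import dataclass


@dataclass
class BgwriterStats:
    """Cumulative counters named after the proposal (hypothetical container)."""
    bgwriter_rounds: int = 0
    bgwriter_lru_pages_written: int = 0
    bgwriter_lru_scan_limit_hit: int = 0

    def record_round(self, lru_pages_written: int, lru_scan_limit_hit: bool) -> None:
        # Called once per bgwriter round in this sketch.
        self.bgwriter_rounds += 1
        self.bgwriter_lru_pages_written += lru_pages_written
        if lru_scan_limit_hit:
            self.bgwriter_lru_scan_limit_hit += 1

    def lru_scan_limit_fraction(self) -> float:
        # Share of rounds in which the LRU scan ended at the scan percentage limit.
        if self.bgwriter_rounds == 0:
            return 0.0
        return self.bgwriter_lru_scan_limit_hit / self.bgwriter_rounds


stats = BgwriterStats()
# Made-up rounds: (pages written, whether the scan percentage limit ended the scan)
for written, hit in [(5, False), (5, False), (2, True), (0, True)]:
    stats.record_round(written, hit)

print(stats.lru_scan_limit_fraction())
```

A fraction close to 1 would suggest the scan percentage is usually what stops the LRU scan; a fraction close to 0 would point at the pages written limit instead.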

Finally, the real reason for bgwriter's existence is to prevent the  
need to write many pages out during a checkpoint, so to monitor that:

checkpoint_timeouts - number of times we've hit checkpoint_timeout
checkpoint_timeout_pages - number of pages written during timeout  
checkpoints

checkpoint_segment_overflow, checkpoint_segment_pages - same thing,  
but for checkpoints that were forced because we ran out of WAL files

I suppose for completeness' sake we should add stats_bgwriter and
stats_checkpoint GUCs, though I don't see any issue with just
leaving these turned on unless there are some folks out there running
very low bgwriter_delay settings. Also, I'm wondering if it would be
useful to have a way to reset just these statistics. If you're tuning
things, you'll want to be able to see what effect your changes are
having, which is a bit difficult without resetting the counters unless
you've got them feeding into MRTG or something. But resetting all the
counters would be rather bad if you're using autovacuum.
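
For what it's worth, the "feed them into MRTG" alternative amounts to diffing two snapshots of the cumulative counters instead of resetting them. A minimal sketch in Python follows; the function counter_delta and the sample values are made up for illustration, and the counter names are the proposed ones, not existing statistics.

```python
def counter_delta(before: dict, after: dict) -> dict:
    """Per-counter change between two snapshots of cumulative counters."""
    return {name: after[name] - before[name] for name in after if name in before}


# Hypothetical snapshots taken before and after a tuning change.
before = {"bgwriter_rounds": 1000, "bgwriter_lru_pages_written": 4200}
after = {"bgwriter_rounds": 1300, "bgwriter_lru_pages_written": 4800}

delta = counter_delta(before, after)
pages_per_round = delta["bgwriter_lru_pages_written"] / delta["bgwriter_rounds"]
print(delta, pages_per_round)
```

This gives the effect of a change over the interval without touching the counters that autovacuum relies on.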

Comments?
--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461




Re: bgwriter statistics

From
"Qingqing Zhou"
Date:
"Jim Nasby" <jnasby@pervasive.com> wrote
> Now that we've got a nice amount of tuneability in the bgwriter, it 
> would be nice if we had as much insight into how it's actually doing. 
> I'd like to propose that the following info be added to the stats 
> framework to assist in tuning it:
>

In general, I think it is a good idea to add more statistics to the
backend. As for the stats you proposed, if we record enough
information about each bgwriter BgBufferSync round, we can derive
everything you need. That information would be:
   - round
   - bgwriter_lru_percent_scanned/bgwriter_lru_percent (we need it
     because of SIGHUP)
   - bgwriter_lru_pages_written/bgwriter_lru_maxpages
   - bgwriter_all_percent_written/bgwriter_all_percent
   - bgwriter_all_maxpages_written/bgwriter_all_maxpages
   - start time, end time
 

From the above items, you will know how many rounds have run and
why the bgwriter stopped in each one. For checkpoints, we can have
similar numbers.
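
A minimal sketch in Python of how such a per-round record could be used to find the stop reason for the LRU scan. The RoundRecord type, the lru_stop_reason function, and the sample values are hypothetical; the fields only mirror the scanned/limit pairs listed above, with the limit stored per round because it can change between rounds.

```python
from dataclasses import dataclass


@dataclass
class RoundRecord:
    round: int
    lru_percent_scanned: float  # what the round actually scanned
    lru_percent: float          # limit in effect for this round
    lru_pages_written: int      # what the round actually wrote
    lru_maxpages: int           # limit in effect for this round


def lru_stop_reason(rec: RoundRecord) -> str:
    """Classify why the LRU scan of one round ended (sketch only)."""
    if rec.lru_pages_written >= rec.lru_maxpages:
        return "maxpages"
    if rec.lru_percent_scanned >= rec.lru_percent:
        return "percent"
    return "other"


records = [
    RoundRecord(1, 1.0, 1.0, 2, 5),  # scanned the full percentage, wrote few pages
    RoundRecord(2, 0.4, 1.0, 5, 5),  # hit the pages written limit early
]
reasons = [lru_stop_reason(r) for r in records]
print(reasons)
```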

Besides the bgwriter stats and checkpoint stats you mentioned, we
may also need some information on buffer pool write stats. This is
because another use of the bgwriter is to ensure that buffers that
will be recycled soon are clean when needed. I think we already have
a counter for each relation, but not a total across all of them. There
are two ways to do it: one is to sum over all the relations, the other
is to maintain an approximate (lossy) counter in shared memory. With
the latter, we can get the stats at any time, even if stats collection
is not enabled. So I prefer the second.

Regards,
Qingqing 




Re: bgwriter statistics

From
ITAGAKI Takahiro
Date:
On 2006-06-02 21:26, Jim Nasby wrote:

> Now that we've got a nice amount of tuneability in the bgwriter, it
> would be nice if we had as much insight into how it's actually doing.
> I'd like to propose that the following info be added to the stats
> framework to assist in tuning it:

I'm interested in your idea. You want to know what bgwriter does.
I think there is also another perspective: what bgwriter *should* do.
I imagine that information about whether pages are dirty would be
useful for that purpose.

- dirty_pages: The number of pages with BM_DIRTY in the buffer pool.
- replaced_dirty: Total replaced pages with BM_DIRTY.
  Backends should write the pages themselves.
- replaced_clean: Same as above, but without BM_DIRTY.
  Backends can replace them freely.

Bgwriter should boost ALL activity if dirty_pages is high,
and boost LRU activity if replaced_dirty is high.
Ideally, the parameters of bgwriter could be tuned almost automatically:

- LRU scans = replaced_dirty + replaced_clean
- LRU writes = replaced_dirty
- ALL scans/writes = the value that can keep dirty_pages low


However, tracking the number of dirty pages is not free. I suppose
the implementation would need careful design to avoid lock contention.

Comments are welcome.

---
ITAGAKI Takahiro
NTT OSS Center