Re: Log levels for checkpoint/bgwriter monitoring - Mailing list pgsql-hackers

From Greg Smith
Subject Re: Log levels for checkpoint/bgwriter monitoring
Date
Msg-id Pine.GSO.4.64.0703082321510.14630@westnet.com
Whole thread Raw
In response to Re: Log levels for checkpoint/bgwriter monitoring  (ITAGAKI Takahiro <itagaki.takahiro@oss.ntt.co.jp>)
Responses Re: Log levels for checkpoint/bgwriter monitoring  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: Log levels for checkpoint/bgwriter monitoring  (ITAGAKI Takahiro <itagaki.takahiro@oss.ntt.co.jp>)
Re: Log levels for checkpoint/bgwriter monitoring  (Jim Nasby <decibel@decibel.org>)
Re: Log levels for checkpoint/bgwriter monitoring  (Jim Nasby <decibel@decibel.org>)
List pgsql-hackers
On Fri, 9 Mar 2007, ITAGAKI Takahiro wrote:

> In my understanding, each backend pins two or so buffers at once. So 
> percentage of pinned buffers should be low.

With the pgbench workload, a substantial percentage of the buffer cache 
ends up pinned.  From staring at the buffer cache using 
contrib/pg_buffercache, I believe most of that consists of the index 
blocks for the records being updated in the accounts table.

I just posted a new version of the patch I asked for feedback on at the 
beginning of this thread, the latest one is at 
http://westnet.com/~gsmith/content/postgresql/new-patch-checkpoint.txt 
I've been adjusting it to monitor the same data I think you need to refine 
your patch.  I believe the approach you're taking includes some 
assumptions that seem perfectly reasonable, but that my testing doesn't 
agree with.  There's nothing like measuring something to settle what's 
really going on, though, so that's what I've been focusing on.  I'd love 
to get some feedback on whether other people can replicate the kind of 
things I'm seeing.

The new code generates statistics about exactly what the background writer 
scan found during each round.  If there's been substantial write activity, 
it prints a log line when it recycles back to the beginning of the all 
scan, to help characterize what the buffer pool looked like during that 
scan from the perspective of the bgwriter.  Here's some sample log output 
from my underpowered laptop while running pgbench:

bgwriter scan all writes=16.6 MB (69.3%) pinned=11.7 MB (48.8%) LRU=7.7 MB (31.9%)
...
checkpoint required (wrote checkpoint_segments)
checkpoint buffers dirty=19.4 MB (80.8%) write=188.9 ms sync=4918.1 ms

Here 69% of the buffer cache contained dirty data, and 49% of the cache 
was both pinned and dirty.  During that same time period, the LRU write 
also wrote out a fair amount of data, operating on the 20% of the cache 
that was dirty but not pinned.  On my production server, where the 
background writer is turned way up to reduce checkpoint times, these 
numbers are even more extreme; almost everything that's dirty is also 
pinned during pgbench, and the LRU is lucky to find anything it can write 
as a result.

That patch is against the 8.2 codebase; now that I'm almost done I'm 
planning to move it to HEAD instead soon (where it will conflict 
considerably with your patch).  If you have an 8.2 configuration you can 
test with my patch applied, set log_min_messages = debug2, try it out, and 
see what you get when running pgbench for a while.  I think you'll 
discover some interesting and unexpected things.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD


pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: RFC: changing autovacuum_naptime semantics
Next
From: Josh Berkus
Date:
Subject: Re: who gets paid for this