Re: Log levels for checkpoint/bgwriter monitoring - Mailing list pgsql-hackers
From | Bruce Momjian |
---|---|
Subject | Re: Log levels for checkpoint/bgwriter monitoring |
Date | |
Msg-id | 200703271557.l2RFvNl28519@momjian.us Whole thread Raw |
In response to | Log levels for checkpoint/bgwriter monitoring (Greg Smith <gsmith@gregsmith.com>) |
Responses |
Re: Log levels for checkpoint/bgwriter monitoring
|
List | pgsql-hackers |
Is this patch ready? --------------------------------------------------------------------------- Greg Smith wrote: > I have a WIP patch that adds the main detail I have found I need to > properly tune checkpoint and background writer activity. I think it's > almost ready to submit (you can see the current patch against 8.2 at > http://www.westnet.com/~gsmith/content/postgresql/patch-checkpoint.txt ) > after making it a bit more human-readable. But I've realized that along > with that, I need some guidance in regards to what log level is > appropriate for this information. > > An example works better than explaining what the patch does: > > 2007-02-19 21:53:24.602 EST - DEBUG: checkpoint required (wrote > checkpoint_segments) > 2007-02-19 21:53:24.685 EST - DEBUG: checkpoint starting > 2007-02-19 21:53:24.705 EST - DEBUG: checkpoint flushing buffer pool > 2007-02-19 21:53:24.985 EST - DEBUG: checkpoint database fsync starting > 2007-02-19 21:53:42.725 EST - DEBUG: checkpoint database fsync complete > 2007-02-19 21:53:42.726 EST - DEBUG: checkpoint buffer flush dirty=8034 > write=279956 us sync=17739974 us > > Remember that "Load distributed checkpoint" discussion back in December? > You can see exactly how bad the problem is on your system with this log > style (this is from a pgbench run where it's postively awful--it really > does take over 17 seconds for the fsync to execute, and there are clients > that are hung the whole time waiting for it). > > I also instrumented the background writer. You get messages like this: > > 2007-02-19 21:58:54.328 EST - DEBUG: BGWriter Scan All - Written = 5/5 > Unscanned = 23/54 > > This shows that we wrote (5) the maximum pages we were allowed to write > (5) while failing to scan almost half (23) of the buffers we meant to look > at (54). By taking a look at this output while the system is under load, > I found I was able to do bgwriter optimization that used to take me days > of frustrating testing in hours. I've been waiting for a good guide to > bgwriter tuning since 8.1 came out. Once you have this, combined with > knowing how many buffers were dirty at checkpoint time because the > bgwriter didn't get to them in time (the number you want to minimize), the > tuning guide practically writes itself. > > So my question is...what log level should all this go at? Right now, I > have the background writer stuff adjusting its level dynamically based on > what happened; it logs at DEBUG2 if it hits the write limit (which should > be a rare event once you're tuned properly), DEBUG3 for writes that > scanned everything they were supposed to, and DEBUG4 if it scanned but > didn't find anything to write. The source of checkpoint information logs > at DEBUG1, the fsync/write info at DEBUG2. > > I'd like to move some of these up. On my system, I even have many of the > checkpoint messages logged at INFO (the source of the checkpoint and the > total write time line). It's a bit chatty, but when you get some weird > system pause issue it makes it easy to figure out if checkpoints were to > blame. Given how useful I feel some of these messages are to system > tuning, and to explaining what currently appears as inexplicable pauses, I > don't want them to be buried at DEBUG levels where people are unlikely to > ever see them (I think some people may be concerned about turning on > things labeled DEBUG at all). I am aware that I am too deep into this to > have an unbiased opinion at this point though, which is why I ask for > feedback on how to proceed here. > > -- > * Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD > > ---------------------------(end of broadcast)--------------------------- > TIP 7: You can help support the PostgreSQL project by donating at > > http://www.postgresql.org/about/donate -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
pgsql-hackers by date: