On Mon, 11 May 2009, Dimitri wrote:
> I've tried to reduce checkpoint timeout from 5min to 30sec - it helped,
> throughput is more stable now, but instead of big waves I have now short
> waves anyway..
Tuning for very tiny checkpoints all of the time is one approach here.
The other is to push checkpoint_segments (already done in your case),
checkpoint_timeout, and checkpoint_completion_target as high as you
can, in order to spread each checkpoint over as much time as
possible. Reducing shared_buffers can also help in both cases; you've set
that to an extremely high value.
http://www.westnet.com/~gsmith/content/postgresql/chkp-bgw-83.htm is a
long discussion of just this topic. If you saw a serious change by
adjusting checkpoint_timeout, then further experimentation in this area is
likely to help you out.
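
As a rough illustration of the "spread it out" approach, here is a minimal
Python sketch of how long a checkpoint tries to take when it is triggered
by checkpoint_timeout. It assumes the simple model that the write phase
targets checkpoint_timeout * checkpoint_completion_target seconds; a
checkpoint triggered by running out of checkpoint_segments is paced
against segment consumption instead, which this ignores. The function
name and the sample settings are made up for the example.

```python
def checkpoint_write_window(timeout_s: float, completion_target: float) -> float:
    """Approximate seconds over which a timeout-driven checkpoint spreads its writes.

    Assumption: window = checkpoint_timeout * checkpoint_completion_target.
    Checkpoints forced by checkpoint_segments are not modeled here.
    """
    if timeout_s <= 0:
        raise ValueError("checkpoint_timeout must be positive")
    if not 0.0 <= completion_target <= 1.0:
        raise ValueError("checkpoint_completion_target must be between 0 and 1")
    return timeout_s * completion_target


if __name__ == "__main__":
    # Hypothetical settings: (checkpoint_timeout in seconds, completion target).
    for timeout_s, target in [(30, 0.5), (300, 0.5), (900, 0.9)]:
        window = checkpoint_write_window(timeout_s, target)
        print(f"timeout={timeout_s}s target={target} -> writes spread over ~{window:.0f}s")
```

The point of the comparison is only that a longer timeout and a higher
completion target give the same amount of dirty data much more time to
trickle out; whether that helps still depends on what the OS does with
those writes.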
You might also want to look at the filesystem parameters you're using
under Solaris. ZFS in particular can cache more writes than you may
expect, which can lead to them all getting pushed out at the very end of
checkpoint time. That may very well be the source of your "waves". On a
system with 64GB of RAM, for all we know *every* write you're doing between
checkpoints is being buffered until the fsyncs at the checkpoint end.
There were a couple of sessions at PG East last year that mentioned this
area; I put a summary of suggestions and links to more detail at
http://notemagnet.blogspot.com/2008/04/conference-east-08-and-solaris-notes.html
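
To put a number on the worst case described above, where the OS holds
every write until the fsyncs at checkpoint end, here is a small Python
sketch. The 20 MB/s write rate is purely hypothetical (I don't know your
actual rate), and the model assumes nothing at all is flushed early,
which is the pessimistic guess rather than a measured fact.

```python
def worst_case_buffered_mb(write_rate_mb_per_s: float, checkpoint_interval_s: float) -> float:
    """MB that could pile up in the OS cache if nothing is flushed before the fsyncs.

    Assumption: every write between checkpoints stays cached until checkpoint end.
    """
    if write_rate_mb_per_s < 0 or checkpoint_interval_s < 0:
        raise ValueError("rate and interval must be non-negative")
    return write_rate_mb_per_s * checkpoint_interval_s


if __name__ == "__main__":
    HYPOTHETICAL_RATE_MB_PER_S = 20.0  # made-up figure, substitute a measured one
    for interval_s in (300, 30):  # the 5min and 30sec timeouts from this thread
        backlog = worst_case_buffered_mb(HYPOTHETICAL_RATE_MB_PER_S, interval_s)
        print(f"interval={interval_s}s -> up to {backlog:.0f} MB flushed at fsync time")
```

Under this model the backlog scales linearly with the checkpoint
interval, which would be consistent with the big waves turning into
short waves when the timeout went from 5min to 30sec.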
--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD