On Tue, Jun 21, 2005 at 12:00:56PM -0700, Josh Berkus wrote:
> Folks,
>
> Going over some performance test results at OSDL, our single greatest
> performance issue seems to be checkpointing. No matter how I fiddle
> with it, checkpoints seem to cost us 1/2 of our throughput while they're
> taking place. Overall, checkpointing costs us about 25% of our
> performance on OLTP workloads.
>
> Example: http://khack.osdl.org/stp/302671/results/0/
>
> Can we break down everything that happens during a checkpoint so that we
> can see where this huge cost is coming from? Checkpointing should be
> limited to fsyncing to disk and marking WAL files as recyclable, but there
> seems to be something more.

Not only do you have to fsync the files; you have to write them out
first as well. If the bgwriter is not able to keep up, then at checkpoint
time there is a lot of writing to do. One idea is to fiddle with the
bgwriter settings, or did you do that already? I see this for the URL above:

 bgwriter_delay    | 200
 bgwriter_maxpages | 100
 bgwriter_percent  | 1

Maybe it should be more aggressive.
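
To put rough numbers on "more aggressive", here is a back-of-the-envelope
sketch in Python. It assumes the 8.0 meaning of these settings as I
understand it: each round writes at most bgwriter_percent of the currently
dirty buffers (rounded up), capped at bgwriter_maxpages, and then sleeps
bgwriter_delay milliseconds. The function names and the dirty-buffer counts
are made up for illustration, and the time spent doing the writes themselves
is ignored, so treat the result as an upper bound rather than a measurement.

```python
import math


def pages_per_round(dirty_buffers, percent, maxpages):
    """Upper bound on pages the bgwriter writes in one round (assumed 8.0 semantics)."""
    # bgwriter_percent: share of the currently dirty buffers, rounded up.
    by_percent = math.ceil(dirty_buffers * percent / 100)
    # bgwriter_maxpages: hard cap per round.
    return min(by_percent, maxpages)


def pages_per_second(dirty_buffers, delay_ms, percent, maxpages):
    """Upper bound on pages written per second, ignoring time spent writing."""
    rounds_per_second = 1000 / delay_ms
    return pages_per_round(dirty_buffers, percent, maxpages) * rounds_per_second


# Settings from the test run above; 5000 dirty buffers is a hypothetical figure.
rate = pages_per_second(dirty_buffers=5000, delay_ms=200, percent=1, maxpages=100)
print(rate)
```

With those settings the percent limit is the one that bites until there are
10000 or more dirty buffers; beyond that the maxpages cap holds the bgwriter
to 100 pages per round, i.e. 500 pages per second at most. If the workload
dirties pages faster than that, the backlog lands on the checkpoint.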

Another thing to blame is the dump-whole-pages-after-checkpoint
business: the first change to each page after a checkpoint logs the
whole page to WAL. Maybe the load you are seeing does not fall entirely
during the checkpoint, but right after it as well. How do you tell from
the results that the checkpoint is complete?

--
Alvaro Herrera (<alvherre[a]surnet.cl>)
"El miedo atento y previsor es la madre de la seguridad" (E. Burke)