Hi Martijn, hi Greg,
thank you very much for your help. We finally got rid of these annoying
spikes.
First we tried to set
checkpoint_segments = 3 # before: 16
checkpoint_timeout = 5min # before: 60min
which didn't really help.
We had the same spikes, only more often. Then we tried lowering
dirty_writeback_centisecs from 500 to 100.
This helped a little bit, but we investigated the problem further.
We monitored "Dirty" memory from /proc/meminfo and saw very long query
durations (>5s, sometimes 10s and more) correlating with the kernel writing
out dirty buffers. Whenever we saw a massive reduction in "Dirty" memory, there
were spikes in the query duration. As long as the kernel didn't write out the
dirty memory, everything ran fine.
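For anyone who wants to watch the same value, the monitoring can be sketched
roughly like this in Python. This is only an illustration: the function names,
the sample count and the one-second interval are assumptions, not the script
that was actually used here.

```python
import time


def parse_dirty_kb(meminfo_text):
    """Return the 'Dirty' value in kB from /proc/meminfo-style text, or None."""
    for line in meminfo_text.splitlines():
        if line.startswith("Dirty:"):
            # The line looks like: "Dirty:              1234 kB"
            return int(line.split()[1])
    return None


def watch_dirty(samples=5, interval=1.0, path="/proc/meminfo"):
    """Sample the Dirty value a few times, print each reading, return them all."""
    readings = []
    for _ in range(samples):
        with open(path) as f:
            readings.append(parse_dirty_kb(f.read()))
        print(time.strftime("%H:%M:%S"), "Dirty:", readings[-1], "kB")
        time.sleep(interval)
    return readings
```

Calling watch_dirty() on a Linux box while the load is running and comparing
the timestamps with the slow-query log should show whether a sudden drop in
"Dirty" lines up with the slow queries.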
So finally we tried
echo 0 > /proc/sys/vm/dirty_background_ratio
and now everything runs very smoothly. We still see a few query durations over
2 seconds, but no more spikes of 5 or 10 seconds.
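To put the numbers in perspective, here is a small back-of-the-envelope
calculation in Python. It is a rough sketch only: the helper name is made up,
and it assumes the ratio applies to total RAM, whereas the kernel applies it
to free plus reclaimable memory, so the real threshold will be somewhat lower.

```python
def background_threshold_mb(ram_gb, ratio_percent):
    """Estimate how many MB of dirty pages may pile up before background
    writeback starts, treating the ratio as a percentage of total RAM."""
    return ram_gb * 1024 * ratio_percent / 100.0


# 12 GB of RAM, as on our machine
for ratio in (0, 0.5, 1, 10):
    print(ratio, "% ->", round(background_threshold_mb(12, ratio), 2), "MB")
```

With 12 GB, the 0.5% from the question quoted below would be about 61 MB, the
smallest non-zero whole-number setting (1%) already allows about 123 MB, and 0
means the kernel starts background writeback as soon as there is anything
dirty.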
The average response time of our Tomcat servers suddenly dropped from
300ms to 100ms. Great!!
We know that our limitation is cheap disks, but with dirty_background_ratio = 0
we really see big advantages and much better performance.
So, for future reference for other people reading this thread: I really
recommend trying this out.
Best regards
Janning
On Friday 11 June 2010 21:48:54 Martijn van Oosterhout wrote:
> On Thu, Jun 10, 2010 at 04:00:54PM -0400, Greg Smith wrote:
> >> 5. Does anybody know if I can set dirty_background_ratio to 0.5? As we
> >> have 12 GB RAM and rather slow disks, 0.5% would result in a maximum of
> >> 61MB dirty pages.
> >
> > Nope. Linux has absolutely terrible controls for this critical
> > performance parameter. The sort of multi-second spikes you're seeing
> > are extremely common and very difficult to get rid of.
>
> Another relevant parameter is /proc/sys/vm/dirty_writeback_centisecs.
> By default Linux only wakes up once every 5 seconds to check if there
> is stuff to write out. I have found that reducing this tends to smooth
> out bursty spikes. However, see:
>
> http://www.westnet.com/~gsmith/content/linux-pdflush.htm
>
> which indicates that the kernel may try to defeat you here...
>
> Have a nice day,