On Thu, 03 Nov 2005 18:29:09 +0000
Simon Riggs <simon@2ndquadrant.com> wrote:
> On Thu, 2005-11-03 at 08:03 -0800, Mark Wong wrote:
> > On Tue, 01 Nov 2005 07:32:32 +0000
> > Simon Riggs <simon@2ndquadrant.com> wrote:
> > > Concerned about the awful checkpointing. Can you bump wal_buffers to
> > > 8192 just to make sure? That's way too high, but just to prove it.
> > >
> > > We need to reduce the number of blocks to be written at checkpoint.
> > >
> > > bgwriter_all_maxpages 5 -> 15
> > > bgwriter_all_percent 0.333
> > > bgwriter_delay 200
> > > bgwriter_lru_maxpages 5 -> 7
> > > bgwriter_lru_percent 1
> > >
> > > shared_buffers set lower to 100000
> > > (which should cause some amusement on-list)
> >
> >
> > Okay, here goes, all with the same source base w/ the lw.patch:
> >
> > http://www.testing.osdl.org/projects/dbt2dev/results/dev4-014/44/
> > only increased wal_buffers to 8192 from 2048
> > 3242 notpm
>
> That looks to me like a clear negative effect from increasing
> wal_buffers. Try putting it back down to 1024.
> Looks like we need to plug that gap.
>
> > http://www.testing.osdl.org/projects/dbt2dev/results/dev4-014/43/
> > only increased bgwriter_all_maxpages to 15, and bgwriter_lru_maxpages to 7
> > 3019 notpm (but more interesting graph)
>
> Man that sucks. What the heck is happening there? Hackers - if you're
> watching you should see this graph - it shows some very poor behaviour.
>
> I'm not happy with that performance at all.... any chance you could re-
> run that exact same test to see if we can get that repeatably?
>
> I see you have
> vm.dirty_writeback_centisecs = 0
>
> which pretty much means we aren't ever writing to disk by the pdflush
> daemons, even when the bgwriter is active.
>
> Could we set the bgwriter stuff back to default and try
> vm.dirty_writeback_centisecs = 500

http://www.testing.osdl.org/projects/dbt2dev/results/dev4-014/47/
3309 notpm

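For reference, a minimal sketch of the kernel setting requested above. It
assumes the value is made persistent through /etc/sysctl.conf; the thread
does not say how it was actually applied on the test box. 500 centiseconds
means pdflush wakes up every 5 seconds, while 0 turns the periodic
writeback off.

```
# /etc/sysctl.conf (assumed location; reload with `sysctl -p`)
# wake the pdflush daemons every 500 centiseconds (5 seconds)
vm.dirty_writeback_centisecs = 500
```
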
> > http://www.testing.osdl.org/projects/dbt2dev/results/dev4-014/45/
> > Same as the previously listed run with shared_buffers lowered to 10000
> > 2503 notpm
>
> Sorry, that was 100,000 not 10,000.

Oops!
http://www.testing.osdl.org/projects/dbt2dev/results/dev4-014/46/
2794 notpm

> Looks like we need dates on the log_line_prefix so we can check the
> logs.

Oops again! I didn't check to make sure I had set this correctly before
I ran the last two tests. I'll get on it.

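A minimal sketch of what that setting could look like in postgresql.conf.
The choice of %t (timestamp without milliseconds) is an assumption; the
thread does not say which prefix was eventually used.

```
# postgresql.conf
# assumed prefix: %t puts a timestamp at the start of each log line
log_line_prefix = '%t '
```
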
> ...not sure about the oprofile results. Seems to show CreateLWLocks
> being as high as xlog_insert, which is mad. Either that shows startup
> time is excessive, or it means the oprofile timing range is too short.
> Not sure which.

Yeah, we've seen this before. I think I'll have to try pulling the
oprofile cvs code to see if there's any improvement. I've been working
with oprofile-0.9.1.

Mark