Re: Spinlocks, yet again: analysis and proposed patches - Mailing list pgsql-hackers

From Mark Wong
Subject Re: Spinlocks, yet again: analysis and proposed patches
Msg-id 200511042110.jA4LAWnO009682@smtp.osdl.org
In response to Re: Spinlocks, yet again: analysis and proposed patches  (Simon Riggs <simon@2ndquadrant.com>)
List pgsql-hackers
On Thu, 03 Nov 2005 18:29:09 +0000
Simon Riggs <simon@2ndquadrant.com> wrote:

> On Thu, 2005-11-03 at 08:03 -0800, Mark Wong wrote:
> > On Tue, 01 Nov 2005 07:32:32 +0000
> > Simon Riggs <simon@2ndquadrant.com> wrote:
> > > Concerned about the awful checkpointing. Can you bump wal_buffers to
> > > 8192 just to make sure? That's way too high, but just to prove it.
> > > 
> > > We need to reduce the number of blocks to be written at checkpoint.
> > > 
> > >  bgwriter_all_maxpages   5      ->  15
> > >  bgwriter_all_percent    0.333
> > >  bgwriter_delay          200  
> > >  bgwriter_lru_maxpages   5        ->  7
> > >  bgwriter_lru_percent    1
> > > 
> > >  shared_buffers         set lower to 100000
> > >  (which should cause some amusement on-list)
> > 
> > 
> > Okay, here goes, all with the same source base w/ the lw.patch:
> > 
> > http://www.testing.osdl.org/projects/dbt2dev/results/dev4-014/44/
> > only increased wal_buffers to 8192 from 2048
> > 3242 notpm
> 
> That looks to me like a clear negative effect from increasing
> wal_buffers. Try putting it back down to 1024.
> Looks like we need to plug that gap.
> 
> > http://www.testing.osdl.org/projects/dbt2dev/results/dev4-014/43/
> > only increased bgwriter_all_maxpages to 15, and bgwriter_lru_maxpages to 7
> > 3019 notpm (but more interesting graph)
> 
> Man that sucks. What the heck is happening there? Hackers - if you're
> watching you should see this graph - it shows some very poor behaviour.
> 
> I'm not happy with that performance at all.... any chance you could re-
> run that exact same test to see if we can get that repeatably?
> 
> I see you have 
> vm.dirty_writeback_centisecs = 0
> 
> which pretty much means we aren't ever writing to disk by the pdflush
> daemons, even when the bgwriter is active.
> 
> Could we set the bgwriter stuff back to default and try 
> vm.dirty_writeback_centisecs = 500

http://www.testing.osdl.org/projects/dbt2dev/results/dev4-014/47/
3309 notpm

> > http://www.testing.osdl.org/projects/dbt2dev/results/dev4-014/45/
> > Same as the previously listed run with shared_buffers lowered to 10000
> > 2503 notpm
> 
> Sorry, that was 100,000 not 10,000. 

Oops!
http://www.testing.osdl.org/projects/dbt2dev/results/dev4-014/46/
2794 notpm

> Looks like we need dates on the log_line_prefix so we can check the
> logs.

Oops again!  I didn't check to make sure I had set this correctly before
I ran the last two tests.  I'll get on it.

> ...not sure about the oprofile results. Seems to show CreateLWLocks
> being as high as xlog_insert, which is mad. Either that shows startup
> time is excessive, or it means the oprofile timing range is too short.
> Not sure which.

Yeah, we've seen this before.  I think I'll have to try pulling the
oprofile cvs code to see if there's any improvement.  I've been working
with oprofile-0.9.1.

Mark
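
For readers comparing the runs reported above, here is a minimal sketch that tabulates the notpm figures from this message against the best of them. The run numbers and throughputs are copied from the thread. The one-line setting descriptions are paraphrases, and those for runs 46 and 47 are assumptions inferred from the surrounding replies, not settings confirmed in the message. The helper name relative_to_best is hypothetical.

```python
# notpm figures for each DBT-2 run, copied from the message above.
# Descriptions are paraphrased; runs 46 and 47 are ASSUMED from context
# (the thread does not restate their full settings).
runs = {
    43: ("bgwriter_all_maxpages=15, bgwriter_lru_maxpages=7", 3019),
    44: ("wal_buffers raised from 2048 to 8192", 3242),
    45: ("as run 43, shared_buffers lowered to 10000", 2503),
    46: ("assumed: shared_buffers=100000", 2794),
    47: ("assumed: default bgwriter, vm.dirty_writeback_centisecs=500", 3309),
}


def relative_to_best(results):
    """Return each run's notpm as a percentage of the best run's notpm."""
    best = max(notpm for _, notpm in results.values())
    return {
        run: round(100.0 * notpm / best, 1)
        for run, (_, notpm) in results.items()
    }


if __name__ == "__main__":
    percentages = relative_to_best(runs)
    for run in sorted(runs):
        description, notpm = runs[run]
        print(f"run {run}: {notpm} notpm ({percentages[run]}% of best) - {description}")
```

This compares only the single figure reported per run. It says nothing about run-to-run variance, which is why the request above to repeat run 43 still matters.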

