Re: Spinlocks, yet again: analysis and proposed patches - Mailing list pgsql-hackers

From Simon Riggs
Subject Re: Spinlocks, yet again: analysis and proposed patches
Msg-id 1130830352.8300.1563.camel@localhost.localdomain
In response to Re: Spinlocks, yet again: analysis and proposed patches  (Mark Wong <markw@osdl.org>)
Responses Re: Spinlocks, yet again: analysis and proposed patches
List pgsql-hackers
On Mon, 2005-10-31 at 16:10 -0800, Mark Wong wrote:
> On Thu, 20 Oct 2005 23:03:47 +0100
> Simon Riggs <simon@2ndquadrant.com> wrote:
> 
> > On Wed, 2005-10-19 at 14:07 -0700, Mark Wong wrote:
> > > > 
> > > > This isn't exactly elegant coding, but it provides a useful improvement
> > > > on an 8-way SMP box when run on 8.0 base. OK, let's be brutal: this looks
> > > > pretty darn stupid. But it does follow the CPU optimization handbook
> > > > advice and I did see a noticeable improvement in performance and a
> > > > reduction in context switching.
> > 
> > > > I'm not in a position to try this again now on 8.1beta, but I'd welcome
> > > > a performance test result from anybody that is. I'll supply a patch
> > > > against 8.1beta for anyone wanting to test this.
> > > 
> > > Ok, I've produced a few results on a 4 way (8 core) POWER 5 system, which
> > > I've just set up and probably needs a bit of tuning.  I don't see much
> > > difference but I'm wondering if the cacheline sizes are dramatically
> > > different from Intel/AMD processors.  I still need to take a closer look
> > > to make sure I haven't grossly mistuned anything, but I'll let everyone
> > > take a look:
> > 
> > Well, the Power 5 architecture probably has the lowest overall memory
> > delay you can get currently so in some ways that would negate the
> > effects of the patch. (Cacheline is still 128 bytes, AFAICS). But it's
> > clear the patch isn't significantly better (like it was with 8.0 when we
> > tried this on the 8-way Itanium in Feb).
> > 
> > > cvs 20051013
> > > http://www.testing.osdl.org/projects/dbt2dev/results/dev4-014/19/
> > > 2501 notpm
> > > 
> > > cvs 20051013 w/ lw.patch
> > > http://www.testing.osdl.org/projects/dbt2dev/results/dev4-014/20/
> > > 2519 notpm
> > 
> > Could you re-run with wal_buffers = 32 ? (Without patch) Thanks
> 
> Ok, sorry for the delay.  I've bumped up the wal_buffers to 2048 and
> redid the disk layout.  Here's where I'm at now:
> 
> cvs 20051013
> http://www.testing.osdl.org/projects/dbt2dev/results/dev4-014/40/
> 3257 notpm
> 
> cvs 20051013 w/ lw.patch
> http://www.testing.osdl.org/projects/dbt2dev/results/dev4-014/42/
> 3285 notpm
> 
> Still not much of a difference with the patch.  A quick glance over the
> iostat data suggests I'm still not i/o bound, but the i/o wait is rather
> high according to vmstat.  Will try to see if there's anything else
> obvious to get the load up higher.

OK, that's fine. I'm glad there's some gain, though not much yet. I think we
should hold off on any more tests of lw.patch for now.
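
For a sense of scale, the notpm figures quoted above can be turned into
relative gains. This is only a sketch of the arithmetic: the numbers come
from Mark's runs, while the function name and labels are illustrative and
not part of any benchmark tooling.

```python
def percent_gain(baseline, patched):
    """Relative change from baseline to patched throughput, in percent."""
    return (patched - baseline) / baseline * 100.0


# notpm figures quoted in this thread: (cvs 20051013, cvs 20051013 w/ lw.patch)
runs = {
    "first run (results 19 vs 20)": (2501, 2519),
    "after retuning (results 40 vs 42)": (3257, 3285),
}

for label, (baseline, patched) in runs.items():
    print(f"{label}: {percent_gain(baseline, patched):.2f}% gain")
```

Both runs come out below one percent, which is consistent with "some
gain, but not much".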

I'm concerned about the awful checkpointing. Can you bump wal_buffers to
8192 just to make sure? That's way too high, but just to prove the point.

We need to reduce the number of blocks to be written at checkpoint.

bgwriter_all_maxpages   5  ->  15
bgwriter_all_percent    0.333
bgwriter_delay          200
bgwriter_lru_maxpages   5  ->  7
bgwriter_lru_percent    1

shared_buffers          set lower to 100000
                        (which should cause some amusement on-list)
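
To put the page counts above into memory terms, here is a rough
conversion. It assumes both wal_buffers and shared_buffers are counted in
pages of the default 8 kB block size; that block size is an assumption
here, not something stated in this thread, and the helper name is
illustrative.

```python
BLOCK_SIZE_BYTES = 8192  # assumed default block size (8 kB); not stated in the thread


def pages_to_mib(pages, block_size=BLOCK_SIZE_BYTES):
    """Convert a setting expressed in pages to mebibytes."""
    return pages * block_size / (1024 * 1024)


# Values mentioned in this thread, in pages
settings = {
    "wal_buffers (requested earlier)": 32,
    "wal_buffers (Mark's run)": 2048,
    "wal_buffers (proposed)": 8192,
    "shared_buffers (proposed)": 100000,
}

for name, pages in settings.items():
    print(f"{name}: {pages} pages = {pages_to_mib(pages):.2f} MiB")
```

Under that assumption, wal_buffers = 8192 is 64 MiB of WAL buffer space
and shared_buffers = 100000 is roughly 781 MiB.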

Best Regards, Simon Riggs


