Re: Spinlocks, yet again: analysis and proposed patches - Mailing list pgsql-hackers
From:           Simon Riggs
Subject:        Re: Spinlocks, yet again: analysis and proposed patches
Date:
Msg-id:         1130830352.8300.1563.camel@localhost.localdomain
In response to: Re: Spinlocks, yet again: analysis and proposed patches (Mark Wong <markw@osdl.org>)
Responses:      Re: Spinlocks, yet again: analysis and proposed patches
List:           pgsql-hackers
On Mon, 2005-10-31 at 16:10 -0800, Mark Wong wrote:
> On Thu, 20 Oct 2005 23:03:47 +0100
> Simon Riggs <simon@2ndquadrant.com> wrote:
>
> > On Wed, 2005-10-19 at 14:07 -0700, Mark Wong wrote:
> > > >
> > > > This isn't exactly elegant coding, but it provides a useful improvement
> > > > on an 8-way SMP box when run on 8.0 base. OK, let's be brutal: this looks
> > > > pretty darn stupid. But it does follow the CPU optimization handbook
> > > > advice and I did see a noticeable improvement in performance and a
> > > > reduction in context switching.
> > > >
> > > > I'm not in a position to try this again now on 8.1beta, but I'd welcome
> > > > a performance test result from anybody that is. I'll supply a patch
> > > > against 8.1beta for anyone wanting to test this.
> > >
> > > Ok, I've produced a few results on a 4-way (8 core) POWER 5 system, which
> > > I've just set up and probably needs a bit of tuning. I don't see much
> > > difference, but I'm wondering if the cacheline sizes are dramatically
> > > different from Intel/AMD processors. I still need to take a closer look
> > > to make sure I haven't grossly mistuned anything, but I'll let everyone
> > > take a look:
> >
> > Well, the Power 5 architecture probably has the lowest overall memory
> > delay you can get currently, so in some ways that would negate the
> > effects of the patch. (Cacheline is still 128 bytes, AFAICS.) But it's
> > clear the patch isn't significantly better (like it was with 8.0 when we
> > tried this on the 8-way Itanium in Feb).
> >
> > > cvs 20051013
> > > http://www.testing.osdl.org/projects/dbt2dev/results/dev4-014/19/
> > > 2501 notpm
> > >
> > > cvs 20051013 w/ lw.patch
> > > http://www.testing.osdl.org/projects/dbt2dev/results/dev4-014/20/
> > > 2519 notpm
> >
> > Could you re-run with wal_buffers = 32? (Without patch) Thanks
>
> Ok, sorry for the delay. I've bumped up wal_buffers to 2048 and
> redid the disk layout. Here's where I'm at now:
>
> cvs 20051013
> http://www.testing.osdl.org/projects/dbt2dev/results/dev4-014/40/
> 3257 notpm
>
> cvs 20051013 w/ lw.patch
> http://www.testing.osdl.org/projects/dbt2dev/results/dev4-014/42/
> 3285 notpm
>
> Still not much of a difference with the patch. A quick glance over the
> iostat data suggests I'm still not i/o bound, but the i/o wait is rather
> high according to vmstat. Will try to see if there's anything else
> obvious to get the load up higher.

OK, that's fine. I'm glad there's some gain, but not much yet. I think
we should hold off on any more tests of lw.patch for now.

I'm concerned about the awful checkpointing. Can you bump wal_buffers to
8192, just to make sure? That's way too high, but just to prove it.

We need to reduce the number of blocks to be written at checkpoint:

bgwriter_all_maxpages   5 -> 15
bgwriter_all_percent    0.333
bgwriter_delay          200
bgwriter_lru_maxpages   5 -> 7
bgwriter_lru_percent    1

shared_buffers set lower, to 100000
(which should cause some amusement on-list)

Best Regards,
Simon Riggs
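For reference, here is a sketch of the proposed settings in postgresql.conf form. The parameter names and values are those from the message above (8.1-era names; bgwriter_all_* was later removed); the grouping and comments are illustrative, not part of the original advice:

    # WAL buffering: deliberately oversized, just to prove the point
    wal_buffers = 8192              # number of 8 kB WAL buffers in 8.1

    # Spread checkpoint writes by making the bgwriter more aggressive
    bgwriter_delay = 200            # ms between rounds (unchanged)
    bgwriter_all_percent = 0.333    # unchanged
    bgwriter_all_maxpages = 15      # raised from 5
    bgwriter_lru_percent = 1        # unchanged
    bgwriter_lru_maxpages = 7       # raised from 5

    # Lowered per the suggestion above (counted in 8 kB pages, so ~780 MB)
    shared_buffers = 100000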