On Fri, Nov 18, 2011 at 12:45 PM, Kevin Grittner
<Kevin.Grittner@wicourts.gov> wrote:
> OK. Sorry for misunderstanding that. I haven't gotten around to a
> deep reading of the patch yet. :-( I based this on the test script
> you posted here (with slight modifications for my preferred
> directory structures):
>
> http://archives.postgresql.org/pgsql-hackers/2011-10/msg00605.php
>
> If I just drop the -S switch will I have a good test, or are there
> other adjustments I should make (besides increasing checkpoint
> segments)? (Well, for the SELECT-only test I didn't bother putting
> pg_xlog on a separate RAID 10 on its own BBU controller as we
> normally would for this machine, I'll cover that, too.)

Yeah, I'd just drop -S. Make sure to use -c N -j N with pgbench, or
you probably won't be able to saturate the server. I've also had good
luck with wal_writer_delay=20ms, although if you have
synchronous_commit=on that might not matter, and it's much less
important since Simon's recent patch in that area went in.

What scale factor are you testing at?
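
To make the -c N -j N advice concrete, here is a rough sketch of a
helper that builds the pgbench command line with one worker thread per
client. The helper name, the 300-second run length (-T), and the
database name "pgbench" are placeholders I made up for illustration,
not values from the test script; adjust them to your setup.

```python
def pgbench_cmd(clients, seconds=300, dbname="pgbench", select_only=False):
    """Build a pgbench argv that uses one worker thread per client.

    seconds and dbname are illustrative defaults, not values taken
    from the original test script.
    """
    # -c N -j N: N client connections driven by N pgbench threads, so
    # that pgbench itself is less likely to be the bottleneck.
    cmd = ["pgbench", "-c", str(clients), "-j", str(clients),
           "-T", str(seconds)]
    if select_only:
        # -S is the SELECT-only mode; leave it off for the write test.
        cmd.append("-S")
    cmd.append(dbname)
    return cmd


if __name__ == "__main__":
    # Print the command for a 32-client read-write run.
    print(" ".join(pgbench_cmd(32)))
```

The returned list can be handed to subprocess.run() as-is. Any
wal_writer_delay or synchronous_commit changes would still go in
postgresql.conf separately.
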
>> It doesn't make any sense for PostgreSQL master to be using only
>> 50% of the CPU and leaving the rest idle on a lots-of-clients
>> SELECT-only test. That could easily happen on 9.1, but my lock
>> manager changes eliminated the only place where anything gets put
>> to sleep in that path (except for the emergency sleeps done by
>> s_lock, when a spinlock is really badly contended). So I'm
>> confused by these results. Are we sure that the processes are
>> being scheduled across all 32 physical cores?
>
> I think so. My take was that it was showing 32 of 64 *threads*
> active -- the hyperthreading funkiness. Is there something in
> particular you'd like me to check?

Not really; I just don't understand the number.

>> At any rate, I do think it's likely that you're being bitten by
>> spinlock contention, but we'd need to do some legwork to verify
>> that and work out the details. Any chance you can run oprofile
>> (on either branch, don't really care) against the 32 client test
>> and post the results? If it turns out s_lock is at the top of the
>> heap, I can put together a patch to help figure out which spinlock
>> is the culprit.
>
> oprofile isn't installed on this machine. I'll take care of that
> and post results when I can.

OK.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company