On Fri, Nov 18, 2011 at 12:03 PM, Kevin Grittner
<Kevin.Grittner@wicourts.gov> wrote:
>> Then again, is this a regular pgbench test or is this SELECT-only?
>
> SELECT-only
Ah, OK. I would not expect flexlocks to help with that; Pavan's patch
might, though.
>> Can you by any chance check top or vmstat during the 32-client
>> test and see what percentage you have of user time/system
>> time/idle time?
>
> You didn't say whether you wanted master or flexlock, but it turned
> out that any difference was way too far into the noise to show.
> They both looked like this:
>
> procs --------------memory------------- ---swap-- -----io---- ----system---- -----cpu------
>  r  b swpd    free      buff     cache  si so bi bo    in      cs us sy id wa st
> 38  0  352 1157400 207177020 52360472    0  0  0 16 13345 1190230 40  7 53  0  0
> 37  0  352 1157480 207177020 52360472    0  0  0  0 12953 1263310 40  8 52  0  0
> 36  0  352 1157484 207177020 52360472    0  0  0  0 13411 1233365 38  7 54  0  0
> 37  0  352 1157476 207177020 52360472    0  0  0  0 12780 1193575 41  7 51  0  0
>
> Keep in mind that while there are really 32 cores, the cpu
> percentages seem to be based on the "threads" from hyperthreading.
> Top showed pgbench (running on the same machine) as eating a pretty
> steady 5.2 of the cores, leaving 26.8 cores to actually drive the 32
> postgres processes.
It doesn't make any sense for PostgreSQL master to be using only 50%
of the CPU and leaving the rest idle on a SELECT-only test with that
many clients.
test. That could easily happen on 9.1, but my lock manager changes
eliminated the only place where anything gets put to sleep in that
path (except for the emergency sleeps done by s_lock, when a spinlock
is really badly contended). So I'm confused by these results. Are we
sure that the processes are being scheduled across all 32 physical
cores?
At any rate, I do think it's likely that you're being bitten by
spinlock contention, but we'd need to do some legwork to verify that
and work out the details. Any chance you can run oprofile (on either
branch, don't really care) against the 32 client test and post the
results? If it turns out s_lock is at the top of the heap, I can put
together a patch to help figure out which spinlock is the culprit.
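In case it saves you some digging, a legacy oprofile session typically
looks roughly like this (the old opcontrol interface; the postgres
binary path and pgbench database name here are assumptions for your
setup, and newer kernels use operf/perf instead):

```shell
# Requires root and the oprofile package; --no-vmlinux skips kernel symbols.
opcontrol --init
opcontrol --setup --no-vmlinux
opcontrol --start

# The workload under test: 32-client SELECT-only pgbench run.
pgbench -S -c 32 -j 32 -T 300 pgbench

opcontrol --stop
opcontrol --dump

# Per-symbol breakdown for the server binary; s_lock near the top
# would point at spinlock contention.
opreport -l /usr/local/pgsql/bin/postgres | head -30

opcontrol --shutdown
```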
Anyway, this is probably a digression as far as FlexLocks are
concerned: they aren't aimed at optimizing a read-only workload.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company