On Tue, Jul 28, 2009 at 7:21 PM, Greg Smith<gsmith@gregsmith.com> wrote:
> On Tue, 28 Jul 2009, Scott Marlowe wrote:
>
>> Just FYI, I ran the same basic test but with -c 10 since -c shouldn't
>> really be greater than -s
>
> That's only true if you're running the TPC-B-like or other write tests,
> where access to the small branches table becomes a serious hotspot for
> contention. The select-only test has no such specific restriction as it
> only operates on the big accounts table. Often peak throughput is closer
> to a very small multiple of the number of cores though, and possibly even
> clients=cores, presumably because it's more efficient to approximately peg
> one backend per core rather than switch among more than one on each--reduced
> L1 cache contention etc. That's the behavior you measured when your test
> showed better results with c=10 than c=16 on an 8-core system, rather than
> suffering less from the "c must be < s" contention limitation.
>
> Sadly I don't have or expect to have a W5580 in the near future though; the
> X5550 @ 2.67GHz is the bang for the buck sweet spot right now and
> accordingly that's what I have in the lab at Truviso. As Merlin points out,
> that's still plenty to spank any select-only pgbench results I've ever seen.
> The multi-threaded pgbench patch submitted by Itagaki Takahiro recently is
> here just in time to really exercise these new processors properly.

Can I trouble you for a single-client run, say:

pgbench -S -c 1 -t 250000

I'd like to see how much of your improvement comes from SMT and how
much comes from general improvements to the CPU.
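
If it's easier to do the whole comparison in one go, something along
these lines might work. It's only a rough sketch: the sweep helper is a
made-up name, and the client counts 1, 8 and 16 are my assumption for an
8-core box with SMT on, not numbers from this thread. It only prints
the commands, so nothing runs until you pipe the output to sh.

```shell
#!/bin/sh
# Sketch only: print the select-only pgbench command for each client count.
# "sweep" is a hypothetical helper; the counts passed to it below are
# assumed values (1 client, one per core, one per SMT thread).
sweep() {
    for c in "$@"; do
        # -S = select-only test, -c = number of clients,
        # -t = transactions per client (so total work grows with -c)
        echo "pgbench -S -c $c -t 250000"
    done
}

sweep 1 8 16
```

Comparing the -c 1 number against the older hardware should show the
per-core gain, and the step from -c 8 to -c 16 should give a rough idea
of what SMT adds. Since -t is per client, the larger runs do more total
work; the tps figure is the one to compare, not wall-clock time.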
merlin