Folks, sorry, I've been a bit overtaken by events :-))
I've finished the tests with PREPARE/EXECUTE - it's much faster, of course,
and the max TPS is now 15,000 on 24 cores! I've run various tests
to see where the limiting bottleneck may be - it's most likely
something timer- or interrupt-based, etc. DTrace shows nothing special
(though it will probably tell you more than it tells me), but for a 10 sec
period the wait time is quite small:
# lwlock_wait_8.4.d `pgrep -n postgres`
              Lock Id        Mode               Count
  FirstBufMappingLock   Exclusive                   1
     FirstLockMgrLock   Exclusive                   1
      BufFreelistLock   Exclusive                   3
  FirstBufMappingLock      Shared                   4
     FirstLockMgrLock      Shared                   4

              Lock Id        Mode  Combined Time (ns)
     FirstLockMgrLock   Exclusive              803700
      BufFreelistLock   Exclusive             3001600
     FirstLockMgrLock      Shared             4586600
  FirstBufMappingLock   Exclusive             6283900
  FirstBufMappingLock      Shared            21792900
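
To put the figures above in proportion, here is a minimal Python sketch that sums the combined wait times and relates them to the 10 sec sampling period. The dictionary and function names are illustrative only (they are not part of the DTrace script), and the sketch assumes the script was attached to a single backend, since `pgrep -n` returns only the newest matching process.

```python
# Numbers copied from the lwlock_wait_8.4.d output above.
# Assumption: one backend was traced for a 10 second period.
WAIT_COUNTS = {
    ("FirstBufMappingLock", "Exclusive"): 1,
    ("FirstLockMgrLock", "Exclusive"): 1,
    ("BufFreelistLock", "Exclusive"): 3,
    ("FirstBufMappingLock", "Shared"): 4,
    ("FirstLockMgrLock", "Shared"): 4,
}

WAIT_NS = {
    ("FirstLockMgrLock", "Exclusive"): 803700,
    ("BufFreelistLock", "Exclusive"): 3001600,
    ("FirstLockMgrLock", "Shared"): 4586600,
    ("FirstBufMappingLock", "Exclusive"): 6283900,
    ("FirstBufMappingLock", "Shared"): 21792900,
}

SAMPLE_SECONDS = 10  # sampling period mentioned in the mail


def total_wait_ns(wait_ns):
    """Sum of all combined LWLock wait times, in nanoseconds."""
    return sum(wait_ns.values())


def wait_fraction(wait_ns, sample_seconds):
    """Share of the sampling period the traced backend spent waiting."""
    return total_wait_ns(wait_ns) / (sample_seconds * 1_000_000_000)


def mean_wait_ns(counts, wait_ns):
    """Average duration of a single wait, per (lock, mode) pair."""
    return {key: wait_ns[key] / counts[key] for key in wait_ns if counts.get(key)}


if __name__ == "__main__":
    total = total_wait_ns(WAIT_NS)
    print(f"total wait: {total} ns ({total / 1e6:.1f} ms)")
    print(f"share of sample: {wait_fraction(WAIT_NS, SAMPLE_SECONDS):.2%}")
    for (lock, mode), avg in sorted(mean_wait_ns(WAIT_COUNTS, WAIT_NS).items()):
        print(f"{lock:>20} {mode:>10} {avg / 1e6:8.2f} ms per wait")
```

Under these assumptions the total comes to 36468700 ns, i.e. about 36.5 ms or roughly 0.36% of the 10 sec window, which is consistent with calling the wait time "quite small". The per-wait averages are a few milliseconds each, but with only 13 waits in the sample they should not be over-interpreted.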
At the same time, these lock waits appear only on 24 or 32 cores.
I plan to replay this case on a bigger server (64 cores or more)
- it will be much more evident there if the problem is in the locks.
I'm currently finishing my report with all the data you all asked for
(system graphs, pgsql, and others). I'll publish it on my web site and
send you a link.
Rgds,
-Dimitri
On 5/14/09, Simon Riggs <simon@2ndquadrant.com> wrote:
>
> On Tue, 2009-05-12 at 14:28 +0200, Dimitri wrote:
>
>> As problem I'm considering a scalability issue on Read-Only workload -
>> only selects, no disk access, and if on move from 8 to 16 cores we
>> gain near 100%, on move from 16 to 32 cores it's only 10%...
>
> Dimitri,
>
> Will you be re-running the Read-Only tests?
>
> Can you run the Dtrace script to assess LWlock contention during the
> run?
>
> Would you re-run the tests with a patch?
>
> Thanks,
>
> --
> Simon Riggs www.2ndQuadrant.com
> PostgreSQL Training, Services and Support
>
>