On Sun, Feb 2, 2014 at 6:00 AM, Andres Freund <andres@2ndquadrant.com> wrote:
> On 2014-02-01 19:47:29 -0800, Peter Geoghegan wrote:
>> Here are the results of a benchmark on Nathan Boley's 64-core, 4
>> socket server: http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/amd-4-socket-rwlocks/
>
> That's interesting. The maximum number of what you see here (~293125)
> is markedly lower than what I can get.
>
> ... poke around ...
>
> Hm, that's partially because you're using pgbench without -M prepared if
> I see that correctly. The bottleneck in that case is primarily memory
> allocation. But even after that I am getting higher
> numbers: ~342497.
>
> Trying to nail down the difference it oddly seems to be your
> max_connections=80 vs my 100. The profile in both cases is markedly
> different, way much more spinlock contention with 80. All in
> Pin/UnpinBuffer().

I updated this benchmark with your BufferDescriptors alignment patch
[1] applied on top of master (still without "-M prepared", in order
to keep the numbers comparable). So once again, that's:

http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/amd-4-socket-rwlocks/

The patch made a fairly noticeable difference, but not as big a
difference as you describe here. Are you sure that you saw this kind
of difference with only 64 clients, as you mentioned elsewhere [1]
(perhaps you fat-fingered [1] -- "-cj" is ambiguous)? Obviously
max_connections is still 80 in the above. Should I have gone past 64
clients to see the problem? The best number I see with the [1] patch
applied on master is only ~327809 for -S 10 64 clients. Perhaps I've
misunderstood.
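
For a rough sense of scale, the relative gaps between the three
figures quoted in this thread can be worked out as below. This is only
a back-of-the-envelope sketch: it assumes all three are TPS numbers
from comparable "pgbench -S" runs, which may not hold, and the
variable names are just labels I made up for the illustration.

```python
# TPS figures as quoted in this thread. Treating them as directly
# comparable is an assumption, not something the thread establishes.
original_tps = 293125   # best figure in the original benchmark
patched_tps = 327809    # best figure with the alignment patch [1] on master
reported_tps = 342497   # figure Andres reports from his own runs


def pct_gain(new, old):
    """Return the percentage increase of `new` over `old`."""
    return (new - old) / old * 100.0


patch_gain = pct_gain(patched_tps, original_tps)
reported_gain = pct_gain(reported_tps, original_tps)
remaining_gap = pct_gain(reported_tps, patched_tps)

print(f"patched vs. original:   {patch_gain:.1f}%")     # 11.8%
print(f"reported vs. original:  {reported_gain:.1f}%")  # 16.8%
print(f"reported vs. patched:   {remaining_gap:.1f}%")  # 4.5%
```

So the patched result recovers roughly 12% over the original figure,
and still sits about 4.5% below the number you quoted.
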
[1] "Misaligned BufferDescriptors causing major performance problems
on AMD": http://www.postgresql.org/message-id/20140202151319.GD32123@awork2.anarazel.de
--
Peter Geoghegan