Re: [HACKERS] LWLock optimization for multicore Power machines - Mailing list pgsql-hackers

From Bernd Helmle
Subject Re: [HACKERS] LWLock optimization for multicore Power machines
Msg-id 1486995395.2959.11.camel@oopsware.de
In response to Re: [HACKERS] LWLock optimization for multicore Power machines  (Alexander Korotkov <a.korotkov@postgrespro.ru>)
Responses Re: [HACKERS] LWLock optimization for multicore Power machines  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
List pgsql-hackers
On Saturday, 11.02.2017, at 15:42 +0300, Alexander Korotkov wrote:
> Thus, I see reasons why the absolute results in your tests are lower
> than in my previous tests.
> 1.  You use 28 physical cores while I was using 32 physical cores.
> 2.  You run the tests in PowerVM while I was running them on bare metal.
> PowerVM could have some overhead.
> 3.  I guess you run pgbench on the same machine, while in my tests
> pgbench was running on another node of the IBM E880.
> 

Yeah, pgbench was running locally. Maybe we can get some resources to
run the tests remotely.

Short side note:

If you run two Postgres instances concurrently with the same pgbench
parameters, each instance gets nearly the same transaction throughput
as a single instance running alone, e.g.


single instance:

transaction type: <builtin: select only>
scaling factor: 1000
query mode: prepared
number of clients: 112
number of threads: 112
duration: 300 s
number of transactions actually processed: 121523797
latency average = 0.276 ms
latency stddev = 0.096 ms
tps = 405075.282309 (including connections establishing)
tps = 405114.299174 (excluding connections establishing)

instance-1/instance-2, run concurrently:

transaction type: <builtin: select only>
scaling factor: 1000
query mode: prepared
number of clients: 112
number of threads: 56
duration: 300 s
number of transactions actually processed: 120645351
latency average = 0.278 ms
latency stddev = 0.158 ms
tps = 402148.536087 (including connections establishing)
tps = 402199.952824 (excluding connections establishing)

transaction type: <builtin: select only>
scaling factor: 1000
query mode: prepared
number of clients: 112
number of threads: 56
duration: 300 s
number of transactions actually processed: 121959772
latency average = 0.275 ms
latency stddev = 0.110 ms
tps = 406530.139080 (including connections establishing)
tps = 406556.658638 (excluding connections establishing)
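
The exact command lines are not given here, so the following is only a
sketch of how runs with the settings shown in the reports above might
be started. The ports (5432, 5433) and the database name "postgres" are
assumptions made for illustration; the scaling factor of 1000 would be
set earlier, at initialization time (pgbench -i -s 1000).

```python
def pgbench_cmd(port, clients=112, threads=112, duration=300,
                dbname="postgres"):
    """Build a pgbench argument list matching the reported settings.

    port and dbname are hypothetical; they do not appear in the reports.
    """
    return [
        "pgbench",
        "-S",                 # builtin select-only script
        "-M", "prepared",     # query mode: prepared
        "-c", str(clients),   # number of clients
        "-j", str(threads),   # number of threads
        "-T", str(duration),  # duration in seconds
        "-p", str(port),      # assumed port of the target instance
        dbname,
    ]


# One run against a single instance (112 threads, as reported).
single = pgbench_cmd(5432)

# Two runs started side by side, one per instance (56 threads each).
concurrent = [pgbench_cmd(port, threads=56) for port in (5432, 5433)]

for argv in [single] + concurrent:
    print(" ".join(argv))
```

The two concurrent command lines would have to be launched at the same
time (for example from two shells) for the comparison to hold.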

So it looks like the machine has plenty of headroom left, but a single
PostgreSQL instance is hitting a limit somewhere.
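
To put a number on that, here is a small sketch that adds up the tps
figures reported above (the "including connections establishing"
values) and compares the sum with the single-instance run. The variable
names are just for illustration.

```python
# tps values copied from the pgbench reports above
# (including connections establishing).
single_tps = 405075.282309
concurrent_tps = [402148.536087, 406530.139080]

# Aggregate throughput of the two instances running side by side.
combined = sum(concurrent_tps)

# How much more the machine delivers with two instances than with one.
ratio = combined / single_tps

print(f"combined: {combined:.0f} tps ({ratio:.2f}x the single instance)")
```

The two instances together deliver roughly twice the throughput of the
single one on the same hardware.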

> Therefore, with lower absolute numbers in your tests, the win from the
> LWLock optimization is also lower.  That is understandable.  But the
> win from the LWLock optimization is clearly visible and definitely
> exceeds the variation.
> 
> I think it would make sense to run more kinds of tests.  Could you
> try the set of tests provided by Tomas Vondra?
> Even if we don't see a win in some of the tests, it would still be
> valuable to see that there is no regression there.

Unfortunately, there are some tests for AIX scheduled, which will assign
resources to that LPAR... I've just talked to the people responsible for
the machine, and we can get more time for the Linux tests ;)



