Thread: pgbench tps drop from 5000 to 37 going from localhost to a server 13ms away
pgbench tps drop from 5000 to 37 going from localhost to a server 13ms away
From
Chris Withers
Date:
Hi all,

I've been doing some lightweight load testing with
“pgbench -c8 -j8 -T10”

When run locally on the postgres server I'm testing, this gives around 5000tps.

When I do it from a server that has a 13ms ping latency, it drops to 37tps.

This is using the default pgbench script, is it to be expected?
If so, why?

cheers,

Chris
Re: pgbench tps drop from 5000 to 37 going from localhost to a server 13ms away
From
Jan Lentfer
Date:
That seems to be a large drop. On the other hand, 13 ms is also a very large network latency; on a LAN you're usually in the sub-ms area, so going from e.g. 0.2 ms to 13 ms is a 65-fold increase. What is the network topology like?

Jan

Sent from my iPad
Re: pgbench tps drop from 5000 to 37 going from localhost to a server 13ms away
From
Jan Lentfer
Date:
> On 24.07.2015 at 18:59, Chris Withers <chris@simplistix.co.uk> wrote:
>
> Hi all,
>
> I've been doing some lightweight load testing with
> “pgbench -c8 -j8 -T10”
>
> When run locally on the postgres server I'm testing, this gives around 5000tps.
>
> When I do it from a server that has a 13ms ping latency, it drops to 37tps.
>
> This is using the default pgbench script, is it to be expected?
> If so, why?

> On 24.07.2015 at 20:06, Jan Lentfer <Jan.Lentfer@web.de> wrote:
>
> That seems to be a large drop. On the other hand, 13 ms is also a very large network latency; on a LAN you're usually in the sub-ms area, so going from e.g. 0.2 ms to 13 ms is a 65-fold increase. What is the network topology like?
Sorry for top-posting my first response; that always happens when I am on the iPad.
I just checked on my home setup. Ping latency on a GBit crossover connection is around 0.3 ms, while pinging localhost is around 0.05 ms. You are at 13 ms, a 260-fold increase over that localhost figure, which is in about the same range as the drop you see with pgbench. Of course, with pgbench the actual payload comes into account on top.
Jan
Re: pgbench tps drop from 5000 to 37 going from localhost to a server 13ms away
From
Chris Withers
Date:
On 24/07/2015 19:21, Jan Lentfer wrote:
>>> I've been doing some lightweight load testing with
>>> “pgbench -c8 -j8 -T10”
>>>
>>> When run locally on the postgres server I'm testing, this gives
>>> around 5000tps.
>>>
>>> When I do it from a server that has a 13ms ping latency, it drops to
>>> 37tps.
>>>
>>> This is using the default pgbench script, is it to be expected?
>>> If so, why?
>
>> That seems to be a large drop. On the other hand 13 ms is also a
>> very large network latency. On a LAN you're usually in the sub-ms
>> area, so going from e.g. 0.2 ms to 13 ms is a 65-fold increase.
>> What is the network topology like?

10G between two colos, a switch at each end.

What's interesting is how it goes when adjusting the number of threads/connections. As a baseline, I did one of each. As expected, this gave around 10 tps (7 statements in the standard file * 13ms latency gives around 0.1s per transaction). This behaviour continued up to -c3 -j3, linearly achieving roughly 10tps per client/thread:

starting vacuum...end.
transaction type: TPC-B (sort of)
scaling factor: 1
query mode: simple
number of clients: 3
number of threads: 3
duration: 10 s
number of transactions actually processed: 317
latency average: 94.637 ms
tps = 31.494417 (including connections establishing)
tps = 31.668794 (excluding connections establishing)
statement latencies in milliseconds:
    0.002016  \set nbranches 1 * :scale
    0.000438  \set ntellers 10 * :scale
    0.000379  \set naccounts 100000 * :scale
    0.000489  \setrandom aid 1 :naccounts
    0.000404  \setrandom bid 1 :nbranches
    0.000413  \setrandom tid 1 :ntellers
    0.000470  \setrandom delta -5000 5000
    13.061975 BEGIN;
    13.174287 UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;
    13.127691 SELECT abalance FROM pgbench_accounts WHERE aid = :aid;
    14.552413 UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid;
    14.109375 UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid;
    13.113028 INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);
    13.256063 END;

Now, at -c4 and -j4 the problem becomes apparent:

starting vacuum...end.
transaction type: TPC-B (sort of)
scaling factor: 1
query mode: simple
number of clients: 4
number of threads: 4
duration: 10 s
number of transactions actually processed: 362
latency average: 110.497 ms
tps = 35.912951 (including connections establishing)
tps = 36.111849 (excluding connections establishing)
statement latencies in milliseconds:
    0.001917  \set nbranches 1 * :scale
    0.000511  \set ntellers 10 * :scale
    0.000384  \set naccounts 100000 * :scale
    0.000525  \setrandom aid 1 :naccounts
    0.000406  \setrandom bid 1 :nbranches
    0.000472  \setrandom tid 1 :ntellers
    0.000483  \setrandom delta -5000 5000
    13.063624 BEGIN;
    13.170928 UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;
    13.122450 SELECT abalance FROM pgbench_accounts WHERE aid = :aid;
    16.532138 UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid;
    28.090450 UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid;
    13.112207 INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);
    13.229887 END;

We've dropped down to 9 tps per client/thread, and it's the update statements that are growing in time. Here's the worst case:

starting vacuum...end.
transaction type: TPC-B (sort of)
scaling factor: 1
query mode: simple
number of clients: 10
number of threads: 10
duration: 10 s
number of transactions actually processed: 382
latency average: 261.780 ms
tps = 37.310918 (including connections establishing)
tps = 37.517192 (excluding connections establishing)
statement latencies in milliseconds:
    0.001798  \set nbranches 1 * :scale
    0.000437  \set ntellers 10 * :scale
    0.000385  \set naccounts 100000 * :scale
    0.000597  \setrandom aid 1 :naccounts
    0.000369  \setrandom bid 1 :nbranches
    0.000437  \setrandom tid 1 :ntellers
    0.000401  \setrandom delta -5000 5000
    13.064963 BEGIN;
    13.192241 UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;
    13.121914 SELECT abalance FROM pgbench_accounts WHERE aid = :aid;
    83.994516 UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid;
    113.638228 UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid;
    13.133390 INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);
    13.288079 END;

What on earth could be causing those updates to now take getting on for an order of magnitude more than the latency to the server?

cheers,

Chris
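[Editor's note: a quick sketch of the per-client ceiling Chris alludes to, assuming one network round trip per statement in simple query mode; the figures come from the runs above.]

    per-transaction time    ~= 7 round trips x 13 ms  =  91 ms
    ceiling per connection  ~= 1000 ms / 91 ms        ~= 11 tps
    3 clients x ~10.5 tps   ~= 31.5 tps   (matches the -c3 run above)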
Re: pgbench tps drop from 5000 to 37 going from localhost to a server 13ms away
From
Jeff Janes
Date:
On Fri, Jul 24, 2015 at 12:13 PM, Chris Withers <chris@simplistix.co.uk> wrote:
> On 24/07/2015 19:21, Jan Lentfer wrote:
>> I've been doing some lightweight load testing with
>> “pgbench -c8 -j8 -T10”
>>
>> When run locally on the postgres server I'm testing, this gives
>> around 5000tps.
>>
>> When I do it from a server that has a 13ms ping latency, it drops to
>> 37tps.
>>
>> This is using the default pgbench script, is it to be expected?
>> If so, why?
>
> ...
>
> starting vacuum...end.
> transaction type: TPC-B (sort of)
> scaling factor: 1
This is your problem. There is only one row in the pgbench_branches table, and every transaction has to update that one row. This is inherently a serialized event.
You're trying to overcome the latency by cramming more stuff through the pipe at a time, but what you're cramming through must be done single-file.
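[Editor's note: a sketch of what that contention looks like, assuming a standard pgbench initialization; the UPDATE is quoted from the statement latencies above.]

    -- pgbench_branches holds one row per unit of scale factor, so after
    -- a default "pgbench -i" (scale 1):
    SELECT count(*) FROM pgbench_branches;   -- returns 1

    -- every transaction in the default script runs this statement, so each
    -- client queues behind the row lock until the current holder's COMMIT
    -- (two more round trips away) releases it:
    UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid;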
One solution is to just use a larger scale factor on the benchmark, so that transactions update random pgbench_branches rows rather than all updating the same row:
pgbench -i -s50
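[Editor's note: for example, a sketch with an illustrative host name; -r reproduces the per-statement latency report shown earlier.]

    # rebuild the tables with 50 pgbench_branches rows, then re-run
    pgbench -i -s 50 -h db.example.com
    pgbench -c 8 -j 8 -T 10 -r -h db.example.com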
Alternatively, you could write a custom file so that all 7 commands are sent down in one packet. That way the releasing COMMIT shows up at the same time as the locking UPDATE does, rather than having 2 more round trips between them. But this would violate the spirit of the benchmark, as presumably you are expected to inspect the results of the SELECT statement before proceeding with the rest of the transaction.
Or you could write a custom benchmark which more closely resembles whatever your true workload will be.
Cheers,
Jeff
Re: pgbench tps drop from 5000 to 37 going from localhost to a server 13ms away
From
Chris Withers
Date:
On 24/07/2015 22:51, Jeff Janes wrote:
>> starting vacuum...end.
>> transaction type: TPC-B (sort of)
>> scaling factor: 1
>
> This is your problem. There is only one row in the pgbench_branches table, and every transaction has to update that one row. This is inherently a serialized event.

Indeed it was!

With a scale of 1000, everything except the END took roughly the latency time. Interestingly, the END still seems to take more when threads/clients are really ramped up (100 vs 8). Why would that be?

> One solution is to just use a larger scale factor on the benchmark, so that transactions update random pgbench_branches rows rather than all updating the same row:
>
> pgbench -i -s50

> Alternatively, you could write a custom file so that all 7 commands are sent down in one packet.

How would you restructure the sql so as to make that happen?

cheers,

Chris
Re: pgbench tps drop from 5000 to 37 going from localhost to a server 13ms away
From
Jeff Janes
Date:
On Mon, Jul 27, 2015 at 3:19 AM, Chris Withers <chris@simplistix.co.uk> wrote:
> Indeed it was!
>
> With a scale of 1000, everything except the END took roughly the latency time. Interestingly, the END still seems to take more when threads/clients are really ramped up (100 vs 8). Why would that be?

Could it be the fsync time? Presumably your server has a BBU on it, but perhaps the transaction rate at the high scale factor is high enough to overwhelm it. Are you sure you don't see the same timing when you run pgbench locally?
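[Editor's note: one way to test the fsync theory from the client side, as a sketch; only for a throwaway benchmark database, since synchronous_commit=off trades durability for speed, and the host name is illustrative.]

    # disable the commit-time WAL flush for these sessions only; if the END
    # latency falls back to ~13ms, the fsync at COMMIT was the bottleneck
    PGOPTIONS='-c synchronous_commit=off' pgbench -c 100 -j 100 -T 10 -h db.example.com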
> How would you restructure the sql so as to make that happen?
>
>> Alternatively, you could write a custom file so that all 7 commands are sent down in one packet.
Just make a file with all the commands in it, and then remove all the newlines from the non-backslash commands so that they are all on the same line (separated by semicolons). Leave the backslash commands on their own lines. Then use the -f switch to pgbench, giving it the file you just made. Also, make sure you give it '-s 1000' as well, because when you use the -f option pgbench does not auto-detect the scale factor.
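[Editor's note: a sketch of such a file, built from the 7 commands of the default script quoted earlier in the thread; the file name is illustrative, and the \setrandom syntax matches the pgbench version used here (newer versions spell it \set ... random(...)).]

    $ cat tpcb-packed.sql
    \set nbranches 1 * :scale
    \set ntellers 10 * :scale
    \set naccounts 100000 * :scale
    \setrandom aid 1 :naccounts
    \setrandom bid 1 :nbranches
    \setrandom tid 1 :ntellers
    \setrandom delta -5000 5000
    BEGIN; UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid; SELECT abalance FROM pgbench_accounts WHERE aid = :aid; UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid; UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid; INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP); END;

    $ pgbench -f tpcb-packed.sql -s 1000 -c 8 -j 8 -T 10 -r -h db.example.com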
Cheers,
Jeff