Re: pgbench tps drop from 5000 to 37 going from localhost to a server 13ms away - Mailing list pgsql-general

From Chris Withers
Subject Re: pgbench tps drop from 5000 to 37 going from localhost to a server 13ms away
Date
Msg-id 55B28E58.1010105@simplistix.co.uk
Whole thread Raw
In response to Re: pgbench tps drop from 5000 to 37 going from localhost to a server 13ms away  (Jan Lentfer <Jan.Lentfer@web.de>)
Responses Re: pgbench tps drop from 5000 to 37 going from localhost to a server 13ms away
List pgsql-general
On 24/07/2015 19:21, Jan Lentfer wrote:
>
>>> I've been doing some lightweight load testing with
>>> “pgbench -c8 -j8 -T10”
>>>
>>> When run locally on the postgres server I've testing, this gives
>>> around 5000tps
>>>
>>> When I do it from a server that has a 13ms ping latency, it drops to
>>> 37tps.
>>>
>>> This is using the default pgbench script, is it to be expected?
>>> If so, why?
>>>
>
>
>> That seems to be a large drop. On the other hand 13 ms is also like a
>> very large network latency. On LAN your usually in the sub ms area. So
>> going from e.g. 0.2 ms to 13ms is 65 fold decrease. What is the
>> network toplogy like?

10G between two colos, a switch at each end.

What's interesting is how it goes when adjusting the number for
threads/connections. As a baseline, I did one of each. As expected, this
gave around 10 tps, (7 statements in the standard file * 13ms latency
gives around 0.1s per transaction). This behaviour continued up to -c3
-j3 linearly achieving roughly 10tps per client/thread:

starting vacuum...end.
transaction type: TPC-B (sort of)
scaling factor: 1
query mode: simple
number of clients: 3
number of threads: 3
duration: 10 s
number of transactions actually processed: 317
latency average: 94.637 ms
tps = 31.494417 (including connections establishing)
tps = 31.668794 (excluding connections establishing)
statement latencies in milliseconds:
0.002016 \set nbranches 1 * :scale
0.000438 \set ntellers 10 * :scale
0.000379 \set naccounts 100000 * :scale
0.000489 \setrandom aid 1 :naccounts
0.000404 \setrandom bid 1 :nbranches
0.000413 \setrandom tid 1 :ntellers
0.000470 \setrandom delta -5000 5000
13.061975 BEGIN;
13.174287 UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE
aid = :aid;
13.127691 SELECT abalance FROM pgbench_accounts WHERE aid = :aid;
14.552413 UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE
tid = :tid;
14.109375 UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE
bid = :bid;
13.113028 INSERT INTO pgbench_history (tid, bid, aid, delta, mtime)
VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);
13.256063 END;

Now, at -c4 and -j4 the problem becomes apparent:

starting vacuum...end.
transaction type: TPC-B (sort of)
scaling factor: 1
query mode: simple
number of clients: 4
number of threads: 4
duration: 10 s
number of transactions actually processed: 362
latency average: 110.497 ms
tps = 35.912951 (including connections establishing)
tps = 36.111849 (excluding connections establishing)
statement latencies in milliseconds:
0.001917 \set nbranches 1 * :scale
0.000511 \set ntellers 10 * :scale
0.000384 \set naccounts 100000 * :scale
0.000525 \setrandom aid 1 :naccounts
0.000406 \setrandom bid 1 :nbranches
0.000472 \setrandom tid 1 :ntellers
0.000483 \setrandom delta -5000 5000
13.063624 BEGIN;
13.170928 UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE
aid = :aid;
13.122450 SELECT abalance FROM pgbench_accounts WHERE aid = :aid;
16.532138 UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE
tid = :tid;
28.090450 UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE
bid = :bid;
13.112207 INSERT INTO pgbench_history (tid, bid, aid, delta, mtime)
VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);
13.229887 END;

We've dropped down to 9 tps per client/thread, and it's the update
statements that are growing in time, here's the worst case:

starting vacuum...end.
transaction type: TPC-B (sort of)
scaling factor: 1
query mode: simple
number of clients: 10
number of threads: 10
duration: 10 s
number of transactions actually processed: 382
latency average: 261.780 ms
tps = 37.310918 (including connections establishing)
tps = 37.517192 (excluding connections establishing)
statement latencies in milliseconds:
0.001798 \set nbranches 1 * :scale
0.000437 \set ntellers 10 * :scale
0.000385 \set naccounts 100000 * :scale
0.000597 \setrandom aid 1 :naccounts
0.000369 \setrandom bid 1 :nbranches
0.000437 \setrandom tid 1 :ntellers
0.000401 \setrandom delta -5000 5000
13.064963 BEGIN;
13.192241 UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE
aid = :aid;
13.121914 SELECT abalance FROM pgbench_accounts WHERE aid = :aid;
83.994516 UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE
tid = :tid;
113.638228 UPDATE pgbench_branches SET bbalance = bbalance + :delta
WHERE bid = :bid;
13.133390 INSERT INTO pgbench_history (tid, bid, aid, delta, mtime)
VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);
13.288079 END;

What on earth could be causing those updates to now take getting on for
an order of magnitude more than the latency to the server?

cheers,

Chris


pgsql-general by date:

Previous
From: Jan Lentfer
Date:
Subject: Re: pgbench tps drop from 5000 to 37 going from localhost to a server 13ms away
Next
From: Jeff Janes
Date:
Subject: Re: pgbench tps drop from 5000 to 37 going from localhost to a server 13ms away