Thread: pgbench calculates summary numbers a wrong way.

pgbench calculates summary numbers a wrong way.

From

Kyotaro Horiguchi

Date:

17 September 2020, 08:41:54

Hello.

While running pgbench, I found tps excluding connection somewhat
strange. To emphasize the behavior, I inserted some sleep at the end
of doConnect() and ran pgbench with several times.

$ pgbench -T 5 -j 1 -c 4  (with 1s sleep in doConnect())
>   number of transactions actually processed: 409
!   latency average = 48.980 ms
>   tps = 81.665671 (including connections establishing)
>   tps = 102.323285 (excluding connections establishing)

$ pgbench -T 5 -j 4 -c 4  (with 1s sleep in doConnect())
>   number of transactions actually processed: 1628
!   latency average = 12.303 ms
>   tps = 325.123448 (including connections establishing)
>   tps = 422.671015 (excluding connections establishing)

Why the latencies are so different?  If I tell pgbench to measure the
latencies, it is about 8-9ms in both cases.

$ pgbench -T 5 -j 1 -c 4 -L100  (with 1s sleep in doConnect())
>   number of transactions actually processed: 419
>   number of transactions above the 100.0 ms latency limit: 0/419 (0.000 %)
!   latency average = 8.984 ms
>   latency stddev = 3.259 ms
>   tps = 83.633694 (including connections establishing)
>   tps = 104.909130 (excluding connections establishing)

$ pgbench -T 5 -j 4 -c 4 -L100  (with 1s sleep in doConnect())
>   number of transactions actually processed: 2047
>   number of transactions above the 100.0 ms latency limit: 0/2047 (0.000 %)
!   latency average = 7.807 ms
>   latency stddev = 2.916 ms
>   tps = 408.677024 (including connections establishing)
>   tps = 511.837509 (excluding connections establishing)

That's reasonable. So the latency estimation is apparently a kind of
bogus.  It seems to me that the calculation is wrong to use
time_include. I think it should use time_exclude instead.

$ pgbench -T 5 -j 1 -c 4  (with 1s sleep in doConnect())
>   number of transactions actually processed: 407
!   latency average = 11.674 ms
>   tps = 81.262992 (including connections establishing)
>   tps = 101.724507 (excluding connections establishing)
> 

$ pgbench -T 5 -j 4 -c 4  (with 1s sleep in doConnect())
>   number of transactions actually processed: 1718
!   latency average = 11.674 ms
>   tps = 342.652333 (including connections establishing)
>   tps = 428.850338 (excluding connections establishing)

Looks far better, but "tps excluding connection establishing" looks
strange. The reason seems to me threadRun() counts connection time a
wrong way. thread->conn_time is assumed to be the sum of the time took
by each client until it runs the first transaction.  If each
connection took 1 second, conn_time of a thread that runs three
clients is calculated as 3 seconds. But in reality, if each connection
took 1 second, all clients on the thread waits for 3 seconds before
starting a transaction so it should be 3s * 3(clients/thread) = 9
seconds.

We could multiply conn_time by nstate on each thread, but the same
happens for -C (is_connect) mode. When a client took 1 second to
establish a connection, no other clients can do anything meanwhile.
Somewhat different from the !is_connect case precisely, but basically
the same at the point that making a connection of a client actually
blocks all other clients on the same thread.

Therefore I changed nclients in the calculation of time_exclude to
nthreads.

$ pgbench -T 5 -j 1 -c 4  (with 1s sleep in doConnect())
>  number of transactions actually processed: 399
>  latency average = 9.402 ms
>  tps = 79.671581 (including connections establishing)
>  tps = 425.460179 (excluding connections establishing)

$ pgbench -T 5 -j 4 -c 4 (with 1s sleep in doConnect())
>  number of transactions actually processed: 1903
>  latency average = 8.330 ms
>  tps = 379.944770 (including connections establishing)
>  tps = 480.211639 (excluding connections establishing)

Looks fine. Also the -C cases looks sainer.

(- is unpatched, + is patched)

> $ pgbench -T 5 -j 1 -c 4 -C (with 100ms sleep in doConnect())
### time_include=5.1, conn_total_time=4.9
- number of transactions actually processed: 43
- latency average = 469.850 ms
- tps = 8.513360 (including connections establishing)
- tps = 11.246110 (excluding connections establishing)
+ number of transactions actually processed: 38
+ latency average = 16.735 ms
+ tps = 7.493800 (including connections establishing)
+ tps = 239.022345 (excluding connections establishing)


> $ pgbench -T 5 -j 4 -c 4 -C (with 100ms sleep in doConnect())
### time_include=5.1, conn_total_time=18.3
-number of transactions actually processed: 172
-latency average = 117.571 ms
-tps = 34.021873 (including connections establishing)
-tps = 346.296591 (excluding connections establishing)
+ number of transactions actually processed: 172
+ latency average = 11.671 ms
+ tps = 33.851599 (including connections establishing)
+ tps = 342.716081 (excluding connections establishing)

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
From 117118699e24085db209ab66698cb33c7a70fd21 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horikyoga.ntt@gmail.com>
Date: Thu, 17 Sep 2020 17:20:10 +0900
Subject: [PATCH] Fix latency and tps calculation of pgbench

Fix the calculation for "latency average" and the "tps excluding
connections establishing" not to wrongly affected by connections
establishing time.
---
 src/bin/pgbench/pgbench.c | 25 ++++++++++++++++++++++---
 1 file changed, 22 insertions(+), 3 deletions(-)

diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 332eabf637..adb522ed77 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -5194,16 +5194,35 @@ printResults(StatsData *total, instr_time total_time,
              instr_time conn_total_time, int64 latency_late)
 {
     double        time_include,
+                time_exclude,
                 tps_include,
                 tps_exclude;
     int64        ntx = total->cnt - total->skipped;
 
     time_include = INSTR_TIME_GET_DOUBLE(total_time);
 
+    /*
+     * conn_total_time is the sum of the time each clients took to establish a
+     * connection. In the multi-threaded case, all clients run on a thread
+     * waits for all the clients to establish a connection. So the actual total
+     * connection time of a thread is thread->conn_time * thread->nstate. So
+     * the total time took for connection establishment is:
+     *
+     *   sum(thread->conn_time * thread->nstate) / nclients
+     *
+     * Assuming all client took the same time to connection establishment and
+     * clients are distributed equally to threads, the expression is
+     * approximated as:
+     *
+     *   thread->conn_time * (nclients/nthreads) / nclients
+     * = conn_total_time / nthreads
+     */
+    time_exclude = (time_include - (INSTR_TIME_GET_DOUBLE(conn_total_time) /
+                                    nthreads));
+
     /* tps is about actually executed transactions */
     tps_include = ntx / time_include;
-    tps_exclude = ntx /
-        (time_include - (INSTR_TIME_GET_DOUBLE(conn_total_time) / nclients));
+    tps_exclude = ntx / time_exclude;
 
     /* Report test parameters. */
     printf("transaction type: %s\n",
@@ -5249,7 +5268,7 @@ printResults(StatsData *total, instr_time total_time,
     {
         /* no measurement, show average latency computed from run time */
         printf("latency average = %.3f ms\n",
-               1000.0 * time_include * nclients / total->cnt);
+               1000.0 * time_exclude * nclients / total->cnt);
     }
 
     if (throttle_delay)
-- 
2.18.4

Re: pgbench calculates summary numbers a wrong way.

From

Kyotaro Horiguchi

Date:

17 September 2020, 08:52:23

Mmm..

At Thu, 17 Sep 2020 17:41:54 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in 
> While running pgbench, I found tps excluding connection somewhat
> strange. To emphasize the behavior, I inserted some sleep at the end
> of doConnect() and ran pgbench with several times.
<patch attached>

Sorry, I sent a wrong version of the patch, contains some spelling
errors. This is the right one.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
From 77d29d47db572462054a050c22c76d1912b6c4d4 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horikyoga.ntt@gmail.com>
Date: Thu, 17 Sep 2020 17:20:10 +0900
Subject: [PATCH v2] Fix latency and tps calculation of pgbench

Fix the calculation for "latency average" and the "tps excluding
connections establishing" not to wrongly affected by connections
establishing time.
---
 src/bin/pgbench/pgbench.c | 25 ++++++++++++++++++++++---
 1 file changed, 22 insertions(+), 3 deletions(-)

diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 332eabf637..edee25f12a 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -5194,16 +5194,35 @@ printResults(StatsData *total, instr_time total_time,
              instr_time conn_total_time, int64 latency_late)
 {
     double        time_include,
+                time_exclude,
                 tps_include,
                 tps_exclude;
     int64        ntx = total->cnt - total->skipped;
 
     time_include = INSTR_TIME_GET_DOUBLE(total_time);
 
+    /*
+     * conn_total_time is the sum of the time each client took to establish a
+     * connection. In the multi-threaded case, all clients run on a thread wait
+     * for all the clients to establish a connection. So the actual total
+     * connection time of a thread is thread->conn_time * thread->nstate. Thus
+     * the total time took for connection establishment is:
+     *
+     *   sum(thread->conn_time * thread->nstate) / nclients
+     *
+     * Assuming all client took the same time to connection establishment and
+     * clients are distributed equally to threads, the expression is
+     * approximated as:
+     *
+     *   thread->conn_time * (nclients/nthreads) / nclients
+     * = conn_total_time / nthreads
+     */
+    time_exclude = (time_include - (INSTR_TIME_GET_DOUBLE(conn_total_time) /
+                                    nthreads));
+
     /* tps is about actually executed transactions */
     tps_include = ntx / time_include;
-    tps_exclude = ntx /
-        (time_include - (INSTR_TIME_GET_DOUBLE(conn_total_time) / nclients));
+    tps_exclude = ntx / time_exclude;
 
     /* Report test parameters. */
     printf("transaction type: %s\n",
@@ -5249,7 +5268,7 @@ printResults(StatsData *total, instr_time total_time,
     {
         /* no measurement, show average latency computed from run time */
         printf("latency average = %.3f ms\n",
-               1000.0 * time_include * nclients / total->cnt);
+               1000.0 * time_exclude * nclients / total->cnt);
     }
 
     if (throttle_delay)
-- 
2.18.4

Re: pgbench calculates summary numbers a wrong way.

From

Kyotaro Horiguchi

Date:

17 September 2020, 08:59:45

At Thu, 17 Sep 2020 17:52:23 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in 
> Sorry, I sent a wrong version of the patch, contains some spelling
> errors. This is the right one.

Sigh.. I fixed some more wordings.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
From bd41d38cb056a6b8fe761266a55d0d46245fd9bf Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horikyoga.ntt@gmail.com>
Date: Thu, 17 Sep 2020 17:20:10 +0900
Subject: [PATCH v3] Fix latency and tps calculation of pgbench

Fix the calculation for "latency average" and the "tps excluding
connections establishing" not to wrongly affected by connections
establishing time.
---
 src/bin/pgbench/pgbench.c | 25 ++++++++++++++++++++++---
 1 file changed, 22 insertions(+), 3 deletions(-)

diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 332eabf637..3fd86eeb8a 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -5194,16 +5194,35 @@ printResults(StatsData *total, instr_time total_time,
              instr_time conn_total_time, int64 latency_late)
 {
     double        time_include,
+                time_exclude,
                 tps_include,
                 tps_exclude;
     int64        ntx = total->cnt - total->skipped;
 
     time_include = INSTR_TIME_GET_DOUBLE(total_time);
 
+    /*
+     * conn_total_time is the sum of the time each client took to establish a
+     * connection. In the multi-threaded case, all clients run on a thread wait
+     * for all the clients to establish a connection. So the actual total
+     * connection time of a thread is thread->conn_time * thread->nstate. Thus
+     * the total time took for connection establishment is:
+     *
+     *   sum(thread->conn_time * thread->nstate) / nclients
+     *
+     * Assuming all clients took the same time to establish a connection and
+     * clients are distributed equally to threads, the expression is
+     * approximated as:
+     *
+     *   thread->conn_time * (nclients/nthreads) / nclients
+     * = conn_total_time / nthreads
+     */
+    time_exclude = (time_include -
+                    (INSTR_TIME_GET_DOUBLE(conn_total_time) / nthreads));
+
     /* tps is about actually executed transactions */
     tps_include = ntx / time_include;
-    tps_exclude = ntx /
-        (time_include - (INSTR_TIME_GET_DOUBLE(conn_total_time) / nclients));
+    tps_exclude = ntx / time_exclude;
 
     /* Report test parameters. */
     printf("transaction type: %s\n",
@@ -5249,7 +5268,7 @@ printResults(StatsData *total, instr_time total_time,
     {
         /* no measurement, show average latency computed from run time */
         printf("latency average = %.3f ms\n",
-               1000.0 * time_include * nclients / total->cnt);
+               1000.0 * time_exclude * nclients / total->cnt);
     }
 
     if (throttle_delay)
-- 
2.18.4

Re: pgbench calculates summary numbers a wrong way.

From

Fabien COELHO

Date:

18 September 2020, 07:55:30

Hello,

>> Sorry, I sent a wrong version of the patch, contains some spelling
>> errors. This is the right one.
>
> Sigh.. I fixed some more wordings.

I have submitted a patch which reworks how things are computed so that 
performance figures make some/more sense, among other things.

Maybe you could have a look at it and tell whether it is an improvement 
wrt your issue?

https://commitfest.postgresql.org/29/2557/

-- 
Fabien.

Re: pgbench calculates summary numbers a wrong way.

From

Kyotaro Horiguchi

Date:

18 September 2020, 08:23:36

At Thu, 17 Sep 2020 17:59:45 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in 
> Sigh.. I fixed some more wordings.

The condition "all clients took the same time to establish a
connection" is not needed for the transformation, and an intermediate
expression was wrong. Fixed them.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
From 802c06d95aebdb8a4ed4e3adc4a6c74d9675c625 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horikyoga.ntt@gmail.com>
Date: Thu, 17 Sep 2020 17:20:10 +0900
Subject: [PATCH v4] Fix latency and tps calculation of pgbench

Fix the calculation for "latency average" and the "tps excluding
connections establishing" not to wrongly affected by connections
establishing time.
---
 src/bin/pgbench/pgbench.c | 24 +++++++++++++++++++++---
 1 file changed, 21 insertions(+), 3 deletions(-)

diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 663d7d292a..4164a546e1 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -5194,16 +5194,34 @@ printResults(StatsData *total, instr_time total_time,
              instr_time conn_total_time, int64 latency_late)
 {
     double        time_include,
+                time_exclude,
                 tps_include,
                 tps_exclude;
     int64        ntx = total->cnt - total->skipped;
 
     time_include = INSTR_TIME_GET_DOUBLE(total_time);
 
+    /*
+     * conn_total_time is the sum of the time each client took to establish a
+     * connection. In the multi-threaded case, all clients run on a thread wait
+     * for all the clients to establish a connection. So the actual total
+     * connection time of a thread is thread->conn_time * thread->nstate. Thus
+     * the total time took for connection establishment is:
+     *
+     *   sum(thread->conn_time * thread->nstate) / nclients
+     *
+     * Assuming clients are distributed equally to threads, the expression is
+     * approximated as:
+     *
+     *   sum(thread->conn_time) * (nclients/nthreads) / nclients
+     * = conn_total_time / nthreads
+     */
+    time_exclude = (time_include -
+                    (INSTR_TIME_GET_DOUBLE(conn_total_time) / nthreads));
+
     /* tps is about actually executed transactions */
     tps_include = ntx / time_include;
-    tps_exclude = ntx /
-        (time_include - (INSTR_TIME_GET_DOUBLE(conn_total_time) / nclients));
+    tps_exclude = ntx / time_exclude;
 
     /* Report test parameters. */
     printf("transaction type: %s\n",
@@ -5249,7 +5267,7 @@ printResults(StatsData *total, instr_time total_time,
     {
         /* no measurement, show average latency computed from run time */
         printf("latency average = %.3f ms\n",
-               1000.0 * time_include * nclients / total->cnt);
+               1000.0 * time_exclude * nclients / total->cnt);
     }
 
     if (throttle_delay)
-- 
2.18.4

Re: pgbench calculates summary numbers a wrong way.

From

Kyotaro Horiguchi

Date:

18 September 2020, 08:28:22

At Fri, 18 Sep 2020 09:55:30 +0200 (CEST), Fabien COELHO <coelho@cri.ensmp.fr> wrote in 
> 
> Hello,
> 
> >> Sorry, I sent a wrong version of the patch, contains some spelling
> >> errors. This is the right one.
> >
> > Sigh.. I fixed some more wordings.
> 
> I have submitted a patch which reworks how things are computed so that
> performance figures make some/more sense, among other things.
> 
> Maybe you could have a look at it and tell whether it is an
> improvement wrt your issue?

Thanks for the pointer, I'm going to look it.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Re: pgbench calculates summary numbers a wrong way.

From

Kyotaro Horiguchi

Date:

18 September 2020, 09:39:53

At Fri, 18 Sep 2020 17:28:22 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in 
> > I have submitted a patch which reworks how things are computed so that
> > performance figures make some/more sense, among other things.
> > 
> > Maybe you could have a look at it and tell whether it is an
> > improvement wrt your issue?
> 
> Thanks for the pointer, I'm going to look it.

I remember about the topic. It started from an issue of thread/client
connection synchronization.  Current it seems to be added overhauling
of time counting code. (I think they are in separate patches).
However, the patch doesn't fix my issues.

If I re-explain my issues in few words:

The first issue is: in printResuts, conn_total_delay is assumed to be
the sum of connection delay of all clients. But actually it is the sum
of connection delay of all *threads*. As the results the connection
delay is underestimated when nclients is larger than nthreads, then
tps(excluding conn establishment) is too-small on that condition.


The second issue is: estimated latency average (without -L) is
calculated as time_include * nclients / <# of completed
txns>. Considering that the latency is the time after connection
establishement until transaction end, the "time_include" should be
time_exclude.  As the result the estimated (non -L case) average
latency gets overestimated than the measured latency (with -L case) if
connections takes a long time.

The patch doesn't affect the first issue, but alleviates the error in
the second issue. But still it doesn't handles connection delay
correctly in calculating the value when non -C mode, and doesn't fix
the error when -C (conn per tx) mode at all.

So the two patches are almost orthogonal (but heavily conflicts each
other^^;).  Anyway, it's a good opportunity, so I'll take a look on
your patch.

Thanks.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center