Re: pgbench calculates summary numbers a wrong way. - Mailing list pgsql-hackers

From Kyotaro Horiguchi
Subject Re: pgbench calculates summary numbers a wrong way.
Date
Msg-id 20200917.175223.114069624543535216.horikyota.ntt@gmail.com
In response to pgbench calculates summary numbers a wrong way.  (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
Responses Re: pgbench calculates summary numbers a wrong way.
List pgsql-hackers
Mmm..

At Thu, 17 Sep 2020 17:41:54 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in 
> While running pgbench, I found the "tps excluding connections
> establishing" value somewhat strange. To emphasize the behavior, I
> inserted some sleep at the end of doConnect() and ran pgbench several times.
<patch attached>

Sorry, I sent the wrong version of the patch, which contained some
spelling errors. This is the right one.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
From 77d29d47db572462054a050c22c76d1912b6c4d4 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horikyoga.ntt@gmail.com>
Date: Thu, 17 Sep 2020 17:20:10 +0900
Subject: [PATCH v2] Fix latency and tps calculation of pgbench

Fix the calculation of "latency average" and "tps excluding
connections establishing" so that they are not wrongly affected by
connection establishment time.
---
 src/bin/pgbench/pgbench.c | 25 ++++++++++++++++++++++---
 1 file changed, 22 insertions(+), 3 deletions(-)

diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 332eabf637..edee25f12a 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -5194,16 +5194,35 @@ printResults(StatsData *total, instr_time total_time,
              instr_time conn_total_time, int64 latency_late)
 {
     double        time_include,
+                time_exclude,
                 tps_include,
                 tps_exclude;
     int64        ntx = total->cnt - total->skipped;
 
     time_include = INSTR_TIME_GET_DOUBLE(total_time);
 
+    /*
+     * conn_total_time is the sum of the time each client took to establish a
+     * connection. In the multi-threaded case, all clients running on a thread
+     * wait until all of that thread's clients are connected, so the actual
+     * total connection time of a thread is thread->conn_time * thread->nstate.
+     * Thus the time taken per client for connection establishment is:
+     *
+     *   sum(thread->conn_time * thread->nstate) / nclients
+     *
+     * Assuming all clients took the same time to establish a connection and
+     * clients are distributed equally among threads, the expression is
+     * approximated as:
+     *
+     *   sum(thread->conn_time) * (nclients/nthreads) / nclients
+     * = conn_total_time / nthreads
+     */
+    time_exclude = (time_include - (INSTR_TIME_GET_DOUBLE(conn_total_time) /
+                                    nthreads));
+
     /* tps is about actually executed transactions */
     tps_include = ntx / time_include;
-    tps_exclude = ntx /
-        (time_include - (INSTR_TIME_GET_DOUBLE(conn_total_time) / nclients));
+    tps_exclude = ntx / time_exclude;
 
     /* Report test parameters. */
     printf("transaction type: %s\n",
@@ -5249,7 +5268,7 @@ printResults(StatsData *total, instr_time total_time,
     {
         /* no measurement, show average latency computed from run time */
         printf("latency average = %.3f ms\n",
-               1000.0 * time_include * nclients / total->cnt);
+               1000.0 * time_exclude * nclients / total->cnt);
     }
 
     if (throttle_delay)
-- 
2.18.4
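
To make the effect of the changed divisor concrete, here is a minimal standalone sketch of the arithmetic in printResults() before and after the patch. It is not part of the patch: the BenchSummary struct and the function names are hypothetical, introduced only for illustration, and the fields simply mirror the values the patched code reads (time_include, conn_total_time, ntx, total->cnt, nclients, nthreads).

```c
#include <stdio.h>

/* Hypothetical container for the values printResults() works with. */
typedef struct
{
    double time_include;    /* total run time, in seconds */
    double conn_total_time; /* summed connection time over threads, seconds */
    long   ntx;             /* transactions actually executed */
    long   cnt;             /* total->cnt: executed plus skipped */
    int    nclients;
    int    nthreads;
} BenchSummary;

/* Pre-patch: the summed connection time is divided by nclients. */
double
tps_exclude_old(const BenchSummary *s)
{
    return s->ntx / (s->time_include - s->conn_total_time / s->nclients);
}

/* Patched: the summed connection time is divided by nthreads. */
double
time_exclude_new(const BenchSummary *s)
{
    return s->time_include - s->conn_total_time / s->nthreads;
}

double
tps_exclude_new(const BenchSummary *s)
{
    return s->ntx / time_exclude_new(s);
}

/* Patched "latency average" for the no-measurement case, in ms. */
double
latency_avg_ms_new(const BenchSummary *s)
{
    return 1000.0 * time_exclude_new(s) * s->nclients / s->cnt;
}

/* Print both variants side by side for one set of inputs. */
void
print_comparison(const BenchSummary *s)
{
    printf("tps excluding connections (old) = %.3f\n", tps_exclude_old(s));
    printf("tps excluding connections (new) = %.3f\n", tps_exclude_new(s));
    printf("latency average (new) = %.3f ms\n", latency_avg_ms_new(s));
}
```

As an assumed example, take 10 clients on 2 threads, a 15 s run in which each thread spent 5 s connecting (so conn_total_time is 10 s), and 10000 transactions. The pre-patch formula subtracts only 1 s and reports about 714 tps, while the patched formula subtracts 5 s and reports 1000 tps with a latency average of 10 ms.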

