Thread: Doubt in pgbench TPS number

Doubt in pgbench TPS number

From
Tatsuo Ishii
Date:
Today I got an interesting output from pgbench.
The scenario is executing SELECT pg_sleep(0.1).

$ cat test.sql 
select pg_sleep(0.1);

$ pgbench -C -n -p 11002 -c 10 -T 30 -f test.sql  test
transaction type: Custom query
scaling factor: 1
query mode: simple
number of clients: 10
number of threads: 1
duration: 30 s
number of transactions actually processed: 2883
latency average: 104.058 ms
tps = 95.799114 (including connections establishing)
tps = 124.487149 (excluding connections establishing)

The interesting thing is the number from "excluding connections
establishing". 124.487149 tps means 0.008032 seconds per transaction,
i.e. about 0.080 seconds per transaction per client with 10 clients.
Since the query executes pg_sleep(0.1), I think the per-client number
should be equal to or greater than 0.1. Maybe some tolerance is expected,
but a 20% error seems too high to me.
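
To make the arithmetic explicit, here is a tiny standalone sketch
(illustrative only, not pgbench code):

    /* Illustrative sketch: with 10 clients each sleeping 0.1 s per
     * transaction, the best possible throughput is nclients / sleep_time
     * = 100 tps, so a reported 124 tps cannot be right. */
    #include <stdio.h>

    int main(void)
    {
        double sleep_time = 0.1;            /* pg_sleep(0.1) per transaction */
        int    nclients   = 10;
        double reported_tps = 124.487149;   /* "excluding connections establishing" */

        printf("ideal upper bound:       %.1f tps\n", nclients / sleep_time);
        printf("implied per-client time: %.4f s\n", nclients / reported_tps);
        return 0;
    }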

Note that if "-C" does not present, the TPS number seems to be sane.

$ pgbench -n -p 11002 -c 10 -T 30 -f test.sql  test
transaction type: Custom query
scaling factor: 1
query mode: simple
number of clients: 10
number of threads: 1
duration: 30 s
number of transactions actually processed: 2970
latency average: 101.010 ms
tps = 98.692514 (including connections establishing)
tps = 98.747053 (excluding connections establishing)
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp



Re: Doubt in pgbench TPS number

From
Fabien COELHO
Date:
Hello Tatsuo-san,

> $ pgbench -C -n -p 11002 -c 10 -T 30 -f test.sql  test
> tps = 95.799114 (including connections establishing)
> tps = 124.487149 (excluding connections establishing)
>
> Interesting thing is, the number from "excluding connections
> establishing". 124.487149 tps means 0.008032 seconds per
> transaction. Since the query executes pg_sleep(0.1), I think the
> number should be equal to or greater than 0.1. Maybe a tolerance, but
> 20% of error seems to be too high for me.

Indeed.

> Note that if "-C" does not present, the TPS number seems to be sane.

Hmmm... I never use -C. The formula seems ok:
    tps_exclude = normal_xacts / (time_include -
                        (INSTR_TIME_GET_DOUBLE(conn_total_time) / nthreads));

conn_total_time is the connection time cumulated over all threads.

A quick look at the logic reveals some minor issues (conn_time is passed 
as an argument to doCustom, although it is already available as a "thread" 
field, strange). I spotted "threads" used instead of "thread" in an 
accumulation, but it is not related to this computation.

Hmmm. I'll have a slower look...

-- 
Fabien.



Re: Doubt in pgbench TPS number

From
Fabien COELHO
Date:
Hello Tatsuo,

> Hmmm... I never use -C. The formula seems ok:
>
>    tps_exclude = normal_xacts / (time_include -
>                        (INSTR_TIME_GET_DOUBLE(conn_total_time) / nthreads));

Hmmm... it is not:-)

I think that the degree of parallelism to consider is nclients, not 
nthreads: while connection time is accumulated in conn_time, the other 
clients are possibly doing their transactions in parallel, even if it is in 
the same thread, so it is not "stopped time" for all clients. It starts to 
matter with "-j 1 -c 30" and slow transactions: the cumulated conn_time in 
each thread may be arbitrarily close to the whole time if there are many 
clients.
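
To see the size of the effect, here is a small standalone sketch
(illustrative only; conn_total_time is an assumed value chosen to roughly
match the -C run above, not a measured one):

    /* Illustrative sketch (not pgbench source): effect of the divisor on
     * tps_exclude for the -C run above (1 thread, 10 clients, 30 s, 2883
     * transactions). conn_total_time is assumed. */
    #include <stdio.h>

    int main(void)
    {
        double normal_xacts    = 2883.0;
        double time_include    = 30.0;    /* seconds */
        double conn_total_time = 6.8;     /* assumed cumulated connection time */
        int    nthreads = 1, nclients = 10;

        /* current divisor: nthreads, which subtracts far too much time */
        printf("tps_exclude with nthreads: %.1f\n",
               normal_xacts / (time_include - conn_total_time / nthreads));

        /* proposed divisor: nclients */
        printf("tps_exclude with nclients: %.1f\n",
               normal_xacts / (time_include - conn_total_time / nclients));
        return 0;
    }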

Now, I do not think that this tps computation makes that much sense. If 
you want to know the tps without reconnecting, run without reconnecting... It 
is clear that I do not get this figure when running without -C, so maybe 
the separate tps figures with/without connection establishment should be dropped?

Anyway, here is a patch, which:
 - fixes the "exclude" computation (s/nthreads/nclients/)
 - fixes the count total for skipped (s/threads/thread/)
 - removes a useless parameter to doCustom
   (the conn_time is available through the thread parameter).

-- 
Fabien.

Re: Doubt in pgbench TPS number

From
Tatsuo Ishii
Date:
> I think that the degree of parallelism to consider is nclients, not
> nthreads: while connection time is accumulated in conn_time, other
> clients are possibly doing their transactions, in parallel, 

I'm not sure about this. I think pgbench will not start transactions
until all clients establish connections to PostgreSQL.

I found this while playing with pgpool-II. pgpool-II pre-forks child
processes, whose number is defined by the "num_init_children"
directive. What I observed was that pgbench keeps connecting to pgpool-II
until "-c" connections are established. So, if "-c" is larger than
"num_init_children", no transaction starts.

> even if it
> is in the same thread, so it is not "stopped time" for all clients. It
> starts to matter with "-j 1 -c 30" and slow transactions, the
> cumulated conn_time in each thread may be arbitrary close to the whole
> time if there are many clients.
> 
> Now, I do not think that this tps computation makes that much
> sense. If you want to know the tps without reconnect, run without
> reconnecting... It is clear that I do not get this figure when running
> without -C, so maybe
> the tps with/without reconnection should be dropped?
> 
> Anyway, here is a patch, which:
>  - fixes the "exclude" computation (s/nthreads/nclients/)
>  - fixes the count total for skipped (s/threads/thread/)
>  - removes a useless parameter to doCustom
>    (the conn_time is available through the thread parameter).

I have tested your patch. It seems the tolerance is much better than
before with your patch.

With the patch:
./pgbench -C -n -p 11002 -c 10 -T 30 -f test.sql  test
transaction type: Custom query
scaling factor: 1
query mode: simple
number of clients: 10
number of threads: 1
duration: 30 s
number of transactions actually processed: 2887
latency average: 103.914 ms
tps = 95.896850 (including connections establishing)
tps = 98.101662 (excluding connections establishing)

Without the patch:
./pgbench -C -n -p 11002 -c 10 -T 30 -f test.sql  test
transaction type: Custom query
scaling factor: 1
query mode: simple
number of clients: 10
number of threads: 1
duration: 30 s
number of transactions actually processed: 2887
latency average: 103.914 ms
tps = 95.919415 (including connections establishing)
tps = 124.732475 (excluding connections establishing)

I'm going to commit your patch if there's no objection.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp



Re: Doubt in pgbench TPS number

From
Tatsuo Ishii
Date:
> I have tested your patch. It seems the tolerance is much better than
> before with your patch.

[snip]

> I'm going to commit your patch if there's no objection.

I think we should commit this to master and 9.5 stable branch only
because the fix significantly changes the benchmark result of pgbench.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp



Re: Doubt in pgbench TPS number

From
Fabien COELHO
Date:
Hello Tatsuo-san,

>> I think that the degree of parallelism to consider is nclients, not
>> nthreads: while connection time is accumulated in conn_time, other
>> clients are possibly doing their transactions, in parallel,
>
> I'm not sure about this. I think pgbench will not start transactions
> until all clients establish connections to PostgreSQL.

I think that is true if "!is_connect": all client connections are 
performed at the beginning of threadRun. But under -C each client has its 
own connect/disconnect integrated within doCustom, so it is done in 
parallel with other clients having their transactions processed, hence the 
issue with the formula.

> I found this while playing with pgpool-II. pgpool-II pre-forks child
> process, whose number is defined by "num_init_children"
> directive. What I observed was, pgbench starts connecting to pgpool-II
> until "-c" connections are established. So, if "-c" is larger than
> "num_init_children", no transaction starts.

Yep, unless -C is used, I think, in which case some clients may start 
nevertheless? Not sure... It does not matter here.

> I have tested your patch. It seems the tolerance is much better than
> before with your patch.

Yep.

> I'm going to commit your patch if there's no objection.

This seems fine with me.

The formula change, and just this one, should probably be backported 
somehow, as this is a bug, wherever pgbench resides in older versions. It 
is just 's/nthreads/nclients/' in the printResults formula for computing 
tps_exclude.

-- 
Fabien.



Re: Doubt in pgbench TPS number

From
Tatsuo Ishii
Date:
>> I'm not sure about this. I think pgbench will not start transactions
>> until all clients establish connections to PostgreSQL.
> 
> I think that is true if "!is_connect", all client connections are
> performed at the beginning of threadRun, but under -C each client has
> its own connect/deconnect integrated within doCustom, so it is done in
> parallel to other clients having their transactions processed, hence
> the issue with the formula.

Really?

I have tested with pgpool-II, which was set to accept up to 2
connections, then ran pgbench with -C and -c 32. pgbench was blocked
as expected, and I attached gdb and got this stack trace:

(gdb) bt
#0  0x00007f48d5f17110 in __poll_nocancel ()   at ../sysdeps/unix/syscall-template.S:81
#1  0x00007f48d6724056 in pqSocketCheck ()  from /usr/local/src/pgsql/current/lib/libpq.so.5
#2  0x00007f48d6724940 in pqWaitTimed ()  from /usr/local/src/pgsql/current/lib/libpq.so.5
#3  0x00007f48d671f3e2 in connectDBComplete ()  from /usr/local/src/pgsql/current/lib/libpq.so.5
#4  0x00007f48d671fbcf in PQconnectdbParams ()  from /usr/local/src/pgsql/current/lib/libpq.so.5
#5  0x0000000000402b2b in doConnect () at pgbench.c:650
#6  0x0000000000404591 in doCustom (thread=0x25c2f40, st=0x25c2770,
    conn_time=0x25c2f90, logfile=0x0, agg=0x7fffba224260) at pgbench.c:1353
#7  0x000000000040a1d5 in threadRun (arg=0x25c2f40) at pgbench.c:3581
#8  0x0000000000409ab4 in main (argc=12, argv=0x7fffba224668) at pgbench.c:3455

As you can see, one of the threads wants to connect to PostgreSQL
(actually pgpool-II) and is waiting for a reply.

In threadRun() around line 3581:
    /* send start up queries in async manner */
    for (i = 0; i < nstate; i++)
    {
        CState       *st = &state[i];
        Command   **commands = sql_files[st->use_file];
        int            prev_ecnt = st->ecnt;

        st->use_file = getrand(thread, 0, num_files - 1);
        if (!doCustom(thread, st, &thread->conn_time, logfile, &aggs))
            remains--;            /* I've aborted */

        if (st->ecnt > prev_ecnt && commands[st->state]->type == META_COMMAND)
        {
            fprintf(stderr, "client %d aborted in state %d; execution of meta-command failed\n",
                    i, st->state);
            remains--;            /* I've aborted */
            PQfinish(st->con);
            st->con = NULL;
        }
    }

Here doCustom() is called with st->con == NULL. In doCustom() around
line 1353:

    if (st->con == NULL)
    {
        instr_time    start,
                      end;

        INSTR_TIME_SET_CURRENT(start);
        if ((st->con = doConnect()) == NULL)
        {

doConnect() blocks until PostgreSQL (pgpool-II) allows the connection.

Because the outer loop in threadRun() keeps looping until all clients
successfully connect to PostgreSQL, pgbench is blocked here:

    /* send start up queries in async manner */
    for (i = 0; i < nstate; i++)

>> I'm going to commit your patch if there's no objection.
> 
> This seems fine with me.
> 
> The formula change, and just this one, should probably be backported
> somehow, as this is a bug, wherever pgbench resides in older
> versions. It is just 's/nthreads/nclients/' in the printResult formula
> for computing tps_exclude.

Yeah, that's definitely a bug, but I'm afraid the fix will change the
TPS numbers and may break backward compatibility. Since we have
lived with the bug for years, I hesitate to back-port it to older stable
branches...

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp



Re: Doubt in pgbench TPS number

From
Fabien COELHO
Date:
> #5  0x0000000000402b2b in doConnect () at pgbench.c:650
> #6  0x0000000000404591 in doCustom (thread=0x25c2f40, st=0x25c2770,
>    conn_time=0x25c2f90, logfile=0x0, agg=0x7fffba224260) at pgbench.c:1353
> #7  0x000000000040a1d5 in threadRun (arg=0x25c2f40) at pgbench.c:3581
> #8  0x0000000000409ab4 in main (argc=12, argv=0x7fffba224668) at pgbench.c:3455

Hmm. Ok, wherever the connection happens (from doCustom or from 
threadRun), doConnect would block the thread. On the other hand, if you 
start several threads, those threads which could connect all their clients 
would probably proceed.

What seems to be needed is some timeout when connecting to the 
server.
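
(For illustration only, a minimal standalone sketch of one way to bound the
connection wait, using libpq's standard connect_timeout connection
parameter; the dbname and timeout values are assumptions, and this is not
pgbench's doConnect.)

    #include <stdio.h>
    #include <stdlib.h>
    #include <libpq-fe.h>

    int main(void)
    {
        /* give up after ~5 seconds instead of blocking indefinitely */
        const char *keywords[] = {"dbname", "connect_timeout", NULL};
        const char *values[]   = {"test",   "5",               NULL};

        PGconn *conn = PQconnectdbParams(keywords, values, 0);
        if (conn == NULL || PQstatus(conn) != CONNECTION_OK)
        {
            fprintf(stderr, "connection failed: %s",
                    conn ? PQerrorMessage(conn) : "out of memory\n");
            if (conn)
                PQfinish(conn);
            return EXIT_FAILURE;
        }
        PQfinish(conn);
        return EXIT_SUCCESS;
    }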

>> The formula change, and just this one, should probably be backported
>> somehow, as this is a bug, wherever pgbench resides in older
>> versions. It is just 's/nthreads/nclients/' in the printResult formula
>> for computing tps_exclude.
>
> Yeah, that's definitely a bug but I'm afraid the fix will change the
> TPS number and may break the backward compatibility. Since we have
> lived with bug for years, I hesitate to back port to older stable
> branches...

My 2¥: I cannot think of a good argument to keep wrong tps numbers once it 
is known that they are plain wrong, especially as it is not a behavioral 
change as such which could block applications or whatever, just a 
different number printed at the end of a run. So I would not bother much 
with upward compatibility considerations in this case.

-- 
Fabien.

Re: Doubt in pgbench TPS number

From
Tatsuo Ishii
Date:
>> Yeah, that's definitely a bug but I'm afraid the fix will change the
>> TPS number and may break the backward compatibility. Since we have
>> lived with bug for years, I hesitate to back port to older stable
>> branches...
>
> My 2¥: I do not think of a good argument to keep wrong tps numbers
> once it is known that there are plain wrong, especially as it is not a
> behavioral change as such which could block applications or whatever,
> just a different number printed at the end of a run. So I would not
> bother much with upward compatibility consideration in this case.
>
> --
> Fabien.

FYI, the bug has been there since thread support was introduced in commit:

da0dfb4b1460c3701abc8ed5f516d138dc4654c

That was in 9.0, which was released in 2010.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp



Re: Doubt in pgbench TPS number

From
Tom Lane
Date:
Fabien COELHO <coelho@cri.ensmp.fr> writes:
>> Yeah, that's definitely a bug but I'm afraid the fix will change the
>> TPS number and may break the backward compatibility. Since we have
>> lived with bug for years, I hesitate to back port to older stable
>> branches...

> My 2¥: I do not think of a good argument to keep wrong tps numbers once it 
> is known that there are plain wrong, especially as it is not a behavioral 
> change as such which could block applications or whatever, just a 
> different number printed at the end of a run. So I would not bother much 
> with upward compatibility consideration in this case.

FWIW, I vote with Tatsuo-san.  Such a change will break comparability of
results with all previous versions, which means it's not something to do
in minor releases, even if we now believe the previous results were
somewhat bogus.  Arguing that it's "not a behavioral change" seems
quite loony to me: for most people, the TPS numbers are the only
interesting output of pgbench.

I think we should fix it in 9.5 and document it as an incompatible change.

(Note: I've not read the patch, so this is not an opinion about its
correctness.)
        regards, tom lane



Re: Doubt in pgbench TPS number

From
Fabien COELHO
Date:
Hello Tom,

> FWIW, I vote with Tatsuo-san.  Such a change will break comparability of
> results

I would not classify a performance measure as a "result compatibility" 
issue. What matters is the *correctness* of the results.

When a bug is fixed anywhere in pg it may change performance significantly 
for some tests, and I have never seen this as a relevant consideration not 
to fix a problem...

> with all previous versions, which means it's not something to do
> in minor releases, even if we now believe the previous results were
> somewhat bogus.  Arguing that it's "not a behavioral change" seems
> quite loony to me: for most people, the TPS numbers are the only
> interesting output of pgbench.

I think that if people are interested in tps without reconnecting on each 
transaction they would not run with "-C" to trigger reconnections and then 
look at the "tps without connections" for the figure they want... so I do 
not think that keeping this error is worth anything.

On the other hand, and on principle, keeping the bug looks wrong. I 
cannot agree with the logic behind shipping something which is known to be 
bad, especially when it displays *optimistic* performance figures... It 
looks too much like the VW way:-)

> I think we should fix it in 9.5 and document it as an incompatible 
> change.

Hmm.

> (Note: I've not read the patch, so this is not an opinion about its
> correctness.)

That is another question:-)

-- 
Fabien.



Re: Doubt in pgbench TPS number

From
Tatsuo Ishii
Date:
> FWIW, I vote with Tatsuo-san.  Such a change will break comparability of
> results with all previous versions, which means it's not something to do
> in minor releases, even if we now believe the previous results were
> somewhat bogus.  Arguing that it's "not a behavioral change" seems
> quite loony to me: for most people, the TPS numbers are the only
> interesting output of pgbench.
> 
> I think we should fix it in 9.5 and document it as an incompatible change.

Plus, if a user of PostgreSQL 9.4 or earlier wants to use a fixed
version of pgbench, he/she can always use the pgbench that comes with
PostgreSQL 9.5 or later.

> (Note: I've not read the patch, so this is not an opinion about its
> correctness.)

As Fabien analyzed, the problem was that pgbench simply uses the wrong
variable: nthreads (number of threads, specified by the "-j" option)
vs. nclients (number of clients, specified by the "-c" option).

@@ -2616,7 +2616,7 @@ printResults(int ttype, int64 normal_xacts, int nclients,
     time_include = INSTR_TIME_GET_DOUBLE(total_time);
     tps_include = normal_xacts / time_include;
     tps_exclude = normal_xacts / (time_include -
-                        (INSTR_TIME_GET_DOUBLE(conn_total_time) / nthreads));
+                        (INSTR_TIME_GET_DOUBLE(conn_total_time) / nclients));

Here conn_total_time is the sum of the time spent establishing connections
to PostgreSQL. Since establishing connections to PostgreSQL is done
serially rather than in parallel, conn_total_time should have been
divided by nclients.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp



Re: Doubt in pgbench TPS number

From
Fabien COELHO
Date:
>> (Note: I've not read the patch, so this is not an opinion about its
>> correctness.)
>
> As Fabien anayzed the problem was that pgbench simply uses wrong
> variable: nthreads (number of threads, specified by "-j" option)
> vs. nclients (number of clients, specified by "-c" option).
>
> @@ -2616,7 +2616,7 @@ printResults(int ttype, int64 normal_xacts, int nclients,
>     time_include = INSTR_TIME_GET_DOUBLE(total_time);
>     tps_include = normal_xacts / time_include;
>     tps_exclude = normal_xacts / (time_include -
> -                        (INSTR_TIME_GET_DOUBLE(conn_total_time) / nthreads));
> +                        (INSTR_TIME_GET_DOUBLE(conn_total_time) / nclients));
>
> Here conn_total_time is the sum of time to establish connection to
> PostgreSQL. Since establishing connections to PostgreSQL is done in
> serial rather than in parallel, conn_total_time should have been
> divided by nclients.

After some more thinking and looking again at the connection code, I 
revise my diagnosis slightly:
 - the amount of parallelism is "nclients", as discussed above, when 
reconnecting on each transaction (-C), because the connections are managed 
in parallel from doCustom,

* BUT *
 - if there are no reconnections (no -C), the connections are performed in 
threadRun sequentially; all clients wait for the connections of the other 
clients in the same thread before starting to process transactions, so 
"nthreads" is the right amount of parallelism in this case.

So on second thought the formula should rather be:
  ...  / (is_connect? nclients: nthreads)

Here is a v2 with this formula. Note that it does not take into account 
thread creation time, which might be significant; the start and end of a 
pgbench run are quite fuzzy.
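
A minimal sketch of the v2 idea (variable names follow this thread, not the
exact pgbench source; the numbers in main() are assumed, just to exercise
the formula):

    #include <stdbool.h>
    #include <stdio.h>

    /* With -C the clients connect in parallel from doCustom, so the
     * connection time is spread over nclients; without -C each thread
     * connects its clients serially before any transaction starts, so
     * the stopped time is per thread. */
    static double
    tps_excluding_connections(double normal_xacts, double time_include,
                              double conn_total_time,   /* summed over threads */
                              int nclients, int nthreads, bool is_connect)
    {
        int divisor = is_connect ? nclients : nthreads;

        return normal_xacts / (time_include - conn_total_time / divisor);
    }

    int main(void)
    {
        /* assumed figures, loosely based on the runs earlier in this thread */
        printf("-C run:    %.1f tps\n",
               tps_excluding_connections(2883, 30.0, 6.8, 10, 1, true));
        printf("no -C run: %.1f tps\n",
               tps_excluding_connections(2970, 30.0, 0.1, 10, 1, false));
        return 0;
    }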

I've removed the doCustom parameter change as it is included in a larger 
patch I submitted about reworking stat collections in pgbench, so this 
attached patch is bug fix only.

For the record, I still plainly disagree with the idea of shipping a 
performance measuring tool which knowingly displays wrong and optimistic 
figures under some conditions.

-- 
Fabien.

Re: Doubt in pgbench TPS number

From
Tatsuo Ishii
Date:
>> Here conn_total_time is the sum of time to establish connection to
>> PostgreSQL. Since establishing connections to PostgreSQL is done in
>> serial rather than in parallel, conn_total_time should have been
>> divided by nclients.
> 
> After some more thinking and looking again at the connection code, I
> revise slightly my diagnostic:
> 
>  - the amount of parallelism is "nclients", as discussed above, when
>    reconnecting on each transaction (-C) because the connections are
>    managed in parallel from doCustom,
> 
> * BUT *
> 
>  - if there is no reconnections (not -C) the connections are performed in
>    threadRun in a sequential way, all clients wait for the connections of
>    other clients in the same thread before starting processing
>    transactions, so "nthreads" is the right amount of parallelism in this
>    case.
> 
> So on second thought the formula should rather be:
> 
>   ...  / (is_connect? nclients: nthreads)

I don't think this is quite correct.

If is_connect is false, then the following loop is executed in threadRun():

    /* make connections to the database */
    for (i = 0; i < nstate; i++)
    {
        if ((state[i].con = doConnect()) == NULL)
            goto done;
    }

Here, nstate is nclients/nthreads. Suppose nclients = 16 and nthreads
= 2; then 2 threads run in parallel, and each thread connects 8
times (nstate = 8) in *serial*. The total connection time for a
thread is calculated as "the time the loop ends" - "the time the loop
starts". So if the time to establish a connection is 1 second, the
total connection time for a thread will be 8 seconds. Thus the grand
total of connection time will be 2 * 8 = 16 seconds.

If is_connect is true, the following loop is executed:

    /* send start up queries in async manner */
    for (i = 0; i < nstate; i++)
    {
        CState       *st = &state[i];
        Command   **commands = sql_files[st->use_file];
        int            prev_ecnt = st->ecnt;

        st->use_file = getrand(thread, 0, num_files - 1);
        if (!doCustom(thread, st, &thread->conn_time, logfile, &aggs))

In this loop, exactly the same thing happens as in the is_connect = false
case. If the per-connection time is 1 second, the total connection time
will be the same as in the is_connect = false case, i.e. 16 seconds.

In summary, I see no reason to change the v1 patch.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp



Re: Doubt in pgbench TPS number

From
Tatsuo Ishii
Date:
>> FWIW, I vote with Tatsuo-san.  Such a change will break comparability of
>> results with all previous versions, which means it's not something to do
>> in minor releases, even if we now believe the previous results were
>> somewhat bogus.  Arguing that it's "not a behavioral change" seems
>> quite loony to me: for most people, the TPS numbers are the only
>> interesting output of pgbench.
>> 
>> I think we should fix it in 9.5 and document it as an incompatible change.
> 
> Plus if 9.4 or earlier version of PostgreSQL user wants to use fixed
> version of pgbench, he/she can always use pgbench coming with 9.5 or
> later version of PostgreSQL.
> 
>> (Note: I've not read the patch, so this is not an opinion about its
>> correctness.)
> 
> As Fabien anayzed the problem was that pgbench simply uses wrong
> variable: nthreads (number of threads, specified by "-j" option)
> vs. nclients (number of clients, specified by "-c" option).
> 
> @@ -2616,7 +2616,7 @@ printResults(int ttype, int64 normal_xacts, int nclients,
>      time_include = INSTR_TIME_GET_DOUBLE(total_time);
>      tps_include = normal_xacts / time_include;
>      tps_exclude = normal_xacts / (time_include -
> -                        (INSTR_TIME_GET_DOUBLE(conn_total_time) / nthreads));
> +                        (INSTR_TIME_GET_DOUBLE(conn_total_time) / nclients));
> 
> Here conn_total_time is the sum of time to establish connection to
> PostgreSQL. Since establishing connections to PostgreSQL is done in
> serial rather than in parallel, conn_total_time should have been
> divided by nclients.

Fix committed to master and 9.5 stable branches.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp



Re: Doubt in pgbench TPS number

From
Fabien COELHO
Date:
Hello Tatsuo,

>> So on second thought the formula should rather be:
>>
>>   ...  / (is_connect? nclients: nthreads)
>
> I don't think this is quite correct.
>
> If is_connect is false, then following loop is executed in threadRun():
>
>         /* make connections to the database */
>         for (i = 0; i < nstate; i++)
>         {
>             if ((state[i].con = doConnect()) == NULL)
>                 goto done;
>         }

Yep. The loop initializes all client connections *BEFORE* starting any 
transactions on any client; the thread only does connections at this 
time, which is included in conn_time.

> Here, nstate is nclients/nthreads. Suppose nclients = 16 and nthreads
> = 2, then 2 threads run in parallel, and each thread is connecting 8
> times (nstate = 8) in *serial*.

Yes.

> The total connection time for this thread is calculated by "the time 
> ends the loop" - "the time starts the loop". So if the time to establish 
> a connection is 1 second, the total connection time for a thread will be 
> 8 seconds. Thus grand total of connection time will be 2 * 8 = 16 
> seconds.

Yes, 16 seconds over 2 threads: 8 seconds per thread of the "real time" of 
the test is spent in connections, and no transactions are processed during 
that time.

> If is_connect is true, following loop is executed.
>
>     /* send start up queries in async manner */
>     for (i = 0; i < nstate; i++)
>     {
>         CState       *st = &state[i];
>         Command   **commands = sql_files[st->use_file];
>         int            prev_ecnt = st->ecnt;
>
>         st->use_file = getrand(thread, 0, num_files - 1);
>         if (!doCustom(thread, st, &thread->conn_time, logfile, &aggs))
>
> In the loop, exactly same thing happens as is_connect = false case. If
> t = 1, total connection time will be same as is_connect = false case,
> i.e. 16 seconds.

Without -C, 1 thread, 2 clients, if transactions take the same time as 
connections:

 Client 1:  C-|TTTTTTTTTTTTTTTTTTTTTTTTTTT
 Client 2:  -C|TTTTTTTTTTTTTTTTTTTTTTTTTTT
            <> connection time (initial loop in threadRun)
            <----------------------------> whole execution time
               <-------------------------> transaction time

The actual transaction time to consider in this drawing is the whole time 
minus the connection time of the thread, which is serialised. It is not 
the whole execution time minus connection time / 2 (the number of clients), 
because of the '|' synchronisation (clients do not start before all the 
other clients of the thread are connected).
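
To put numbers on this, using the 16-client / 2-thread / 1-second-per-connection
example from the previous mail (assumed figures, for illustration only):

    /* Illustrative arithmetic only: 16 clients, 2 threads, 1 s per
     * connection, no -C. Each thread spends 8 s connecting before any of
     * its clients runs a transaction. */
    #include <stdio.h>

    int main(void)
    {
        int    nclients = 16, nthreads = 2;
        double per_connection  = 1.0;                                  /* seconds */
        double per_thread_conn = per_connection * nclients / nthreads; /* 8 s, serial */
        double conn_total_time = per_thread_conn * nthreads;           /* 16 s summed */

        /* "stopped time" actually seen by each thread before transactions start */
        printf("dead time per thread:       %.1f s\n", per_thread_conn);
        printf("conn_total_time / nthreads: %.1f s\n", conn_total_time / nthreads);
        printf("conn_total_time / nclients: %.1f s\n", conn_total_time / nclients);
        return 0;
    }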

With -C, there is no initial connection; the connections are managed within
doCustom, which does transaction processing asynchronously.

 Client 1:  CTCTCTCTCTCTCTCTCTCTCTCTCTCT-
 Client 2:  -CTCTCTCTCTCTCTCTCTCTCTCTCTCT
            <----------------------------> whole execution time
            <---------------------------->  measured connection time

While a client is connecting, the other client is performing its 
transaction in an asynchronous manner, so the measured connection time may 
be arbitrarily close to the execution time; this was the bug you detected.

> In summary, I see no reason to change the v1 patch.

I still think that my revised thinking is the right one... I hope that the 
above drawings make my thinking clearer. For me, the initial formula was 
the right one when not using -C; only the -C case needs fixing.

-- 
Fabien.



Re: [HACKERS] Doubt in pgbench TPS number

From
Andres Freund
Date:
Hi,

On 2015-09-25 20:35:45 +0200, Fabien COELHO wrote:
> 
> Hello Tatsuo,
> 
> > Hmmm... I never use -C. The formula seems ok:
> > 
> >    tps_exclude = normal_xacts / (time_include -
> >                        (INSTR_TIME_GET_DOUBLE(conn_total_time) / nthreads));
> 
> Hmmm... it is not:-)
> 
> I think that the degree of parallelism to consider is nclients, not
> nthreads: while connection time is accumulated in conn_time, other clients
> are possibly doing their transactions, in parallel, even if it is in the
> same thread, so it is not "stopped time" for all clients. It starts to
> matter with "-j 1 -c 30" and slow transactions, the cumulated conn_time in
> each thread may be arbitrary close to the whole time if there are many
> clients.

I think this pretty much entirely broke the tps_exclude logic when not
using -C, especially when -c and -j differ. The wait time there is
actually per thread, not per client.

In this example I set post_auth_delay=1s on the server. Pgbench
tells me:
pgbench -M prepared -c 180 -j 180 -T 10 -P1 -S
tps = 897607.544862 (including connections establishing)
tps = 1004793.708611 (excluding connections establishing)

pgbench -M prepared -c 180 -j 60 -T 10 -P1 -S
tps = 739502.979613 (including connections establishing)
tps = 822639.038779 (excluding connections establishing)

pgbench -M prepared -c 180 -j 30 -T 10 -P1 -S
tps = 376468.177081 (including connections establishing)
tps = 418554.527585 (excluding connections establishing)

which pretty obviously is bogus. While I'd not expect it to work
perfectly, the "excluding" number should stay roughly constant.


The fundamental issue is that without -C *none* of the connections in
each thread gets to actually execute work before all of them have
established a connection. So dividing conn_total_time by nclients is
wrong.

For more realistic connection delays this leads to the 'excluding'
number being way too close to the 'including' number, even if a
substantial portion of the time is spent on it.
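
To put rough numbers on the -c 180 -j 30 case (assuming the 1 s
post_auth_delay dominates each connection and ignoring thread startup;
illustrative only):

    #include <stdio.h>

    int main(void)
    {
        int    nclients = 180, nthreads = 30;
        double per_connection  = 1.0;                                  /* ~post_auth_delay */
        double per_thread_conn = per_connection * nclients / nthreads; /* ~6 s of a 10 s run */
        double conn_total_time = per_thread_conn * nthreads;           /* ~180 s summed */

        /* time subtracted from the 10 s run by each variant of the formula */
        printf("conn_total_time / nclients: %.1f s\n", conn_total_time / nclients); /* 1.0 s */
        printf("conn_total_time / nthreads: %.1f s\n", conn_total_time / nthreads); /* 6.0 s */
        return 0;
    }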

Greetings,

Andres Freund



Re: [HACKERS] Doubt in pgbench TPS number

From
Andres Freund
Date:
On 2020-02-27 12:26:36 -0800, Andres Freund wrote:
> Hi,
> 
> On 2015-09-25 20:35:45 +0200, Fabien COELHO wrote:
> > 
> > Hello Tatsuo,
> > 
> > > Hmmm... I never use -C. The formula seems ok:
> > > 
> > >    tps_exclude = normal_xacts / (time_include -
> > >                        (INSTR_TIME_GET_DOUBLE(conn_total_time) / nthreads));
> > 
> > Hmmm... it is not:-)
> > 
> > I think that the degree of parallelism to consider is nclients, not
> > nthreads: while connection time is accumulated in conn_time, other clients
> > are possibly doing their transactions, in parallel, even if it is in the
> > same thread, so it is not "stopped time" for all clients. It starts to
> > matter with "-j 1 -c 30" and slow transactions, the cumulated conn_time in
> > each thread may be arbitrary close to the whole time if there are many
> > clients.
> 
> I think this pretty much entirely broke the tps_exclude logic when not
> using -C, especially when -c and -j differ. The wait time there is
> actually per thread, not per client.
> 
> In this example I set post_auth_delay=1s on the server. Pgbench
> tells me:
> pgbench -M prepared -c 180 -j 180 -T 10 -P1 -S
> tps = 897607.544862 (including connections establishing)
> tps = 1004793.708611 (excluding connections establishing)
> 
> pgbench -M prepared -c 180 -j 60 -T 10 -P1 -S
> tps = 739502.979613 (including connections establishing)
> tps = 822639.038779 (excluding connections establishing)
> 
> pgbench -M prepared -c 180 -j 30 -T 10 -P1 -S
> tps = 376468.177081 (including connections establishing)
> tps = 418554.527585 (excluding connections establishing)
> 
> which pretty obviously is bogus. While I'd not expect it'd to work
> perfectly, the "excluding" number should stay roughly constant.
> 
> 
> The fundamental issue is that without -C *none* of the connections in
> each thread gets to actually execute work before all of them have
> established a connection. So dividing conn_total_time by / nclients is
> wrong.
> 
> For more realistic connection delays this leads to the 'excluding'
> number being way too close to the 'including' number, even if a
> substantial portion of the time is spent on it.

Not suggesting it as an actual fix, but just multiplying the computed
conn_time within threadRun() by the number of connections the thread
handles (for the non -C case) results in much more reasonable output.

pgbench -M prepared -c 180 -j 30 -T 10 -P1 -S
before:
tps = 378393.985650 (including connections establishing)
tps = 420691.025930 (excluding connections establishing)
after:
tps = 379818.929680 (including connections establishing)
tps = 957405.785600 (excluding connections establishing)

pgbench -M prepared -c 180 -j 180 -T 10 -P1 -S

before:
tps = 906662.031099 (including connections establishing)
tps = 1012223.500880 (excluding connections establishing)

after:
tps = 906178.154876 (including connections establishing)
tps = 1012431.154463 (excluding connections establishing)


The numbers are still a bit bogus because of thread startup overhead
being included, and conn_time being computed relative to the thread
creation. But they at least make some basic sense.

diff --git i/src/bin/pgbench/pgbench.c w/src/bin/pgbench/pgbench.c
index 1159757acb0..3bc45107136 100644
--- i/src/bin/pgbench/pgbench.c
+++ w/src/bin/pgbench/pgbench.c
@@ -6265,6 +6265,16 @@ threadRun(void *arg)
     INSTR_TIME_SET_CURRENT(thread->conn_time);
     INSTR_TIME_SUBTRACT(thread->conn_time, thread->start_time);
 
+    /* add once for each other connection */
+    if (!is_connect)
+    {
+        instr_time e = thread->conn_time;
+        for (i = 0; i < (nstate - 1); i++)
+        {
+            INSTR_TIME_ADD(thread->conn_time, e);
+        }
+    }
+
     /* explicitly initialize the state machines */
     for (i = 0; i < nstate; i++)
     {

Greetings,

Andres Freund