Re: Doubt in pgbench TPS number - Mailing list pgsql-hackers
From | Fabien COELHO |
---|---|
Subject | Re: Doubt in pgbench TPS number |
Date | |
Msg-id | alpine.DEB.2.10.1509300818520.22913@sto |
In response to | Re: Doubt in pgbench TPS number (Tatsuo Ishii <ishii@postgresql.org>) |
List | pgsql-hackers |
Hello Tatsuo,

>> So on second thought the formula should rather be:
>>
>>   ... / (is_connect? nclients: nthreads)
>
> I don't think this is quite correct.
>
> If is_connect is false, then following loop is executed in threadRun():
>
>     /* make connections to the database */
>     for (i = 0; i < nstate; i++)
>     {
>         if ((state[i].con = doConnect()) == NULL)
>             goto done;
>     }

Yep. The loop initializes all client connections *BEFORE* starting any
transactions on any client. The thread only makes connections at this
time, and that time is included in conn_time.

> Here, nstate is nclients/nthreads. Suppose nclients = 16 and nthreads
> = 2, then 2 threads run in parallel, and each thread is connecting 8
> times (nstate = 8) in *serial*.

Yes.

> The total connection time for this thread is calculated by "the time
> ends the loop" - "the time starts the loop". So if the time to establish
> a connection is 1 second, the total connection time for a thread will be
> 8 seconds. Thus grand total of connection time will be 2 * 8 = 16
> seconds.

Yes, 16 seconds in 2 threads, 8 seconds per thread of the "real time" of
the test is spent in the connection, and no

> If is_connect is true, following loop is executed.
>
>     /* send start up queries in async manner */
>     for (i = 0; i < nstate; i++)
>     {
>         CState *st = &state[i];
>         Command **commands = sql_files[st->use_file];
>         int prev_ecnt = st->ecnt;
>
>         st->use_file = getrand(thread, 0, num_files - 1);
>         if (!doCustom(thread, st, &thread->conn_time, logfile, &aggs))
>
> In the loop, exactly same thing happens as is_connect = false case. If
> t = 1, total connection time will be same as is_connect = false case,
> i.e. 16 seconds.

Without -C, 1 thread, 2 clients, if transactions take the same time as
connections:

  Client 1: C-|TTTTTTTTTTTTTTTTTTTTTTTTTTT
  Client 2: -C|TTTTTTTTTTTTTTTTTTTTTTTTTTT
            <> connection time (initial loop in threadRun)
            <----------------------------> whole execution time
               <-------------------------> transaction time

The actual transaction time to consider on this drawing is the whole
execution time minus the connection time of the thread, which is
serialised. It is not the whole execution time minus connection time / 2
(number of clients), because of the '|' synchronisation (clients do not
start before all other clients of the thread are connected).

With -C, there is no initial connection. The connections are managed
within doCustom, which does transaction processing asynchronously.

  Client 1: CTCTCTCTCTCTCTCTCTCTCTCTCTCT-
  Client 2: -CTCTCTCTCTCTCTCTCTCTCTCTCTCT
            <---------------------------> whole execution time
            <--------------------------> measured connection time

While a client is connecting, the other client is performing its
transaction in an asynchronous manner, so the measured connection time
may be arbitrarily close to the execution time. This was the bug you
detected.

> In summary, I see no reason to change the v1 patch.

I still think that my revised thinking is the right one... I hope that
the above drawings make my thinking clearer. For me, the initial formula
was the right one when not using -C; only the -C case needs fixing.

-- 
Fabien.
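
To make the arithmetic concrete, here is a minimal sketch of the divisor
argued for above. It is not pgbench's actual code: the function names,
the argument names (conn_total_time, time_include, xacts) and the idea
that conn_total_time is the connection time summed over all threads are
assumptions made for illustration. Only the divisor
(is_connect ? nclients : nthreads) comes from the discussion.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper (assumption, not taken from pgbench): connection
 * time, in seconds, to subtract from the wall-clock duration of the run.
 *
 * conn_total_time: connection time summed over all threads (assumed)
 * is_connect:      true when -C is used (reconnect for each transaction)
 */
static double
connection_time_to_subtract(double conn_total_time, bool is_connect,
                            int nclients, int nthreads)
{
    assert(nclients > 0 && nthreads > 0);

    /*
     * Without -C, each thread connects its clients serially before any
     * transaction starts, so the per-thread average is what the run lost.
     * With -C, connections overlap with other clients' transactions, so
     * the sum is averaged over clients instead.
     */
    return conn_total_time / (is_connect ? nclients : nthreads);
}

/*
 * Hypothetical helper: transactions per second once the connection time
 * computed above has been removed from the total run time.
 */
static double
tps_excluding_connections(long xacts, double time_include,
                          double conn_total_time, bool is_connect,
                          int nclients, int nthreads)
{
    double work_time = time_include -
        connection_time_to_subtract(conn_total_time, is_connect,
                                    nclients, nthreads);

    assert(work_time > 0.0);
    return (double) xacts / work_time;
}
```

With the numbers from the quoted example (16 clients, 2 threads, 1 second
per connection, so 16 seconds in total), the sketch subtracts 16 / 2 = 8
seconds without -C, which matches the 8 seconds each thread really spends
connecting before its clients start.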