Thread: Patch to show individual statement latencies in pgbench output
Hi To be able to asses the performance characteristics of the different wal-related options, I patched pgbench to show the averagelatency of each individual statement. The idea is to be able to compare the latency of the COMMIT with the ones ofthe other statements. This patch adds two new finds to the Command struct, elapsed and cnt, which accumulate the total time spent on each statementand the number of times it was executed. printResults essentially prings elapsed/cnt for each statement, followedby the statement itself. Patch is attached. best regards, Florian Pflug
Attachment
Florian Pflug wrote: > To be able to asses the performance characteristics of the different wal-related options, I patched pgbench to show theaverage latency of each individual statement. The idea is to be able to compare the latency of the COMMIT with the onesof the other statements. > That's an interesting idea, particularly given that people don't really understand where the time is going in the standard pgbench test. Your first bit of review feedback is that this would have to be something you could toggle on and off, there's no way most people want to pay this penalty. If you submit a new patch with a command line option to enable this alternate logging format and add the result to https://commitfest.postgresql.org/action/commitfest_view?id=6 , you can put my name down as a reviewer and I'll take a deeper look at it as part of that. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support greg@2ndQuadrant.com www.2ndQuadrant.us
On Jun 14, 2010, at 2:22 , Greg Smith wrote: > Florian Pflug wrote: >> To be able to asses the performance characteristics of the different wal-related options, I patched pgbench to show theaverage latency of each individual statement. The idea is to be able to compare the latency of the COMMIT with the onesof the other statements. > > That's an interesting idea, particularly given that people don't really understand where the time is going in the standardpgbench test. Your first bit of review feedback is that this would have to be something you could toggle on andoff, there's no way most people want to pay this penalty. If you submit a new patch with a command line option to enablethis alternate logging format and add the result to https://commitfest.postgresql.org/action/commitfest_view?id=6 ,you can put my name down as a reviewer and I'll take a deeper look at it as part of that. Thank for offering to review this! Here is an updated patch that adds a command-line option (-r) to enable/disable per-command latency reporting. It also usesthe INSTR_TIME infrastructure that 9.0 provides, and should work correctly in multi-threaded mode. Data is collectedper-thread and summarized only for reporting to avoid locking overhead. It now shows all lines for the SQL scriptstogether with their latencies (zero for comments), not only those containing SQL commands. I'll add this patch to the next commitfest, and put you down as a reviewer, as you suggested. best regards, Florian Pflug
Attachment
Finally got around to taking a longer look at your patch, sorry about the delay here. Patch itself seems to work on simple tests anyway (more on the one suspect bit below). You didn't show what the output looks like, so let's start with that because it is both kind of neat and not what I expected from your description. Here's the sort of extra stuff added to the end of the standard output when you toggle this feature on: $ pgbench -S pgbench -T 10 -c 8 -j 4 -l -r ... tps = 28824.943954 (excluding connections establishing) command latencies 0.001487 \set naccounts 100000 * :scale 0.000677 \setrandom aid 1 :naccounts 0.273983 SELECT abalance FROM pgbench_accounts WHERE aid = :aid; From the way you described the patch, I had thought that you were just putting this information into the log files or something like that. In fact, it's not in the log files; it just shows up in this summary at the end. I'm not sure if that's best or not. Some might want to see how the per-statement variation varies over time. Sort of torn on whether the summary alone is enough detail or not. Let me play with this some more and get back to you on that. Here's what a standard test looks like: tps = 292.468349 (excluding connections establishing) command latencies 0.002120 \set nbranches 1 * :scale 0.000682 \set ntellers 10 * :scale 0.000615 \set naccounts 100000 * :scale 0.000723 \setrandom aid 1 :naccounts 0.000522 \setrandom bid 1 :nbranches 0.000553 \setrandom tid 1 :ntellers 0.000525 \setrandom delta -5000 5000 0.070307 BEGIN; 1.721631 UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid; 0.147854 SELECT abalance FROM pgbench_accounts WHERE aid = :aid; 11.894366 UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid; 4.761715 UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid; 0.643895 INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP); 7.452017 END; I'm happy to see that the END here has a significant portion of time assigned to it, which means that it's correctly tracking the commit that happens there. It's possible to ask the question "why add this feature when pg_stat_statements will get you the same data?". I think this gives a different enough view of things that it's worth adding anyway. Getting the END statements tracked, not having to use prepared statements to make it work, and now having to worry about overflowing the pg_stat_statements.max parameter that limits what that module tracks are all points in favor of this patch being useful even if you know about pg_stat_statements. Now onto the nitpicking. With the standard Ubuntu compiler warnings on I get this: pgbench.c:1588: warning: ‘i’ may be used uninitialized in this function If you didn't see that, you may want to double-check how verbose you have your compiler setup to be; maybe you just missed it (which is of course what reviewers are here for). Regardless, the troublesome bit is this: int i; commands = process_commands(&buf[i]); Which is obviously not a good thing. I'm not sure entirely what you're doing with the changes you made to process_file, but I'd suggest you double check the logic and coding of that section because there's at least one funny thing going on here with i being used without initialization first here. I didn't try yet to figure out how this error might lead to a bug, but there probably is one. This looks like a good feature to me, just not sure if it's worth extending to produce even more output if people want to see it. Can always layer that on top later. I'll continue testing and try to get a firmer opinion. Please take a look at the problem I pointed out and produce a new patch when you get a chance that fixes that part, so at least we don't get stuck on that detail. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support greg@2ndQuadrant.com www.2ndQuadrant.us
On Jul29, 2010, at 00:48 , Greg Smith wrote: > Finally got around to taking a longer look at your patch, sorry about the delay here. Patch itself seems to work on simpletests anyway (more on the one suspect bit below). You didn't show what the output looks like, so let's start with thatbecause it is both kind of neat and not what I expected from your description. Here's the sort of extra stuff added tothe end of the standard output when you toggle this feature on: > > <snipped output> > > From the way you described the patch, I had thought that you were just putting this information into the log files or somethinglike that. In fact, it's not in the log files; it just shows up in this summary at the end. I'm not sure if that'sbest or not. Some might want to see how the per-statement variation varies over time. Sort of torn on whether the summaryalone is enough detail or not. Let me play with this some more and get back to you on that. I think the two features are pretty much orthogonal, even though they'd make use of the same per-statement instrumentationmachinery. I created the patch to tune the wal_writer for the synchronous_commit=off case - the idea being that the COMMIT should bevirtually instantaneous if the wal_writer writes old wal buffers out fast enough. I haven't yet used pgbench's log output feature, so I can't judge how useful the additional of per-statement data to thatlog is, and what the format should be. However, if you think it's useful and can come up with a sensible format, I'dbe happy to add that feature to the patch. > Now onto the nitpicking. With the standard Ubuntu compiler warnings on I get this: > > pgbench.c:1588: warning: ‘i’ may be used uninitialized in this function > > If you didn't see that, you may want to double-check how verbose you have your compiler setup to be; maybe you just missedit (which is of course what reviewers are here for). Regardless, the troublesome bit is this: > > int i; > > commands = process_commands(&buf[i]); Fixed. That was a leftover of the trimming and comment skipping logic, which my patch moves to process_command. Updated patch is attached. Thanks for your extensive review & best regards, Florian Pflug
Attachment
Florian Pflug wrote: > I created the patch to tune the wal_writer for the synchronous_commit=off case - the idea being that the COMMIT shouldbe virtually instantaneous if the wal_writer writes old wal buffers out fast enough. > As I was saying, being able to see the COMMIT times for purposes such as that is something I consider valuable about using this instrumentation that's not really targeted by pg_stat_statements, the other way built-in way people might try to get it. > I haven't yet used pgbench's log output feature, so I can't judge how useful the additional of per-statement data to thatlog is, and what the format should be. However, if you think it's useful and can come up with a sensible format, I'dbe happy to add that feature to the patch. > Let's worry about that in the future. Maybe it's a good add-on, but it's more than I have time to get into during this CF personally. > That was a leftover of the trimming and comment skipping logic, which my patch moves to process_command. > I think there's still a trimming error here--line 195 of the new patch is now removing the declaration of "i" just before it sets it to zero? On the coding standard side, I noticed all your for loops are missing a space between the for and the (; that should get fixed. Finally, now that the rest of the patch is looking in good shape and is something I think is worth considering to commit, it's time to work on the documentation SGML. Also: when generating multiple versions of a patch like this, standard practice is to add something like "-vX" to the naming, so those of us trying to review can keep them straight. So next one would be "pgbench_statementlatency_v3.patch" or something like that. It's a good habit to get into from first version of a patch you submit. Presuming that's going to be the only version is optimistic for all but the smallest of patches, and sometimes not even them... -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support greg@2ndQuadrant.com www.2ndQuadrant.us
On Aug3, 2010, at 21:16 , Greg Smith wrote: >> That was a leftover of the trimming and comment skipping logic, which my patch moves to process_command. > > I think there's still a trimming error here--line 195 of the new patch is now removing the declaration of "i" just beforeit sets it to zero? Hm, I think it's just the diff thats miss-leading there. It correctly marks the "int i" line as "removed" with a "-", butfor some reason marks the "i = 0" line (and its successors) with a "!", although they're removed too, and not modified. > On the coding standard side, I noticed all your for loops are missing a space between the for and the (; that should getfixed. Fixed > > Finally, now that the rest of the patch is looking in good shape and is something I think is worth considering to commit,it's time to work on the documentation SGML. I've added the "-r" option to the list of pgbench options in pgbench.sgml and also added a short section that shows how theoutput looks like, similar to how things are done for the "-l" option. > Also: when generating multiple versions of a patch like this, standard practice is to add something like "-vX" to thenaming, so those of us trying to review can keep them straight. Will do from now on. Updated patch is attached. I've also pushed this as branch "pgbench_statementlatency" to git://github.com/fgp/postgres.git best regards, Florian Pflug
Attachment
On Aug4, 2010, at 13:58 , Florian Pflug wrote: > On Aug3, 2010, at 21:16 , Greg Smith wrote: >>> That was a leftover of the trimming and comment skipping logic, which my patch moves to process_command. >> >> I think there's still a trimming error here--line 195 of the new patch is now removing the declaration of "i" just beforeit sets it to zero? > Hm, I think it's just the diff thats miss-leading there. It correctly marks the "int i" line as "removed" with a "-", butfor some reason marks the "i = 0" line (and its successors) with a "!", although they're removed too, and not modified. > >> On the coding standard side, I noticed all your for loops are missing a space between the for and the (; that should getfixed. > Fixed Crap. I've messed up to the effect that the for-loop formatting fix wasn't actually in the patch. Attached is an updated version (v4). Sorry for the noise. best regards, Florian Pflug
Attachment
Florian Pflug wrote: > Attached is an updated version (v4). > I've attached a v5. No real code changes from Florian's version, just some wording/style fixes and rework on the documentation. The user side is now consistent about calling these statement latencies for example, even though the internals still call them command latencies most places. Since using this new feature will introduce a whole stack of new calls to get the system time, I also added a warning about that impacting results: Note that collecting the additional timing information needed for detailed latency computation does add some overhead. This will slow average execution speed and lower the computed TPS. The exact amount of slowdown varies significantly based on platform and hardware. Comparing average TPS values with and without latency reporting enabled is a good way to measure if the timing overhead is significant. I wasn't able to see any significant slowdown on my modern Linux systems doing such a test: $ ./pgbench -T 10 -S -c 8 -j 4 pgbench tps = 6716.039813 (including connections establishing) tps = 6720.238878 (excluding connections establishing) $ ./pgbench -T 10 -S -c 8 -j 4 -r pgbench tps = 6708.544618 (including connections establishing) tps = 6712.728526 (excluding connections establishing) But I know gettimeofday is fast here. Worth including a warning for though I think. I'm out of things to check here, marking this one ready for a committer review. The patch hasn't had a committer assigned yet, so whoever wants to claim it should mark the CF app. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support greg@2ndQuadrant.com www.2ndQuadrant.us diff --git a/contrib/pgbench/pgbench.c b/contrib/pgbench/pgbench.c index 53cc4b6..90d11f1 100644 *** a/contrib/pgbench/pgbench.c --- b/contrib/pgbench/pgbench.c *************** int fillfactor = 100; *** 133,138 **** --- 133,139 ---- bool use_log; /* log transaction latencies to a file */ bool is_connect; /* establish connection for each transaction */ + bool is_latencies; /* report per-command latencies */ int main_pid; /* main process id used in log filename */ char *pghost = ""; *************** typedef struct *** 171,177 **** int64 until; /* napping until (usec) */ Variable *variables; /* array of variable definitions */ int nvariables; ! instr_time txn_begin; /* used for measuring latencies */ int use_file; /* index in sql_files for this client */ bool prepared[MAX_FILES]; } CState; --- 172,179 ---- int64 until; /* napping until (usec) */ Variable *variables; /* array of variable definitions */ int nvariables; ! instr_time txn_begin; /* used for measuring transaction latencies */ ! instr_time stmt_begin; /* used for measuring statement latencies */ int use_file; /* index in sql_files for this client */ bool prepared[MAX_FILES]; } CState; *************** typedef struct *** 199,204 **** --- 201,207 ---- /* * queries read from files */ + #define COMMENT_COMMAND 0 #define SQL_COMMAND 1 #define META_COMMAND 2 #define MAX_ARGS 10 *************** static const char *QUERYMODE[] = {"simpl *** 216,224 **** --- 219,230 ---- typedef struct { + char *line; /* line containing the command */ int type; /* command type (SQL_COMMAND or META_COMMAND) */ int argc; /* number of commands */ char *argv[MAX_ARGS]; /* command list */ + instr_time *exec_elapsed; /* time spent executing this command (per thread) */ + int *exec_count; /* number of executions (per thread) */ } Command; static Command **sql_files[MAX_FILES]; /* SQL script files */ *************** usage(const char *progname) *** 287,292 **** --- 293,299 ---- " define variable for use by custom script\n" " -f FILENAME read transaction script from FILENAME\n" " -j NUM number of threads (default: 1)\n" + " -r report average latency per command\n" " -l write transaction times to log file\n" " -M {simple|extended|prepared}\n" " protocol for submitting queries to server (default: simple)\n" *************** clientDone(CState *st, bool ok) *** 740,746 **** /* return false iff client should be disconnected */ static bool ! doCustom(CState *st, instr_time *conn_time, FILE *logfile) { PGresult *res; Command **commands; --- 747,753 ---- /* return false iff client should be disconnected */ static bool ! doCustom(TState* thread, CState *st, instr_time *conn_time, FILE *logfile) { PGresult *res; Command **commands; *************** top: *** 800,805 **** --- 807,824 ---- #endif } + /* Accumulate per-command execution times if per-command latencies + are requested */ + if (is_latencies && (commands[st->state]->type != COMMENT_COMMAND)) + { + instr_time now; + + INSTR_TIME_SET_CURRENT(now); + INSTR_TIME_ACCUM_DIFF(commands[st->state]->exec_elapsed[thread->tid], + now, st->stmt_begin); + ++(commands[st->state]->exec_count[thread->tid]); + } + if (commands[st->state]->type == SQL_COMMAND) { res = PQgetResult(st->con); *************** top: *** 840,845 **** --- 859,871 ---- commands = sql_files[st->use_file]; } } + + /* Skip comments */ + if (commands[st->state]->type == COMMENT_COMMAND) + { + st->listen = 1; + goto top; + } if (st->con == NULL) { *************** top: *** 856,864 **** --- 882,895 ---- INSTR_TIME_ACCUM_DIFF(*conn_time, end, start); } + /* Record transaction start time if logging is enabled */ if (logfile && st->state == 0) INSTR_TIME_SET_CURRENT(st->txn_begin); + /* Record statement start time if per-command latencies are requested */ + if (is_latencies) + INSTR_TIME_SET_CURRENT(st->stmt_begin); + if (commands[st->state]->type == SQL_COMMAND) { const Command *command = commands[st->state]; *************** process_commands(char *buf) *** 1361,1385 **** char *p, *tok; if ((p = strchr(buf, '\n')) != NULL) *p = '\0'; ! p = buf; ! while (isspace((unsigned char) *p)) ! p++; ! ! if (*p == '\0' || strncmp(p, "--", 2) == 0) { return NULL; } ! my_commands = (Command *) malloc(sizeof(Command)); ! if (my_commands == NULL) { ! return NULL; } ! my_commands->argc = 0; if (*p == '\\') { --- 1392,1428 ---- char *p, *tok; + /* Make the string buf end at the next newline */ if ((p = strchr(buf, '\n')) != NULL) *p = '\0'; ! /* Allocate and initialize Command structure, initially ! * assuming the line is a comment ! */ ! my_commands = (Command *) malloc(sizeof(Command)); ! if (my_commands == NULL) { return NULL; } + my_commands->type = COMMENT_COMMAND; + my_commands->line = strdup(buf); + my_commands->argc = 0; + my_commands->exec_elapsed = NULL; + my_commands->exec_count = NULL; ! p = buf; ! ! /* Skip leading whitespace */ ! while (isspace((unsigned char) *p)) { ! p++; } ! /* If the line is empty or actually a comment, we're done */ ! if (*p == '\0' || strncmp(p, "--", 2) == 0) ! { ! return my_commands; ! } if (*p == '\\') { *************** process_file(char *filename) *** 1548,1570 **** while (fgets(buf, sizeof(buf), fd) != NULL) { Command *commands; - int i; ! i = 0; ! while (isspace((unsigned char) buf[i])) ! i++; ! ! if (buf[i] != '\0' && strncmp(&buf[i], "--", 2) != 0) { ! commands = process_commands(&buf[i]); ! if (commands == NULL) ! { ! fclose(fd); ! return false; ! } } - else - continue; my_commands[lineno] = commands; lineno++; --- 1591,1603 ---- while (fgets(buf, sizeof(buf), fd) != NULL) { Command *commands; ! commands = process_commands(buf); ! if (commands == NULL) { ! fclose(fd); ! return false; } my_commands[lineno] = commands; lineno++; *************** printResults(int ttype, int normal_xacts *** 1660,1665 **** --- 1693,1699 ---- tps_include, tps_exclude; char *s; + int i; time_include = INSTR_TIME_GET_DOUBLE(total_time); tps_include = normal_xacts / time_include; *************** printResults(int ttype, int normal_xacts *** 1694,1699 **** --- 1728,1771 ---- } printf("tps = %f (including connections establishing)\n", tps_include); printf("tps = %f (excluding connections establishing)\n", tps_exclude); + + /* Report per-command latencies */ + if (is_latencies) + { + for (i = 0; i < num_files; ++i) { + Command** commands; + + if (num_files > 1) + printf("statement latencies in milliseconds, file %d\n:", i+1); + else + printf("statement latencies in milliseconds:\n"); + + for (commands = sql_files[i]; *commands != NULL; ++commands) + { + Command* command = *commands; + instr_time total_exec_elapsed; + int total_exec_count; + int t; + + /* Accumulate per-thread data if available */ + INSTR_TIME_SET_ZERO(total_exec_elapsed); + total_exec_count = 0; + if ((command->exec_elapsed != NULL) && (command->exec_count != NULL)) + { + for (t = 0; t < nthreads; ++t) + { + INSTR_TIME_ADD(total_exec_elapsed, command->exec_elapsed[t]); + total_exec_count += command->exec_count[t]; + } + } + + if (total_exec_count > 0) + printf("\t%f\t%s\n", INSTR_TIME_GET_MILLISEC(total_exec_elapsed) / (double)total_exec_count, command->line); + else + printf("\t%f\t%s\n", 0.0, command->line); + } + } + } } *************** main(int argc, char **argv) *** 1770,1776 **** memset(state, 0, sizeof(*state)); ! while ((c = getopt(argc, argv, "ih:nvp:dSNc:Cs:t:T:U:lf:D:F:M:j:")) != -1) { switch (c) { --- 1842,1848 ---- memset(state, 0, sizeof(*state)); ! while ((c = getopt(argc, argv, "ih:nvp:dSNc:Cs:t:T:U:rlf:D:F:M:j:")) != -1) { switch (c) { *************** main(int argc, char **argv) *** 1834,1839 **** --- 1906,1914 ---- case 'C': is_connect = true; break; + case 'r': + is_latencies = true; + break; case 's': scale_given = true; scale = atoi(optarg); *************** main(int argc, char **argv) *** 2090,2095 **** --- 2165,2193 ---- default: break; } + + /* Reserve memory for the threads to store per-command latencies if requested */ + if (is_latencies) + { + for (i = 0; i < num_files; ++i) + { + Command** commands; + for (commands = sql_files[i]; *commands != NULL; ++commands) + { + Command* command = *commands; + int t; + + command->exec_elapsed = (instr_time*) malloc(sizeof(instr_time) * nthreads); + command->exec_count = (int*) malloc(sizeof(int) * nthreads); + + for (t = 0; t < nthreads; ++t) + { + INSTR_TIME_SET_ZERO(command->exec_elapsed[t]); + command->exec_count[t] = 0; + } + } + } + } /* get start up time */ INSTR_TIME_SET_CURRENT(start_time); *************** threadRun(void *arg) *** 2211,2217 **** int prev_ecnt = st->ecnt; st->use_file = getrand(0, num_files - 1); ! if (!doCustom(st, &result->conn_time, logfile)) remains--; /* I've aborted */ if (st->ecnt > prev_ecnt && commands[st->state]->type == META_COMMAND) --- 2309,2315 ---- int prev_ecnt = st->ecnt; st->use_file = getrand(0, num_files - 1); ! if (!doCustom(thread, st, &result->conn_time, logfile)) remains--; /* I've aborted */ if (st->ecnt > prev_ecnt && commands[st->state]->type == META_COMMAND) *************** threadRun(void *arg) *** 2260,2266 **** { continue; } ! else if (commands[st->state]->type == META_COMMAND) { min_usec = 0; /* the connection is ready to run */ break; --- 2358,2364 ---- { continue; } ! else if ((commands[st->state]->type == META_COMMAND) || (commands[st->state]->type == COMMENT_COMMAND)) { min_usec = 0; /* the connection is ready to run */ break; *************** threadRun(void *arg) *** 2313,2319 **** if (st->con && (FD_ISSET(PQsocket(st->con), &input_mask) || commands[st->state]->type == META_COMMAND)) { ! if (!doCustom(st, &result->conn_time, logfile)) remains--; /* I've aborted */ } --- 2411,2417 ---- if (st->con && (FD_ISSET(PQsocket(st->con), &input_mask) || commands[st->state]->type == META_COMMAND)) { ! if (!doCustom(thread, st, &result->conn_time, logfile)) remains--; /* I've aborted */ } diff --git a/doc/src/sgml/pgbench.sgml b/doc/src/sgml/pgbench.sgml index cdad212..6877bec 100644 *** a/doc/src/sgml/pgbench.sgml --- b/doc/src/sgml/pgbench.sgml *************** pgbench <optional> <replaceable>options< *** 247,252 **** --- 247,263 ---- </varlistentry> <varlistentry> + <term><option>-r</option></term> + <listitem> + <para> + Report the average per-statement latency (execution time from the + perspective of the client) of each command after the benchmark + finishes. See below for details. + </para> + </listitem> + </varlistentry> + + <varlistentry> <term><option>-M</option> <replaceable>querymode</></term> <listitem> <para> *************** END; *** 632,638 **** <replaceable>client_id</> <replaceable>transaction_no</> <replaceable>time</> <replaceable>file_no</> <replaceable>time_epoch</><replaceable>time_us</> </synopsis> ! where <replaceable>time</> is the elapsed transaction time in microseconds, <replaceable>file_no</> identifies which script file was used (useful when multiple scripts were specified with <literal>-f</>), and <replaceable>time_epoch</>/<replaceable>time_us</> are a --- 643,649 ---- <replaceable>client_id</> <replaceable>transaction_no</> <replaceable>time</> <replaceable>file_no</> <replaceable>time_epoch</><replaceable>time_us</> </synopsis> ! where <replaceable>time</> is the total elapsed transaction time in microseconds, <replaceable>file_no</> identifies which script file was used (useful when multiple scripts were specified with <literal>-f</>), and <replaceable>time_epoch</>/<replaceable>time_us</> are a *************** END; *** 654,659 **** --- 665,742 ---- </sect2> <sect2> + <title>Per-statement latencies</title> + <para> + With the <literal>-r</> option, <application>pgbench</> collects + the elapsed transaction time of each statement executed by every + client. It then reports an average of those values, referred to + as the latency for each statement, after the benchmark has finished. + </para> + + <para> + For the default script, the output will look similar to this: + <screen> + starting vacuum...end. + transaction type: TPC-B (sort of) + scaling factor: 1 + query mode: simple + number of clients: 10 + number of threads: 1 + number of transactions per client: 1000 + number of transactions actually processed: 10000/10000 + tps = 618.764555 (including connections establishing) + tps = 622.977698 (excluding connections establishing) + statement latencies in milliseconds: + 0.004386 \set nbranches 1 * :scale + 0.001343 \set ntellers 10 * :scale + 0.001212 \set naccounts 100000 * :scale + 0.001310 \setrandom aid 1 :naccounts + 0.001073 \setrandom bid 1 :nbranches + 0.001005 \setrandom tid 1 :ntellers + 0.001078 \setrandom delta -5000 5000 + 0.326152 BEGIN; + 0.603376 UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid; + 0.454643 SELECT abalance FROM pgbench_accounts WHERE aid = :aid; + 5.528491 UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid; + 7.335435 UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid; + 0.371851 INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP); + 1.212976 END; + </screen> + </para> + + <para> + If multiple script files are specified, the averages are reported + separately for each individual script file: + <screen> + starting vacuum...end. + transaction type: Custom query + scaling factor: 1 + query mode: simple + number of clients: 10 + number of threads: 1 + number of transactions per client: 1000 + number of transactions actually processed: 10000/10000 + tps = 162.902251 (including connections establishing) + tps = 163.166833 (excluding connections establishing) + statement latencies in milliseconds, file 1: + 2.534675 SELECT * FROM pgbench_branches LIMIT 1; + 1.168928 SELECT COUNT(*) FROM pgbench_branches; + statement latencies in milliseconds, file 2: + 2.394322 SELECT * FROM pgbench_accounts LIMIT 1; + 113.624613 SELECT COUNT(*) FROM pgbench_accounts; + </screen> + </para> + <para> + Note that collecting the additional timing information needed for + detailed latency computation does add some overhead. This will slow + average execution speed and lower the computed TPS. The exact amount + of slowdown varies significantly based on platform and hardware. + Comparing average TPS values with and without latency reporting enabled + is a good way to measure if the timing overhead is significant. + </para> + </sect2> + + <sect2> <title>Good Practices</title> <para>
Greg Smith <greg@2ndquadrant.com> writes: > Florian Pflug wrote: >> Attached is an updated version (v4). > I've attached a v5. No real code changes from Florian's version, just > some wording/style fixes and rework on the documentation. I'm looking through this patch now. It looks mostly good, but I am wondering just exactly what is the rationale for adding comment statements to the data structures, rather than ignoring them as before. It seems like a complete waste of logic, memory space, and cycles; moreover it renders the documentation's statement that comments "are ignored" incorrect. I did not find anything in the patch history explaining the point of that change. regards, tom lane
Greg Smith <greg@2ndquadrant.com> writes: > Florian Pflug wrote: >> Attached is an updated version (v4). > I've attached a v5. BTW, I discovered a rather nasty shortcoming of this patch on platforms without ENABLE_THREAD_SAFETY. It doesn't work, and what's worse, it *looks* like it's working, because it gives you plausible-looking numbers. But actually the numbers are only averages across the first worker thread. The other threads are in sub-processes where they can't affect the contents of the parent's arrays. Since platforms without ENABLE_THREAD_SAFETY are only a small minority these days, this is probably not sufficient reason to reject the patch. What I plan to do instead is reject the combination of -r with -j larger than 1 on such platforms: if (is_latencies) { /* * is_latencies only works with multiple threads in thread-based * implementations,not fork-based ones, because it supposes that the * parent can see changes made to the command datastructures by child * threads. It seems useful enough to accept despite this limitation, * but perhapswe should FIXME someday. */ #ifndef ENABLE_THREAD_SAFETY if (nthreads > 1) { fprintf(stderr, "-r does not work with -j larger than1 on this platform.\n"); exit(1); } #endif It could be fixed with enough effort, by having the child threads pass back their numbers through the child-to-parent pipes. I'm not interested in doing that myself though. If anyone thinks this problem makes it uncommittable, speak up now. regards, tom lane
Greg Smith <greg@2ndquadrant.com> writes: > I've attached a v5. No real code changes from Florian's version, just > some wording/style fixes and rework on the documentation. I've committed this with some editorialization. The main non-cosmetic change was that I pulled the latency statistics counters out of the per-Command data structures and put them into per-thread arrays instead. I did this for two reasons: 1. Having different threads munging adjacent array entries without any locking makes me itch. On some platforms that could possibly fail entirely, and in any case it's likely to be a performance hit on machines where processors lock whole cache lines (which is most of them these days, I think). 2. It should make it a lot easier to pass the per-thread results back up to the parent in a fork-based implementation, should anyone desire to fix the limitation I mentioned before. regards, tom lane
On Aug12, 2010, at 19:48 , Tom Lane wrote: > Greg Smith <greg@2ndquadrant.com> writes: >> Florian Pflug wrote: >>> Attached is an updated version (v4). > >> I've attached a v5. No real code changes from Florian's version, just >> some wording/style fixes and rework on the documentation. > > I'm looking through this patch now. It looks mostly good, but I am > wondering just exactly what is the rationale for adding comment > statements to the data structures, rather than ignoring them as before. > It seems like a complete waste of logic, memory space, and cycles; > moreover it renders the documentation's statement that comments > "are ignored" incorrect. I did not find anything in the patch history > explaining the point of that change. To be able to include the comments (with an average latency of zero) in the latency report. This makes the latency reportas self-explanatory as the original script was (Think latency report copy-and-pasted into an e-mail or wiki). It alsohas the benefit of making the line numbers of the latency report agree to those of the original script, which seemedlike a natural thing to do, and might make some sorts of post-processing easier. It does make doCustom() a bit morecomplex, though. Anyway, I guess the chance of adding this back is slim now that the patch is committed. Oh well. Thanks for committing this, and best regards, Florian Pflug
Tom Lane wrote: > It could be fixed with enough effort, by having the child threads pass > back their numbers through the child-to-parent pipes. I'm not > interested in doing that myself though. > That does seem an improvement with a very limited user base it's applicable to. Probably the next useful improvement to this feature is to get per-statement data into the latency log files if requested. If this issue gets in the way there somehow, maybe it's worth squashing then. I don't think it will though. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support greg@2ndQuadrant.com www.2ndQuadrant.us
Florian Pflug <fgp@phlo.org> writes: > On Aug12, 2010, at 19:48 , Tom Lane wrote: >> I'm looking through this patch now. It looks mostly good, but I am >> wondering just exactly what is the rationale for adding comment >> statements to the data structures, rather than ignoring them as before. > To be able to include the comments (with an average latency of zero) > in the latency report. This makes the latency report as > self-explanatory as the original script was (Think latency report > copy-and-pasted into an e-mail or wiki). It also has the benefit of > making the line numbers of the latency report agree to those of the > original script, which seemed like a natural thing to do, and might > make some sorts of post-processing easier. It does make doCustom() a > bit more complex, though. I had wondered if the original form of the patch printed line numbers rather than the actual line contents. If that were true then it'd make sense to include comment lines. In the current form of the patch, though, I think the output is just as readable without comment lines --- and I'm not thrilled with having this patch materially affect the behavior for cases where -r wasn't even specified. regards, tom lane