Thread: pgbench doc fix
pgbench doc (and some comments in pgbench.c) regarding "-M prepared"
option is not quite correct.

------------------------------------------------------------------------
-M querymode
--protocol=querymode

    Protocol to use for submitting queries to the server:

        simple: use simple query protocol.

        extended: use extended query protocol.

        prepared: use extended query protocol with prepared statements.
------------------------------------------------------------------------

Actually "extended" mode uses prepared statements too. The only
difference is, in extended mode *unnamed* prepared statements are
used, while in prepared mode *named* prepared statements are used.
Also, in extended query protocol, prepared statements are always used
anyway. Thus "use extended query protocol with prepared statements"
does not give any useful information to users.

I think this should be changed to:

    prepared: use extended query protocol with named prepared statements.

Patch attached.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp

diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index b5e3a62a33..3a7fe25342 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -470,7 +470,7 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>d
 <para><literal>extended</literal>: use extended query protocol.</para>
 </listitem>
 <listitem>
- <para><literal>prepared</literal>: use extended query protocol with prepared statements.</para>
+ <para><literal>prepared</literal>: use extended query protocol with named prepared statements.</para>
 </listitem>
 </itemizedlist>
 The default is simple query protocol. (See <xref linkend="protocol"/>
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 81bc6d8a6e..915f084e10 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -458,7 +458,7 @@ typedef enum QueryMode
 {
 QUERY_SIMPLE, /* simple query */
 QUERY_EXTENDED, /* extended query */
- QUERY_PREPARED, /* extended query with prepared statements */
+ QUERY_PREPARED, /* extended query with named prepared statements */
 NUM_QUERYMODE
 } QueryMode;
Hello Tatsuo-san,

> pgbench doc (and some comments in pgbench.c) regarding "-M prepared"
> option is not quite correct. [...] Actually "extended" mode uses
> prepared statements too.

Ok, I understand that you mean that PQsendQueryParams uses an unnamed
query internally to separate parsing & execution, which indeed seems
to be the case from looking at the libpq client-side code.

However, if I'm not mistaken, the params version always sends and
possibly reparses the query each time (is there a server-side cache to
avoid re-parsing? a quick scan of the sources did not return a clear
answer to this question, but I seem to recall that the answer is
yes).

> Patch attached.

Patch applies cleanly, compiles, doc generation ok, global & local
tests are ok.

I'm fine having a more precise wording.

Maybe I would have also insisted on the fact that there is an explicit
vs an implicit PREPARE, if it relies on a server-side cache. The
"extended" documentation entry does not say that it is prepared.

I created an entry in the CF and marked the patch as ready anyway.

--
Fabien.
Hi Fabien,

> Ok, I understand that you mean that PQsendQueryParams uses an unnamed
> query internally to separate parsing & execution, which indeed seems
> to be the case from looking at the libpq client-side code.
>
> However, if I'm not mistaken, the params version always sends and
> possibly reparses the query each time (is there a server-side cache to
> avoid re-parsing? a quick scan of the sources did not return a clear
> answer to this question, but I seem to recall that the answer is
> yes).

Yes, you need to send params (thus send a bind message) anyway.
Regarding re-parsing, maybe you mixed up parse analysis with
planning? Repeating parse analysis can only be avoided if you can
reuse named (or unnamed) prepared statements. As for planning,
PostgreSQL can reuse the plan cache at bind time if possible. See
exec_bind_message() and GetCachedPlan() for more details.

BTW, "-M extended" calls PQsendQueryParams, which sends unnamed
statements and unnamed portals:

parse message (BEGIN)
bind message (BEGIN)
describe message (BEGIN)
execute message (BEGIN)
sync message
parse message (UPDATE)
bind message (UPDATE)
describe message (UPDATE)
execute message (UPDATE)
sync message
:
:
parse message (END)
bind message (END)
describe message (END)
execute message (END)
sync message
(repeat for next transaction)

While "-M prepared" calls PQsendPrepare + PQsendQueryPrepared, which
sends named statements and unnamed portals:

[#1 transaction]
parse message (BEGIN, statement = PO_1)
bind message (BEGIN, statement = PO_1, portal = "")
describe message (BEGIN, portal = "")
execute message (BEGIN, portal = "")
sync message
parse message (UPDATE, statement = PO_5)
bind message (UPDATE, statement = PO_5, portal = "")
describe message (UPDATE, portal = "")
execute message (UPDATE, portal = "")
sync message
:
:
parse message (END, statement = PO_10)
bind message (END, statement = PO_10, portal = "")
describe message (END, portal = "")
execute message (END, portal = "")
sync message

[#2 transaction]
bind message (BEGIN, statement = PO_1, portal = "")
describe message (BEGIN, portal = "")
execute message (BEGIN, portal = "")
sync message
bind message (UPDATE, statement = PO_5, portal = "")
describe message (UPDATE, portal = "")
execute message (UPDATE, portal = "")
sync message
:
:
bind message (END, statement = PO_10, portal = "")
describe message (END, portal = "")
execute message (END, portal = "")
sync message

As you can see, with "-M prepared" we save one parse message for
each command step from the second transaction onward. This is the
advantage of using named statements.

>> Patch attached.
>
> Patch applies cleanly, compiles, doc generation ok, global & local
> tests are ok.
>
> I'm fine having a more precise wording.
>
> Maybe I would have also insisted on the fact that there is an explicit
> vs an implicit PREPARE, if it relies on a server-side cache. The

Not sure what you mean. There's no PREPARE in extended queries (SQL
PREPARE does exist, of course). Probably you mean the "parse" message
in extended queries? If so, both "-M extended" and "-M prepared" use
parse messages.

> "extended"
> documentation entry does not say that it is prepared.
>
> I created an entry in the CF and marked the patch as ready anyway.

Thanks.

BTW, as you can see, each command step above has a "sync" message.
This is pretty annoying because it hurts performance a lot, i.e. every
time a sync is received PostgreSQL needs to return all results at that
point. Extended query is designed to issue sync only once per command
set (parse, bind, describe and execute). This is not a fault of
pgbench, but rather of libpq (the sync message is issued inside
libpq). This is a serious problem because libpq is used by other
language APIs as well, and those languages are also affected by the
slowness of libpq. Probably we should redesign (or add) better APIs
for extended queries someday.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp
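The difference between the two message traces above can be sketched with a toy model. This is only an illustration, assuming three command steps per transaction; it does not talk to a server or use libpq, and the function names and the `stmt_N` naming scheme are hypothetical. Only the message names (parse, bind, describe, execute, sync) come from the thread.

```python
# Toy model of the frontend message sequences listed in the message above.
# Nothing here is pgbench or libpq code; all names are made up for the sketch.

def messages_for_transaction(mode, commands, prepared_names):
    """Return the protocol messages sent for one transaction.

    mode: "extended" or "prepared"
    commands: command labels, e.g. ["BEGIN", "UPDATE", "END"]
    prepared_names: statement names already parsed on this connection
        (mutated when mode is "prepared")
    """
    sent = []
    for index, command in enumerate(commands, start=1):
        if mode == "extended":
            # Unnamed statement: a parse message is sent for every execution.
            sent.append(f"parse ({command})")
        elif mode == "prepared":
            name = f"stmt_{index}"  # hypothetical naming scheme
            if name not in prepared_names:
                # Named statement: parsed only the first time it is used.
                sent.append(f"parse ({command}, statement = {name})")
                prepared_names.add(name)
        else:
            raise ValueError(f"unsupported mode: {mode}")
        # Both modes send bind/describe/execute/sync for every step.
        sent += [f"bind ({command})", f"describe ({command})",
                 f"execute ({command})", "sync"]
    return sent


def count_parses(mode, commands, transactions):
    """Count parse messages sent over a number of transactions."""
    prepared_names = set()
    total = 0
    for _ in range(transactions):
        msgs = messages_for_transaction(mode, commands, prepared_names)
        total += sum(1 for m in msgs if m.startswith("parse"))
    return total


if __name__ == "__main__":
    commands = ["BEGIN", "UPDATE", "END"]
    for mode in ("extended", "prepared"):
        print(mode, count_parses(mode, commands, transactions=100))
```

In this model the sync count is identical in both modes, which matches the observation above that a sync follows every command step regardless of mode.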
On Tue, Oct 30, 2018 at 8:48 AM Tatsuo Ishii <ishii@sraoss.co.jp> wrote:
> Yes, you need to send params (thus send a bind message) anyway.
> Regarding re-parsing, maybe you mixed up parse analysis with
> planning? Repeating parse analysis can only be avoided if you can
> reuse named (or unnamed) prepared statements.

So given this, I'm struggling to see anything wrong with the current
wording. I mean, if you say that you are reusing prepared statements,
someone will assume that you are avoiding preparing them repeatedly,
which -M extended will not do ... and by the nature of that approach,
cannot do.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert,

>> Yes, you need to send params (thus send a bind message) anyway.
>> Regarding re-parsing, maybe you mixed up parse analysis with
>> planning? Repeating parse analysis can only be avoided if you can
>> reuse named (or unnamed) prepared statements.
>
> So given this, I'm struggling to see anything wrong with the current
> wording.

ISTM that the point is not that it is wrong, but that it could be more
precise.

> I mean, if you say that you are reusing prepared statements,

It does not say "reuse" explicitly, it says "prepared: use extended
query protocol with prepared statements." But the extended protocol
always "prepares" statements before executing them; the differences
are that with "-M prepared" the statement is (1) prepared just once
and (2) named so that it can indeed be reused. Note that "extended"
prepares many more statements than "prepared" :-)

> someone will assume that you are avoiding preparing them repeatedly,
> which -M extended will not do ... and by the nature of that approach,
> cannot do.

Sure. At the protocol level "prepare" is slightly imprecise, and the
documentation is about the protocol used.

So I do not think a more precise wording harms. Maybe: "prepared: use
extended query protocol with REUSED named prepared statements" would
be even less slightly ambiguous.

--
Fabien.
> So I do not think a more precise wording harms. Maybe: "prepared: use
> extended query protocol with REUSED named prepared statements" would
> be even less slightly ambiguous.

I like this. But maybe we can remove "named"?

"prepared: use extended query protocol with reused prepared statements"

Because "named" prepared statements can (unlike unnamed prepared
statements) be reused repeatedly, "named" already implies "reused". So
using both "named" and "reused" sounds a little bit redundant to me.
If we choose one of them, I prefer "reused" since it more explicitly
states the difference between "-M extended" and "-M prepared".

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp
On 03/11/2018 01:08, Tatsuo Ishii wrote:
> I like this. But maybe we can remove "named"?
>
> "prepared: use extended query protocol with reused prepared statements"

I don't think this mouthful is useful in the --help output. The
existing wording gets the message across just fine, I think. More
details can be put in the reference page.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>> "prepared: use extended query protocol with reused prepared statements"
>
> I don't think this mouthful is useful in the --help output. The
> existing wording gets the message across just fine, I think. More
> details can be put in the reference page.

These suggestions are for the online doc page, and possibly an
internal comment in the code.

The pgbench --help output just states the 3 possible values and says
which is the default, and indeed cannot say much more.

--
Fabien.
> On Sat, Nov 3, 2018 at 1:08 AM Tatsuo Ishii <ishii@sraoss.co.jp> wrote:
>
> > So I do not think a more precise wording harms. Maybe: "prepared: use
> > extended query protocol with REUSED named prepared statements" would
> > be even less slightly ambiguous.
>
> I like this. But maybe we can remove "named"?

I also think it makes sense to adjust wording a bit here, and this
version sounds good (taking into account the commentary about
"named"). I'm moving this to the next CF, where the question would be
if any of the committers can agree with this point.
On 30/11/2018 15:42, Dmitry Dolgov wrote:
>> On Sat, Nov 3, 2018 at 1:08 AM Tatsuo Ishii <ishii@sraoss.co.jp> wrote:
>>
>>> So I do not think a more precise wording harms. Maybe: "prepared: use
>>> extended query protocol with REUSED named prepared statements" would
>>> be even less slightly ambiguous.
>>
>> I like this. But maybe we can remove "named"?
>
> I also think it makes sense to adjust wording a bit here, and this
> version sounds good (taking into account the commentary about
> "named"). I'm moving this to the next CF, where the question would be
> if any of the committers can agree with this point.

I don't see a concrete proposed patch here after the discussion.

Reading the documentation again, we could go for much more detail
here. For example, what's the point of having -M simple vs -M
extended?

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>>>> So I do not think a more precise wording harms. Maybe: "prepared: use
>>>> extended query protocol with REUSED named prepared statements" would
>>>> be even less slightly ambiguous.
>>>
>>> I like this. But maybe we can remove "named"?
>>
>> I also think it makes sense to adjust wording a bit here, and this
>> version sounds good (taking into account the commentary about
>> "named"). I'm moving this to the next CF, where the question would be
>> if any of the committers can agree with this point.
>
> I don't see a concrete proposed patch here after the discussion.
>
> Reading the documentation again, we could go for much more detail
> here. For example, what's the point of having -M simple vs -M
> extended?

They do not use the same libpq-level approach (PQsendQuery vs
PQsendQueryParams), so they are not exercising the same type of
client? Pgbench is also about testing libpq performance.

--
Fabien.
>>>>> So I do not think a more precise wording harms. Maybe: "prepared: use
>>>>> extended query protocol with REUSED named prepared statements" would
>>>>> be even less slightly ambiguous.
>>>>
>>>> I like this. But maybe we can remove "named"?
>>>
>>> I also think it makes sense to adjust wording a bit here, and this
>>> version sounds good (taking into account the commentary about
>>> "named"). I'm moving this to the next CF, where the question would
>>> be if any of the committers can agree with this point.
>>
>> I don't see a concrete proposed patch here after the discussion.
>>
>> Reading the documentation again, we could go for much more detail
>> here. For example, what's the point of having -M simple vs -M
>> extended?
>
> They do not use the same libpq-level approach (PQsendQuery vs
> PQsendQueryParams), so they are not exercising the same type of
> client? Pgbench is also about testing libpq performance.

Yes, -M extended is pretty slow because for each query it needs to
send parse/bind/describe/execute messages. -M prepared is much faster
because for the second and subsequent iterations of a query it does
not need to execute parse analysis, which means that not only are
fewer messages exchanged but parse analysis is also omitted for the
second and subsequent query iterations.

Here are quick test results using pgbench -S -M $mode -c 10 -T 30,
where $mode is "simple", "extended" or "prepared", on my Ubuntu 18
laptop. The TPS numbers are the average of 3 trials each, for
"excluding connections establishing".

simple:   48804.8383 TPS
extended: 39735.4278 TPS
prepared: 83459.2293 TPS

So "prepared" is roughly 2x faster than "extended". Based on this, I
would suggest modifying the existing descriptions in the pgbench doc
regarding "-M querymode":

From:
-------------------------------------------------------------------
-M querymode
--protocol=querymode

    Protocol to use for submitting queries to the server:

        simple: use simple query protocol.

        extended: use extended query protocol.

        prepared: use extended query protocol with prepared statements.
-------------------------------------------------------------------

To:
-------------------------------------------------------------------
-M querymode
--protocol=querymode

    Protocol to use for submitting queries to the server:

        simple: use simple query protocol.

        extended: use extended query protocol.

        prepared: use extended query protocol with prepared statements.

    Because in "prepared" mode pgbench reuses the parse analysis
    result for the second and subsequent query iteration, pgbench runs
    faster in the prepared mode than in other modes.
-------------------------------------------------------------------

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp

diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index b5e3a62a33..c2a2ff9707 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -473,6 +473,13 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>d
 <para><literal>prepared</literal>: use extended query protocol with prepared statements.</para>
 </listitem>
 </itemizedlist>
+
+ Because in "prepared" mode <application>pgbench</application> reuses
+ the parse analysis result for the second and subsequent query
+ iteration, <application>pgbench</application> runs faster in the
+ prepared mode than in other modes.
+ </para>
+ <para>
 The default is simple query protocol. (See <xref linkend="protocol"/>
 for more information.)
 </para>
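The ratios behind the "roughly 2x" remark can be checked with a few lines of arithmetic. This is a minimal sketch that only reuses the three TPS averages quoted in the message above; the dictionary and the `speedup` helper are hypothetical names introduced for illustration.

```python
# TPS averages copied from the message above
# (pgbench -S -c 10 -T 30, 3 trials per mode).
tps = {
    "simple": 48804.8383,
    "extended": 39735.4278,
    "prepared": 83459.2293,
}


def speedup(faster, slower):
    """Ratio of the TPS of one mode to the TPS of another."""
    return tps[faster] / tps[slower]


if __name__ == "__main__":
    print(f"prepared / extended: {speedup('prepared', 'extended'):.2f}")
    print(f"prepared / simple:   {speedup('prepared', 'simple'):.2f}")
    print(f"simple / extended:   {speedup('simple', 'extended'):.2f}")
```

On these figures "prepared" comes out at about 2.10 times "extended" and about 1.71 times "simple". The numbers are from a single laptop run, so the ratios are indicative only.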
On 2018-Dec-03, Tatsuo Ishii wrote:

> To:
> -------------------------------------------------------------------
> -M querymode
> --protocol=querymode
>
> Protocol to use for submitting queries to the server:
>
> simple: use simple query protocol.
>
> extended: use extended query protocol.
>
> prepared: use extended query protocol with prepared statements.
>
> Because in "prepared" mode pgbench reuses the parse analysis
> result for the second and subsequent query iteration, pgbench runs
> faster in the prepared mode than in other modes.

Looks good to me.

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
> On 2018-Dec-03, Tatsuo Ishii wrote:
>
>> To:
>> -------------------------------------------------------------------
>> -M querymode
>> --protocol=querymode
>>
>> Protocol to use for submitting queries to the server:
>>
>> simple: use simple query protocol.
>>
>> extended: use extended query protocol.
>>
>> prepared: use extended query protocol with prepared statements.
>>
>> Because in "prepared" mode pgbench reuses the parse analysis
>> result for the second and subsequent query iteration, pgbench runs
>> faster in the prepared mode than in other modes.
>
> Looks good to me.

Thanks. I'm going to commit this if there's no objection.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp
>> On 2018-Dec-03, Tatsuo Ishii wrote:
>>
>>> To:
>>> -------------------------------------------------------------------
>>> -M querymode
>>> --protocol=querymode
>>>
>>> Protocol to use for submitting queries to the server:
>>>
>>> simple: use simple query protocol.
>>>
>>> extended: use extended query protocol.
>>>
>>> prepared: use extended query protocol with prepared statements.
>>>
>>> Because in "prepared" mode pgbench reuses the parse analysis
>>> result for the second and subsequent query iteration, pgbench runs
>>> faster in the prepared mode than in other modes.
>>
>> Looks good to me.
>
> Thanks. I'm going to commit this if there's no objection.

Done. Thanks.

--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp