Thread: pgbench doc fix

pgbench doc fix

From

Tatsuo Ishii

Date:

30 October 2018, 01:36:54

The pgbench doc (and some comments in pgbench.c) regarding the
"-M prepared" option is not quite correct.

------------------------------------------------------------------------
-M querymode
--protocol=querymode

    Protocol to use for submitting queries to the server:

        simple: use simple query protocol.

        extended: use extended query protocol.

        prepared: use extended query protocol with prepared statements.
------------------------------------------------------------------------

Actually "extended" mode uses prepared statements too. The only
difference is, in extended mode *unnamed* prepared statements are
used, while in prepared mode *named* prepared statements are used.

Also, in extended query protocol, prepared statements are always used
anyway. Thus "use extended query protocol with prepared statements"
does not give any useful information to users.

I think this should be changed to:

        prepared: use extended query protocol with named prepared statements.

Patch attached.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp
diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index b5e3a62a33..3a7fe25342 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -470,7 +470,7 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>d
             <para><literal>extended</literal>: use extended query protocol.</para>
            </listitem>
            <listitem>
-            <para><literal>prepared</literal>: use extended query protocol with prepared statements.</para>
+            <para><literal>prepared</literal>: use extended query protocol with named prepared statements.</para>
            </listitem>
           </itemizedlist>
         The default is simple query protocol.  (See <xref linkend="protocol"/>
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 81bc6d8a6e..915f084e10 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -458,7 +458,7 @@ typedef enum QueryMode
 {
     QUERY_SIMPLE,                /* simple query */
     QUERY_EXTENDED,                /* extended query */
-    QUERY_PREPARED,                /* extended query with prepared statements */
+    QUERY_PREPARED,                /* extended query with named prepared statements */
     NUM_QUERYMODE
 } QueryMode;

Re: pgbench doc fix

From

Fabien COELHO

Date:

30 October 2018, 08:57:09

Hello Tatsuo-san,

> pgbench doc (and some comments in pgbench.c) regarding "-M prepared" 
> option is not quite correct. [...] Actually "extended" mode uses 
> prepared statements too.

Ok, I understand that you mean that PQsendQueryParams uses an unnamed query 
internally to separate parsing & execution, which seems indeed to be the 
case by looking at the libpq client-side code.

However, if I'm not mistaken, the params version always sends and possibly 
reparses the query each time (is there a server side cache to avoid 
re-parsing? a quick scan in the sources did not return a clear answer to 
this question, but I seem to recall that the answer is yes).

> Patch attached.

Patch applies cleanly, compiles, doc generation ok, global & local tests 
are ok.

I'm fine having a more precise wording.

Maybe I would have also insisted on the fact that there is an explicit vs 
an implicit PREPARE, if it relies on a server-side cache. The "extended"
documentation entry does not say that it is prepared.

I created an entry in the CF and marked the patch as ready anyway.

-- 
Fabien.

Re: pgbench doc fix

From

Tatsuo Ishii

Date:

30 October 2018, 12:48:15

Hi Fabien,

> Ok, I understand that you mean that PQsendQueryParams uses an unnamed
> query internally to separate parsing & execution, which seems indeed
> to be the case by looking at the libpq client-side code.
> 
> However, if I'm not mistaken, the params version always sends and
> possibly reparses the query each time (is there a server side cache to
> avoid re-parsing? a quick scan in the sources did not return a clear
> answer to this question, but I seem to recall that the answer is
> yes).

Yes, you need to send params (thus send bind message) anyway.
Regarding re-parsing, maybe you mixed up parse analysis with
planning? Repeated parse analysis can only be avoided if you can reuse
named (or unnamed) prepared statements. As for planning, PostgreSQL
can reuse a cached plan at bind time if possible. See
exec_bind_message() and GetCachedPlan() for more details.

BTW, "-M extended" calls PQsendQueryParams, which sends unnamed
statements and unnamed portals:

parse message (BEGIN)
bind message (BEGIN)
describe message (BEGIN)
execute message (BEGIN)
sync message

parse message (UPDATE)
bind message (UPDATE)
describe message (UPDATE)
execute message (UPDATE)
sync message
:
:
parse message (END)
bind message (END)
describe message (END)
execute message (END)
sync message

(repeat for next transaction)

While "-M prepared" calls PQsendPrepare + PQsendQueryPrepared, which
sends named statements and unnamed portals:

[#1 transaction]

parse message (BEGIN, statement = PO_1)
bind message (BEGIN, statement = PO_1, portal = "")
describe message (BEGIN, portal = "")
execute message (BEGIN, portal = "")
sync message

parse message (UPDATE, statement = PO_5)
bind message (UPDATE, statement = PO_5, portal = "")
describe message (UPDATE, portal = "")
execute message (UPDATE, portal = "")
sync message
:
:
parse message (END, statement = PO_10)
bind message (END, statement = PO_10, portal = "")
describe message (END, portal = "")
execute message (END, portal = "")
sync message

[#2 transaction]

bind message (BEGIN, statement = PO_1, portal = "")
describe message (BEGIN, portal = "")
execute message (BEGIN, portal = "")
sync message

bind message (UPDATE, statement = PO_5, portal = "")
describe message (UPDATE, portal = "")
execute message (UPDATE, portal = "")
sync message
:
:
bind message (END, statement = PO_10, portal = "")
describe message (END, portal = "")
execute message (END, portal = "")
sync message

As you can see, from the second transaction on, "-M prepared" saves
one parse message for each command step. This is the advantage of
using named statements.
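
To put a number on the saving, here is a minimal sketch that simply counts
the messages listed in the two traces above. It is a model of those traces,
not pgbench or libpq code: the names Mode and count_messages are hypothetical,
and it assumes every command step sends exactly the messages shown (parse,
bind, describe, execute, sync).

```c
#include <assert.h>

/* Hypothetical labels for the two modes compared in the traces above. */
typedef enum
{
    MODE_EXTENDED,              /* unnamed statement, parsed on every step */
    MODE_PREPARED               /* named statement, parsed once per step */
} Mode;

/*
 * Count the messages a client sends when it runs a script of `steps`
 * commands `transactions` times, following the traces above:
 *   extended: parse, bind, describe, execute, sync for each step, every time
 *   prepared: the same five in the first transaction, then four (no parse)
 */
long
count_messages(Mode mode, long transactions, long steps)
{
    assert(transactions >= 0 && steps >= 0);

    if (transactions == 0)
        return 0;

    if (mode == MODE_EXTENDED)
        return 5 * steps * transactions;

    /* MODE_PREPARED: first transaction is full price, the rest skip parse */
    return 5 * steps + 4 * steps * (transactions - 1);
}
```

For a 3-step script run twice, this gives 30 messages in extended mode and 27
in prepared mode; over a long run the count approaches four fifths of the
extended total. The model only counts messages; it says nothing about the
parse analysis work the server also skips.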

>> Patch attached.
> 
> Patch applies cleanly, compiles, doc generation ok, global & local
> tests are ok.
> 
> I'm fine having a more precise wording.
> 
> Maybe I would have also insisted on the fact that there is an explicit
> vs an implicit PREPARE, if it relies on a server-side cache. The

Not sure what you mean. There's no PREPARE in extended queries (SQL
PREPARE does exist of course). Probably you mean "parse" message in
extended queries? If so, both "-M extended" and "-M prepared" use
parse messages.

> "extended"
> documentation entry does not say that it is prepared.
> 
> I created an entry in the CF and marked the patch as ready anyway.

Thanks.

BTW, as you can see, each command step above has a "sync" message. This
is pretty annoying because it hurts performance a lot, i.e. every time
a sync is received PostgreSQL needs to return all results at that
point. Extended query is designed to issue sync only once per command
set (parse, bind, describe and execute).

This is not a fault of pgbench, rather of libpq (the sync message is
issued inside libpq). This is a serious problem because libpq can be
used by other language APIs as well, and those languages are also
affected by the slowness of libpq. Probably we should redesign (or
add) better APIs for extended queries someday.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp

Re: pgbench doc fix

From

Robert Haas

Date:

31 October 2018, 17:49:02

On Tue, Oct 30, 2018 at 8:48 AM Tatsuo Ishii <ishii@sraoss.co.jp> wrote:
> Yes, you need to send params (thus send bind message) anyway.
> Regarding re-parsing, maybe you mixed up parse analysis with
> planning? Repeated parse analysis can only be avoided if you can reuse
> named (or unnamed) prepared statements.

So given this, I'm struggling to see anything wrong with the current
wording.  I mean, if you say that you are reusing prepared statements,
someone will assume that you are avoiding preparing them repeatedly,
which -M extended will not do ... and by the nature of that approach,
cannot do.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: pgbench doc fix

From

Fabien COELHO

Date:

02 November 2018, 07:35:29

Robert,

>> Yes, you need to send params (thus send bind message) anyway.
>> Regarding re-parsing, maybe you mixed up parse analysis with
>> planning? Repeated parse analysis can only be avoided if you can reuse
>> named (or unnamed) prepared statements.
>
> So given this, I'm struggling to see anything wrong with the current
> wording.

ISTM that the point is not that it is wrong, but it could be more precise.

> I mean, if you say that you are reusing prepared statements,

It does not say "reuse" explicitly, it says

      "prepared: use extended query protocol with prepared statements."

but the extended protocol always "prepares" statements before
executing them; the differences are that with "-M prepared" the statement
is (1) prepared just once and (2) named, so that it can indeed be reused.

Note that "extended" prepares many more statements than "prepared":-)

> someone will assume that you are avoiding preparing them repeatedly,
> which -M extended will not do ... and by the nature of that approach, 
> cannot do.

Sure. At the protocol level "prepare" is slightly imprecise, and the 
documentation is about the protocol used.

So I do not think a more precise wording harms. Maybe: "prepared: use 
extended query protocol with REUSED named prepared statements" would be 
even slightly less ambiguous.

-- 
Fabien.

Re: pgbench doc fix

From

Tatsuo Ishii

Date:

03 November 2018, 00:08:25

> So I do not think a more precise wording harms. Maybe: "prepared: use
> extended query protocol with REUSED named prepared statements" would
> be even slightly less ambiguous.

I like this. But maybe we can remove "named"?

"prepared: use extended query protocol with reused prepared statements"

Because "named" prepared statements can be (unlike unnamed prepared
statements) reused repeatedly, it implies "reused". So using both
"named" and "reused" sounds a little bit redundant to me. If we choose
one of them, I prefer "reused" since it more explicitly states the
difference between "-M extended" and "-M prepared".

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp

Re: pgbench doc fix

From

Peter Eisentraut

Date:

03 November 2018, 09:37:00

On 03/11/2018 01:08, Tatsuo Ishii wrote:
> I like this. But maybe we can remove "named"?
> 
> "prepared: use extended query protocol with reused prepared statements"

I don't think this mouthful is useful in the --help output.  The
existing wording gets the message across just fine, I think.  More
details can be put in the reference page.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: pgbench doc fix

From

Fabien COELHO

Date:

03 November 2018, 10:16:30

>> "prepared: use extended query protocol with reused prepared statements"
>
> I don't think this mouthful is useful in the --help output.  The
> existing wording gets the message across just fine, I think.  More
> details can be put in the reference page.

These suggestions are for the online doc page, and possibly an internal 
comment in the code.

The pgbench --help just states the 3 possible values and says which is
the default, and indeed cannot say much more.

-- 
Fabien.

Re: pgbench doc fix

From

Dmitry Dolgov

Date:

30 November 2018, 14:42:34

> On Sat, Nov 3, 2018 at 1:08 AM Tatsuo Ishii <ishii@sraoss.co.jp> wrote:
>
> > So I do not think a more precise wording harms. Maybe: "prepared: use
> > extended query protocol with REUSED named prepared statements" would
> > be even slightly less ambiguous.
>
> I like this. But maybe we can remove "named"?

I also think it makes sense to adjust wording a bit here, and this version
sounds good (taking into account the commentary about "named"). I'm moving this
to the next CF, where the question would be if anyone from committers can agree
with this point.

Re: pgbench doc fix

From

Peter Eisentraut

Date:

30 November 2018, 18:33:26

On 30/11/2018 15:42, Dmitry Dolgov wrote:
>> On Sat, Nov 3, 2018 at 1:08 AM Tatsuo Ishii <ishii@sraoss.co.jp> wrote:
>>
>>> So I do not think a more precise wording harms. Maybe: "prepared: use
>>> extended query protocol with REUSED named prepared statements" would
>>> be even slightly less ambiguous.
>>
>> I like this. But maybe we can remove "named"?
> 
> I also think it makes sense to adjust wording a bit here, and this version
> sounds good (taking into account the commentary about "named"). I'm moving this
> to the next CF, where the question would be if anyone from committers can agree
> with this point.

I don't see a concrete proposed patch here after the discussion.

Reading the documentation again, we could go for much more detail here.
For example, what's the point of having -M simple vs -M extended?

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: pgbench doc fix

From

Fabien COELHO

Date:

30 November 2018, 20:04:11

>>>> So I do not think a more precise wording harms. Maybe: "prepared: use
>>>> extended query protocol with REUSED named prepared statements" would
>>>> be even slightly less ambiguous.
>>>
>>> I like this. But maybe we can remove "named"?
>>
>> I also think it makes sense to adjust wording a bit here, and this version
>> sounds good (taking into account the commentary about "named"). I'm moving this
>> to the next CF, where the question would be if anyone from committers can agree
>> with this point.
>
> I don't see a concrete proposed patch here after the discussion.
>
> Reading the documentation again, we could go for much more detail here.
> For example, what's the point of having -M simple vs -M extended?

They do not use the same libpq-level approach (PQsendQuery vs 
PQsendQueryParams), so they are not exercising the same type of client? 
Pgbench is also about testing libpq performance.
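
As a compact summary of which libpq entry points each mode exercises, here is
a small sketch built only from the function names mentioned in this thread.
It is not pgbench's code: the helper names parse_query_mode and
libpq_calls_for_mode and the two lookup tables are hypothetical, and only the
enum values are taken from the patch.

```c
#include <assert.h>
#include <string.h>

/* Same three modes as the QueryMode enum shown in the patch. */
typedef enum
{
    QUERY_SIMPLE,
    QUERY_EXTENDED,
    QUERY_PREPARED,
    NUM_QUERYMODE
} QueryMode;

/* Values accepted by -M / --protocol, in enum order. */
static const char *const mode_names[NUM_QUERYMODE] = {
    "simple", "extended", "prepared"
};

/* libpq functions named in this thread for each mode, in enum order. */
static const char *const mode_calls[NUM_QUERYMODE] = {
    "PQsendQuery",
    "PQsendQueryParams",
    "PQsendPrepare + PQsendQueryPrepared"
};

/* Hypothetical helper: map a mode name to its enum value.
 * Returns NUM_QUERYMODE when the name is not recognized. */
QueryMode
parse_query_mode(const char *name)
{
    int         i;

    for (i = 0; i < NUM_QUERYMODE; i++)
        if (strcmp(name, mode_names[i]) == 0)
            return (QueryMode) i;
    return NUM_QUERYMODE;
}

/* Hypothetical helper: describe the libpq calls a mode relies on. */
const char *
libpq_calls_for_mode(QueryMode mode)
{
    assert((unsigned) mode < (unsigned) NUM_QUERYMODE);
    return mode_calls[mode];
}
```

For example, libpq_calls_for_mode(parse_query_mode("extended")) yields
"PQsendQueryParams", which is the code path that differs from "simple".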

-- 
Fabien.

Re: pgbench doc fix

From

Tatsuo Ishii

Date:

03 December 2018, 05:33:15

>>>>> So I do not think a more precise wording harms. Maybe: "prepared: use
>>>>> extended query protocol with REUSED named prepared statements" would
>>>>> be even slightly less ambiguous.
>>>>
>>>> I like this. But maybe we can remove "named"?
>>>
>>> I also think it makes sense to adjust wording a bit here, and this
>>> version
>>> sounds good (taking into account the commentary about "named"). I'm
>>> moving this
>>> to the next CF, where the question would be if anyone from committers
>>> can agree
>>> with this point.
>>
>> I don't see a concrete proposed patch here after the discussion.
>>
>> Reading the documentation again, we could go for much more detail
>> here.
>> For example, what's the point of having -M simple vs -M extended?
> 
> They do not use the same libpq-level approach (PQsendQuery vs
> PQsendQueryParams), so they are not exercising the same type of
> client? Pgbench is also about testing libpq performance.

Yes, -M extended is pretty slow because for each query it needs to
send parse/bind/describe/execute messages.  -M prepared is much faster
because for the second and subsequent iterations of a query it does not
need to perform parse analysis, which means not only that fewer messages
are exchanged but also that the parse analysis itself is skipped.

Here are quick test results using pgbench -S -M $mode -c 10 -T 30,
where $mode is "simple", "extended" or "prepared", on my Ubuntu 18
laptop. The TPS numbers are the average of 3 trials for each mode,
taken from the "excluding connections establishing" figure.

simple:
48804.8383 TPS

extended:
39735.4278 TPS

prepared:
83459.2293 TPS

So "prepared" is roughly 2x faster than "extended".
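
The "roughly 2x" figure can be checked directly from the three averages
above. The following is a minimal sketch of that arithmetic; the constant
names and the tps_ratio helper are hypothetical, and the only data used are
the TPS values reported in this message.

```c
#include <assert.h>

/* Average TPS values copied from the results above. */
const double TPS_SIMPLE = 48804.8383;
const double TPS_EXTENDED = 39735.4278;
const double TPS_PREPARED = 83459.2293;

/* Hypothetical helper: throughput of one mode relative to a baseline.
 * Both figures must be positive. */
double
tps_ratio(double tps, double baseline_tps)
{
    assert(tps > 0.0 && baseline_tps > 0.0);
    return tps / baseline_tps;
}
```

With these inputs, tps_ratio(TPS_PREPARED, TPS_EXTENDED) comes out at about
2.10, tps_ratio(TPS_PREPARED, TPS_SIMPLE) at about 1.71, and
tps_ratio(TPS_EXTENDED, TPS_SIMPLE) at about 0.81. These are ratios from a
single laptop run, so they should be read as indicative only.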

Based on this, I would suggest modifying the existing description in
the pgbench doc regarding "-M querymode":

From:
-------------------------------------------------------------------
-M querymode
--protocol=querymode

    Protocol to use for submitting queries to the server:

        simple: use simple query protocol.

        extended: use extended query protocol.

        prepared: use extended query protocol with prepared statements.
-------------------------------------------------------------------

To:
-------------------------------------------------------------------
-M querymode
--protocol=querymode

    Protocol to use for submitting queries to the server:

        simple: use simple query protocol.

        extended: use extended query protocol.

        prepared: use extended query protocol with prepared statements.

    Because in "prepared" mode pgbench reuses the parse analysis
    result for the second and subsequent query iteration, pgbench runs
    faster in the prepared mode than in other modes.
-------------------------------------------------------------------

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp
diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index b5e3a62a33..c2a2ff9707 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -473,6 +473,13 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>d
             <para><literal>prepared</literal>: use extended query protocol with prepared statements.</para>
            </listitem>
           </itemizedlist>
+
+        Because in "prepared" mode <application>pgbench</application> reuses
+        the parse analysis result for the second and subsequent query
+        iteration, <application>pgbench</application> runs faster in the
+        prepared mode than in other modes.
+       </para>
+       <para>
         The default is simple query protocol.  (See <xref linkend="protocol"/>
         for more information.)
        </para>

Re: pgbench doc fix

From

Alvaro Herrera

Date:

02 January 2019, 19:36:17

On 2018-Dec-03, Tatsuo Ishii wrote:

> To:
> -------------------------------------------------------------------
> -M querymode
> --protocol=querymode
> 
>     Protocol to use for submitting queries to the server:
> 
>         simple: use simple query protocol.
> 
>         extended: use extended query protocol.
> 
>         prepared: use extended query protocol with prepared statements.
> 
>     Because in "prepared" mode pgbench reuses the parse analysis
>     result for the second and subsequent query iteration, pgbench runs
>     faster in the prepared mode than in other modes.

Looks good to me.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: pgbench doc fix

From

Tatsuo Ishii

Date:

16 January 2019, 05:20:16

> On 2018-Dec-03, Tatsuo Ishii wrote:
> 
>> To:
>> -------------------------------------------------------------------
>> -M querymode
>> --protocol=querymode
>> 
>>     Protocol to use for submitting queries to the server:
>> 
>>         simple: use simple query protocol.
>> 
>>         extended: use extended query protocol.
>> 
>>         prepared: use extended query protocol with prepared statements.
>> 
>>     Because in "prepared" mode pgbench reuses the parse analysis
>>     result for the second and subsequent query iteration, pgbench runs
>>     faster in the prepared mode than in other modes.
> 
> Looks good to me.

Thanks. I'm going to commit this if there's no objection.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp

Re: pgbench doc fix

From

Tatsuo Ishii

Date:

17 January 2019, 06:48:55

>> On 2018-Dec-03, Tatsuo Ishii wrote:
>> 
>>> To:
>>> -------------------------------------------------------------------
>>> -M querymode
>>> --protocol=querymode
>>> 
>>>     Protocol to use for submitting queries to the server:
>>> 
>>>         simple: use simple query protocol.
>>> 
>>>         extended: use extended query protocol.
>>> 
>>>         prepared: use extended query protocol with prepared statements.
>>> 
>>>     Because in "prepared" mode pgbench reuses the parse analysis
>>>     result for the second and subsequent query iteration, pgbench runs
>>>     faster in the prepared mode than in other modes.
>> 
>> Looks good to me.
> 
> Thanks. I'm going to commit this if there's no objection.

Done. Thanks.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp