Thread: pgbench: using prepared BEGIN statement in a pipeline could cause an error

pgbench: using prepared BEGIN statement in a pipeline could cause an error

From

Yugo NAGATA

Date:

16 July 2021, 06:30:13

Hello,

I found that using "BEGIN ISOLATINO LEVEL SERIALIZABLE" in a pipline with
prepared statement makes pgbench abort.

 $ cat pipeline.sql 
 \startpipeline
 begin isolation level repeatable read;
 select 1;
 end;
 \endpipeline

 $ pgbench -f pipeline.sql -M prepared -t 1
 pgbench (15devel)
 starting vacuum...end.
 pgbench: error: client 0 script 0 aborted in command 4 query 0: 
 transaction type: pipeline.sql
 scaling factor: 1
 query mode: prepared
 number of clients: 1
 number of threads: 1
 number of transactions per client: 1
 number of transactions actually processed: 0/1
 pgbench: fatal: Run was aborted; the above results are incomplete.

The error that occured in the backend was
"ERROR:  SET TRANSACTION ISOLATION LEVEL must be called before any query".

After investigating this, now I've got the cause as below. 

1. The commands in the script are executed in the order. First, pipeline
   mode starts at \startpipeline.
2. Parse messages for all SQL commands in the script are sent to the backend
   because it is first time to execute them.
3. An implicit transaction starts, and this is not committed yet because Sync
   message is not sent at that time in pipeline mode.
4. All prepared statements are sent to the backend.
5. After processing \endpipeline, Sync is issued and all sent commands are
   executed.
6. However, the BEGIN doesn't start new transaction because the implicit
   transaction has already started.  The error above occurs because the snapshot
   was already created before the BEGIN command.

We can also see the similar error when using "BEGIN DEFERRABLE". 

One way to avoid these errors is to send Parse messages before pipeline mode
starts. I attached a patch to fix to prepare commands at starting of a script
instead of at the first execution of the command. 

Or, we can also avoid these errors by placing \startpipeline after the BEGIN, 
so it might be enogh just to note in the documentation. 

Actually, we also get an error just when there is another SQL command before the
BEGIN in a pipelne, as below, regardless to using prepared statement or not,
because this command cause an implicit transaction.

 \startpipeline
 select 0;
 begin isolation level repeatable read;
 select 1;
 end;
 \endpipeline

I think it is hard to prevent this error from pgbench without analysing command
strings. Therefore, noting  in the documentation that the first command in a pipeline
starts an implicit transaction might be useful for users.


What do you think?


Regards,
Yugo Nagata

-- 
Yugo NAGATA <nagata@sraoss.co.jp>

Attachment

pgbench-prepare.patch

Re: pgbench: using prepared BEGIN statement in a pipeline could cause an error

From

Fabien COELHO

Date:

17 July 2021, 05:03:01

Hello Yugo-san,

> [...] One way to avoid these errors is to send Parse messages before 
> pipeline mode starts. I attached a patch to fix to prepare commands at 
> starting of a script instead of at the first execution of the command.

> What do you think?

ISTM that moving prepare out of command execution is a good idea, so I'm 
in favor of this approach: the code is simpler and cleaner.

ISTM that a minor impact is that the preparation is not counted in the 
command performance statistics. I do not think that it is a problem, even 
if it would change detailed results under -C -r -M prepared.

Patch applies & compiles cleanly, global & local make check ok. However 
the issue is not tested. I think that the patch should add a tap test case 
for the problem being addressed.

I'd suggest to move the statement preparation call in the 
CSTATE_CHOOSE_SCRIPT case.

In comments: not yet -> needed.

-- 
Fabien.

Re: pgbench: using prepared BEGIN statement in a pipeline could cause an error

From

Yugo NAGATA

Date:

19 July 2021, 01:51:36

Hello Fabien,

On Sat, 17 Jul 2021 07:03:01 +0200 (CEST)
Fabien COELHO <coelho@cri.ensmp.fr> wrote:

> 
> Hello Yugo-san,
> 
> > [...] One way to avoid these errors is to send Parse messages before 
> > pipeline mode starts. I attached a patch to fix to prepare commands at 
> > starting of a script instead of at the first execution of the command.
> 
> > What do you think?
> 
> ISTM that moving prepare out of command execution is a good idea, so I'm 
> in favor of this approach: the code is simpler and cleaner.
> 
> ISTM that a minor impact is that the preparation is not counted in the 
> command performance statistics. I do not think that it is a problem, even 
> if it would change detailed results under -C -r -M prepared.

I agree with you. Currently, whether prepares are sent in pipeline mode or
not depends on whether the first SQL command is placed between \startpipeline
and \endpipeline regardless whether other commands are executed in pipeline
or not. ISTM, this behavior would be not intuitive for users.  Therefore, 
I think preparing all statements not using pipeline mode is not problem for now.

If some users would like to send prepares in  pipeline, I think it would be
better to provide more simple and direct way. For example, we prepare statements
in pipeline if the user use an option, or if the script has at least one
\startpipeline in their pipeline. Maybe,  --pipeline option is useful for users
who want to use pipeline mode for all queries in scirpts including built-in ones.
However, these features seems to be out of the patch proposed in this thread.

> Patch applies & compiles cleanly, global & local make check ok. However 
> the issue is not tested. I think that the patch should add a tap test case 
> for the problem being addressed.

Ok. I'll add a tap test to confirm the error I found is avoidable.

> I'd suggest to move the statement preparation call in the 
> CSTATE_CHOOSE_SCRIPT case.

I thought so at first, but I noticed we cannot do it at least if we are
using -C because the connection may not be established in the
CSTATE_CHOOSE_SCRIPT state. 

> In comments: not yet -> needed.

Thanks. I'll fix it.

Regards,
Yugo Nagata

-- 
Yugo NAGATA <nagata@sraoss.co.jp>

Re: pgbench: using prepared BEGIN statement in a pipeline could cause an error

From

Yugo NAGATA

Date:

21 July 2021, 01:49:09

On Mon, 19 Jul 2021 10:51:36 +0900
Yugo NAGATA <nagata@sraoss.co.jp> wrote:

> Hello Fabien,
> 
> On Sat, 17 Jul 2021 07:03:01 +0200 (CEST)
> Fabien COELHO <coelho@cri.ensmp.fr> wrote:
> 
> > 
> > Hello Yugo-san,
> > 
> > > [...] One way to avoid these errors is to send Parse messages before 
> > > pipeline mode starts. I attached a patch to fix to prepare commands at 
> > > starting of a script instead of at the first execution of the command.
> > 
> > > What do you think?
> > 
> > ISTM that moving prepare out of command execution is a good idea, so I'm 
> > in favor of this approach: the code is simpler and cleaner.
> > 
> > ISTM that a minor impact is that the preparation is not counted in the 
> > command performance statistics. I do not think that it is a problem, even 
> > if it would change detailed results under -C -r -M prepared.
> 
> I agree with you. Currently, whether prepares are sent in pipeline mode or
> not depends on whether the first SQL command is placed between \startpipeline
> and \endpipeline regardless whether other commands are executed in pipeline
> or not. ISTM, this behavior would be not intuitive for users.  Therefore, 
> I think preparing all statements not using pipeline mode is not problem for now.
> 
> If some users would like to send prepares in  pipeline, I think it would be
> better to provide more simple and direct way. For example, we prepare statements
> in pipeline if the user use an option, or if the script has at least one
> \startpipeline in their pipeline. Maybe,  --pipeline option is useful for users
> who want to use pipeline mode for all queries in scirpts including built-in ones.
> However, these features seems to be out of the patch proposed in this thread.
> 
> > Patch applies & compiles cleanly, global & local make check ok. However 
> > the issue is not tested. I think that the patch should add a tap test case 
> > for the problem being addressed.
> 
> Ok. I'll add a tap test to confirm the error I found is avoidable.
> 
> > I'd suggest to move the statement preparation call in the 
> > CSTATE_CHOOSE_SCRIPT case.
> 
> I thought so at first, but I noticed we cannot do it at least if we are
> using -C because the connection may not be established in the
> CSTATE_CHOOSE_SCRIPT state. 
> 
> > In comments: not yet -> needed.
> 
> Thanks. I'll fix it.

I attached the updated patch v2, which includes a comment fix and a TAP test.

Regards,
Yugo Nagata

-- 
Yugo NAGATA <nagata@sraoss.co.jp>

Attachment

v2-pgbench-prepare.patch

Re: pgbench: using prepared BEGIN statement in a pipeline could cause an error

From

Daniel Gustafsson

Date:

15 November 2021, 13:13:32

> On 21 Jul 2021, at 03:49, Yugo NAGATA <nagata@sraoss.co.jp> wrote:

> I attached the updated patch v2, which includes a comment fix and a TAP test.

This patch fails the TAP test for pgbench:

  # Tests were run but no plan was declared and done_testing() was not seen.
  # Looks like your test exited with 25 just after 224.
  t/001_pgbench_with_server.pl ..
  Dubious, test returned 25 (wstat 6400, 0x1900)
  All 224 subtests passed
  t/002_pgbench_no_server.pl .... ok
  Test Summary Report
  -------------------
  t/001_pgbench_with_server.pl (Wstat: 6400 Tests: 224 Failed: 0)
  Non-zero exit status: 25
  Parse errors: No plan found in TAP output
  Files=2, Tests=426, 3 wallclock secs ( 0.04 usr 0.00 sys + 1.20 cusr 0.36 csys = 1.60 CPU)
  Result: FAIL

--
Daniel Gustafsson        https://vmware.com/

Re: pgbench: using prepared BEGIN statement in a pipeline could cause an error

From

Yugo NAGATA

Date:

15 November 2021, 17:26:43

Hello Daniel Gustafsson,

On Mon, 15 Nov 2021 14:13:32 +0100
Daniel Gustafsson <daniel@yesql.se> wrote:

> > On 21 Jul 2021, at 03:49, Yugo NAGATA <nagata@sraoss.co.jp> wrote:
> 
> > I attached the updated patch v2, which includes a comment fix and a TAP test.
> 
> This patch fails the TAP test for pgbench:

Thank you for pointing it out!
I attached the updated patch.

Regards,
Yugo Nagata

>   # Tests were run but no plan was declared and done_testing() was not seen.
>   # Looks like your test exited with 25 just after 224.
>   t/001_pgbench_with_server.pl ..
>   Dubious, test returned 25 (wstat 6400, 0x1900)
>   All 224 subtests passed
>   t/002_pgbench_no_server.pl .... ok
>   Test Summary Report
>   -------------------
>   t/001_pgbench_with_server.pl (Wstat: 6400 Tests: 224 Failed: 0)
>   Non-zero exit status: 25
>   Parse errors: No plan found in TAP output
>   Files=2, Tests=426, 3 wallclock secs ( 0.04 usr 0.00 sys + 1.20 cusr 0.36 csys = 1.60 CPU)
>   Result: FAIL
> 
> --
> Daniel Gustafsson        https://vmware.com/
> 


-- 
Yugo NAGATA <nagata@sraoss.co.jp>

Attachment

v3-pgbench-prepare.patch

Re: pgbench: using prepared BEGIN statement in a pipeline could cause an error

From

Kyotaro Horiguchi

Date:

27 January 2022, 08:50:25

At Tue, 16 Nov 2021 02:26:43 +0900, Yugo NAGATA <nagata@sraoss.co.jp> wrote in 
> Thank you for pointing it out!
> I attached the updated patch.

I think we want more elabolative comment for the new place of
preparing as you mentioned in the first mail.

At Fri, 16 Jul 2021 15:30:13 +0900, Yugo NAGATA <nagata@sraoss.co.jp> wrote in 
> One way to avoid these errors is to send Parse messages before
> pipeline mode starts.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Re: pgbench: using prepared BEGIN statement in a pipeline could cause an error

From

Yugo NAGATA

Date:

01 March 2022, 06:55:59

Hi Horiguchi-san,

On Thu, 27 Jan 2022 17:50:25 +0900 (JST)
Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:

> At Tue, 16 Nov 2021 02:26:43 +0900, Yugo NAGATA <nagata@sraoss.co.jp> wrote in 
> > Thank you for pointing it out!
> > I attached the updated patch.
> 
> I think we want more elabolative comment for the new place of
> preparing as you mentioned in the first mail.

Thank you for your suggestion.

I added comments on the prepareCommands() call as in the updated patch.

Regards,
Yugo Nagata


Yugo NAGATA <nagata@sraoss.co.jp>

Attachment

v4-pgbench-prepare.patch

Re: pgbench: using prepared BEGIN statement in a pipeline could cause an error

From

Tom Lane

Date:

25 March 2022, 20:19:54

Fabien COELHO <coelho@cri.ensmp.fr> writes:
>> [...] One way to avoid these errors is to send Parse messages before 
>> pipeline mode starts. I attached a patch to fix to prepare commands at 
>> starting of a script instead of at the first execution of the command.

> ISTM that moving prepare out of command execution is a good idea, so I'm 
> in favor of this approach: the code is simpler and cleaner.
> ISTM that a minor impact is that the preparation is not counted in the 
> command performance statistics. I do not think that it is a problem, even 
> if it would change detailed results under -C -r -M prepared.

I am not convinced this is a great idea.  The current behavior is that
a statement is not prepared until it's about to be executed, and I think
we chose that deliberately to avoid semantic differences between prepared
and not-prepared mode.  For example, if a script looks like

CREATE FUNCTION foo(...) ...;
SELECT foo(...);
DROP FUNCTION foo;

trying to prepare the SELECT in advance would lead to failure.

We could perhaps get away with preparing the commands within a pipeline
just before we start to execute the pipeline, but it looks to me like
this patch tries to prepare the entire script in advance.

BTW, the cfbot says the patch is failing to apply anyway ...
I think it was sideswiped by 4a39f87ac.

            regards, tom lane

Re: pgbench: using prepared BEGIN statement in a pipeline could cause an error

From

Yugo NAGATA

Date:

28 March 2022, 03:33:16

On Fri, 25 Mar 2022 16:19:54 -0400
Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Fabien COELHO <coelho@cri.ensmp.fr> writes:
> >> [...] One way to avoid these errors is to send Parse messages before 
> >> pipeline mode starts. I attached a patch to fix to prepare commands at 
> >> starting of a script instead of at the first execution of the command.
> 
> > ISTM that moving prepare out of command execution is a good idea, so I'm 
> > in favor of this approach: the code is simpler and cleaner.
> > ISTM that a minor impact is that the preparation is not counted in the 
> > command performance statistics. I do not think that it is a problem, even 
> > if it would change detailed results under -C -r -M prepared.
> 
> I am not convinced this is a great idea.  The current behavior is that
> a statement is not prepared until it's about to be executed, and I think
> we chose that deliberately to avoid semantic differences between prepared
> and not-prepared mode.  For example, if a script looks like
> 
> CREATE FUNCTION foo(...) ...;
> SELECT foo(...);
> DROP FUNCTION foo;
> 
> trying to prepare the SELECT in advance would lead to failure.
>
> We could perhaps get away with preparing the commands within a pipeline
> just before we start to execute the pipeline, but it looks to me like
> this patch tries to prepare the entire script in advance.
> 
Well, the semantic differences is already in the current behavior.
Currently, pgbench fails to execute the above script in prepared mode
because it prepares the entire script in advance just before the first
command execution. This patch does not change the semantic.

> BTW, the cfbot says the patch is failing to apply anyway ...
> I think it was sideswiped by 4a39f87ac.

I attached the rebased patch.

Regards,
Yugo Nagata

-- 
Yugo NAGATA <nagata@sraoss.co.jp>

Attachment

v5-pgbench-prepare.patch

Re: pgbench: using prepared BEGIN statement in a pipeline could cause an error

From

Ibrar Ahmed

Date:

03 September 2022, 05:36:37

On Mon, Mar 28, 2022 at 8:35 AM Yugo NAGATA <nagata@sraoss.co.jp> wrote:

On Fri, 25 Mar 2022 16:19:54 -0400
Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Fabien COELHO <coelho@cri.ensmp.fr> writes:
> >> [...] One way to avoid these errors is to send Parse messages before
> >> pipeline mode starts. I attached a patch to fix to prepare commands at
> >> starting of a script instead of at the first execution of the command.
>
> > ISTM that moving prepare out of command execution is a good idea, so I'm
> > in favor of this approach: the code is simpler and cleaner.
> > ISTM that a minor impact is that the preparation is not counted in the
> > command performance statistics. I do not think that it is a problem, even
> > if it would change detailed results under -C -r -M prepared.
>
> I am not convinced this is a great idea. The current behavior is that
> a statement is not prepared until it's about to be executed, and I think
> we chose that deliberately to avoid semantic differences between prepared
> and not-prepared mode. For example, if a script looks like
>
> CREATE FUNCTION foo(...) ...;
> SELECT foo(...);
> DROP FUNCTION foo;
>
> trying to prepare the SELECT in advance would lead to failure.
>
> We could perhaps get away with preparing the commands within a pipeline
> just before we start to execute the pipeline, but it looks to me like
> this patch tries to prepare the entire script in advance.
>
Well, the semantic differences is already in the current behavior.
Currently, pgbench fails to execute the above script in prepared mode
because it prepares the entire script in advance just before the first
command execution. This patch does not change the semantic.

> BTW, the cfbot says the patch is failing to apply anyway ...
> I think it was sideswiped by 4a39f87ac.

I attached the rebased patch.

Regards,
Yugo Nagata

--
Yugo NAGATA <nagata@sraoss.co.jp>

Hi Kyotaro Horiguchi, Fabien Coelho, Daniel Gustafsson,

Since you haven't had time to write a review the last many days, the author replied
with a rebased patch for a long time and never heard. We've taken your name
off the reviewer list for this patch. Of course, you are still welcome to review it if you can
find the time. We're removing your name so that other reviewers know the patch still needs
attention. We understand that day jobs and other things get in the way of doing patch
reviews when you want to, so please come back and review a patch or two later when you
have more time.

Ibrar Ahmed

Re: pgbench: using prepared BEGIN statement in a pipeline could cause an error

From

Julien Rouhaud

Date:

03 September 2022, 07:09:03

Hi,

On Sat, Sep 03, 2022 at 10:36:37AM +0500, Ibrar Ahmed wrote:
>
> Hi Kyotaro Horiguchi, Fabien Coelho, Daniel Gustafsson,
>
> Since you haven't had time to write a review the last many days, the author
> replied
> with a rebased patch for a long time and never heard. We've taken your name
> off the reviewer list for this patch. Of course, you are still welcome to
> review it if you can
> find the time. We're removing your name so that other reviewers know the
> patch still needs
> attention. We understand that day jobs and other things get in the way of
> doing patch
> reviews when you want to, so please come back and review a patch or two
> later when you
> have more time.

I thought that we decided not to remove assigned reviewers from a CF entry,
even if they didn't reply recently?  See the discussion around
https://www.postgresql.org/message-id/CA%2BTgmoZSBNhX0zCkG5T5KiQize9Aq4%2Bec%2BuqLcfBhm_%2B12MbQA%40mail.gmail.com

Re: pgbench: using prepared BEGIN statement in a pipeline could cause an error

From

Ibrar Ahmed

Date:

04 September 2022, 07:56:58

On Sat, Sep 3, 2022 at 12:09 PM Julien Rouhaud <rjuju123@gmail.com> wrote:

Hi,

On Sat, Sep 03, 2022 at 10:36:37AM +0500, Ibrar Ahmed wrote:
>
> Hi Kyotaro Horiguchi, Fabien Coelho, Daniel Gustafsson,
>
> Since you haven't had time to write a review the last many days, the author
> replied
> with a rebased patch for a long time and never heard. We've taken your name
> off the reviewer list for this patch. Of course, you are still welcome to
> review it if you can
> find the time. We're removing your name so that other reviewers know the
> patch still needs
> attention. We understand that day jobs and other things get in the way of
> doing patch
> reviews when you want to, so please come back and review a patch or two
> later when you
> have more time.

I thought that we decided not to remove assigned reviewers from a CF entry,
even if they didn't reply recently? See the discussion around
https://www.postgresql.org/message-id/CA%2BTgmoZSBNhX0zCkG5T5KiQize9Aq4%2Bec%2BuqLcfBhm_%2B12MbQA%40mail.gmail.com

Ah, ok, thanks for the clarification. I will add them back.

@Jacob Champion, we need to update the CommitFest Checklist [1] document accordingly.

"Reviewer Clear
[reviewer name]:

Since you haven't had time to write a review of [patch] in the last 5 days,
we've taken your name off the reviewer list for this patch."

[1] https://wiki.postgresql.org/wiki/CommitFest_Checklist

Ibrar Ahmed

Re: pgbench: using prepared BEGIN statement in a pipeline could cause an error

From

Alvaro Herrera

Date:

12 September 2022, 15:03:43

On 2022-Mar-28, Yugo NAGATA wrote:

> On Fri, 25 Mar 2022 16:19:54 -0400
> Tom Lane <tgl@sss.pgh.pa.us> wrote:

> > I am not convinced this is a great idea.  The current behavior is that
> > a statement is not prepared until it's about to be executed, and I think
> > we chose that deliberately to avoid semantic differences between prepared
> > and not-prepared mode.  For example, if a script looks like
> > 
> > CREATE FUNCTION foo(...) ...;
> > SELECT foo(...);
> > DROP FUNCTION foo;
> > 
> > trying to prepare the SELECT in advance would lead to failure.
> >
> > We could perhaps get away with preparing the commands within a pipeline
> > just before we start to execute the pipeline, but it looks to me like
> > this patch tries to prepare the entire script in advance.

Maybe it would work to have one extra boolean in struct Command, indicating
that the i-th command in the script is inside a pipeline; in -M
prepared, issue PREPARE for each command marked with that flag ahead of
time, and for all other commands, do as today.  That way, we don't
change behavior for anything except those commands that need the change.

-- 
Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"Digital and video cameras have this adjustment and film cameras don't for the
same reason dogs and cats lick themselves: because they can."   (Ken Rockwell)

Re: pgbench: using prepared BEGIN statement in a pipeline could cause an error

From

Yugo NAGATA

Date:

30 September 2022, 01:07:43

Hi,

On Mon, 12 Sep 2022 17:03:43 +0200
Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:

> On 2022-Mar-28, Yugo NAGATA wrote:
> 
> > On Fri, 25 Mar 2022 16:19:54 -0400
> > Tom Lane <tgl@sss.pgh.pa.us> wrote:
> 
> > > I am not convinced this is a great idea.  The current behavior is that
> > > a statement is not prepared until it's about to be executed, and I think
> > > we chose that deliberately to avoid semantic differences between prepared
> > > and not-prepared mode.  For example, if a script looks like
> > > 
> > > CREATE FUNCTION foo(...) ...;
> > > SELECT foo(...);
> > > DROP FUNCTION foo;
> > > 
> > > trying to prepare the SELECT in advance would lead to failure.
> > >
> > > We could perhaps get away with preparing the commands within a pipeline
> > > just before we start to execute the pipeline, but it looks to me like
> > > this patch tries to prepare the entire script in advance.
> 
> Maybe it would work to have one extra boolean in struct Command, indicating
> that the i-th command in the script is inside a pipeline; in -M
> prepared, issue PREPARE for each command marked with that flag ahead of
> time, and for all other commands, do as today.  That way, we don't
> change behavior for anything except those commands that need the change.

Well, I still don't understand why we need to prepare only "the
commands within a pipeline" before starting pipeline.  In the current
behavior,  the entire script is prepared in advance just before executing
the first SQL command in the script, instead of preparing each command
one by one. This patch also prepare the entire script in advance, so
there is no behavioural change in this sense.

However, there are a few behavioural changes. One is that the preparation
is not counted in the command performance statistics as Fabien mentioned.
Another is that all meta-commands including \shell and \sleep etc. are
executed before the preparation.

To reduce impact of these changes, I updated the patch to prepare the
commands just before executing the first SQL command or \startpipeline
meta-command instead of at the beginning of the script. 

Regards,
Yugo Nagata

> 
> -- 
> Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
> "Digital and video cameras have this adjustment and film cameras don't for the
> same reason dogs and cats lick themselves: because they can."   (Ken Rockwell)

-- 
Yugo NAGATA <nagata@sraoss.co.jp>

Attachment

v6-pgbench-prepare.patch

Re: pgbench: using prepared BEGIN statement in a pipeline could cause an error

From

Alvaro Herrera

Date:

06 February 2023, 10:11:32

I'm writing my own patch for this problem.  While playing around with
it, I noticed this:

struct Command {
    PQExpBufferData            lines;                /*     0    24 */
    char *                     first_line;           /*    24     8 */
    int                        type;                 /*    32     4 */
    MetaCommand                meta;                 /*    36     4 */
    int                        argc;                 /*    40     4 */

    /* XXX 4 bytes hole, try to pack */

    char *                     argv[256];            /*    48  2048 */
    /* --- cacheline 32 boundary (2048 bytes) was 48 bytes ago --- */
    char *                     varprefix;            /*  2096     8 */
    PgBenchExpr *              expr;                 /*  2104     8 */
    /* --- cacheline 33 boundary (2112 bytes) --- */
    SimpleStats                stats;                /*  2112    40 */
    int64                      retries;              /*  2152     8 */
    int64                      failures;             /*  2160     8 */

    /* size: 2168, cachelines: 34, members: 11 */
    /* sum members: 2164, holes: 1, sum holes: 4 */
    /* last cacheline: 56 bytes */
};

Not great.  I suppose this makes pgbench slower than it needs to be.
Can we do better?

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"Nunca se desea ardientemente lo que solo se desea por razón" (F. Alexandre)

Re: pgbench: using prepared BEGIN statement in a pipeline could cause an error

From

Tom Lane

Date:

06 February 2023, 15:23:01

Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> I'm writing my own patch for this problem.  While playing around with
> it, I noticed this:

> struct Command {
>     /* size: 2168, cachelines: 34, members: 11 */
>     /* sum members: 2164, holes: 1, sum holes: 4 */
>     /* last cacheline: 56 bytes */
> };

I think the original intent was for argv[] to be at the end,
which fell victim to ye olde add-at-the-end antipattern.
Cache-friendliness-wise, putting it back to the end would
likely be enough.  But turning it into a variable-size array
would be better from a functionality standpoint.

            regards, tom lane

Re: pgbench: using prepared BEGIN statement in a pipeline could cause an error

From

Alvaro Herrera

Date:

08 February 2023, 12:09:28

On 2022-Sep-30, Yugo NAGATA wrote:

> Well, I still don't understand why we need to prepare only "the
> commands within a pipeline" before starting pipeline.  In the current
> behavior,  the entire script is prepared in advance just before executing
> the first SQL command in the script, instead of preparing each command
> one by one. This patch also prepare the entire script in advance, so
> there is no behavioural change in this sense.
> 
> However, there are a few behavioural changes. One is that the preparation
> is not counted in the command performance statistics as Fabien mentioned.
> Another is that all meta-commands including \shell and \sleep etc. are
> executed before the preparation.
> 
> To reduce impact of these changes, I updated the patch to prepare the
> commands just before executing the first SQL command or \startpipeline
> meta-command instead of at the beginning of the script. 

I propose instead the following: each command is prepared just before
it's executed, as previously, and if we see a \startpipeline, then we
prepare all commands starting with the one just after, and until the
\endpipeline.

I didn't test additional cases other than the one you submitted.

Testing this I noticed that pg_log_debug et al don't support
multithreading very well -- the lines are interspersed.

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/

Re: pgbench: using prepared BEGIN statement in a pipeline could cause an error

From

Alvaro Herrera

Date:

08 February 2023, 12:10:40

On 2023-Feb-08, Alvaro Herrera wrote:

> I propose instead the following: each command is prepared just before
> it's executed, as previously, and if we see a \startpipeline, then we
> prepare all commands starting with the one just after, and until the
> \endpipeline.

Here's the patch.

-- 
Álvaro Herrera        Breisgau, Deutschland  —  https://www.EnterpriseDB.com/

Attachment

v7-0001-Prepare-commands-in-pipelines-in-advance.patch

Re: pgbench: using prepared BEGIN statement in a pipeline could cause an error

From

Andres Freund

Date:

14 February 2023, 00:20:04

Hi,

On 2023-02-08 13:10:40 +0100, Alvaro Herrera wrote:
> On 2023-Feb-08, Alvaro Herrera wrote:
> 
> > I propose instead the following: each command is prepared just before
> > it's executed, as previously, and if we see a \startpipeline, then we
> > prepare all commands starting with the one just after, and until the
> > \endpipeline.
> 
> Here's the patch.

There's something wrong with the patch, it reliably fails with core dumps:
https://cirrus-ci.com/github/postgresql-cfbot/postgresql/commitfest%2F42%2F3260

Example crash:
https://api.cirrus-ci.com/v1/task/4922406553255936/logs/cores.log

https://api.cirrus-ci.com/v1/artifact/task/6611256413519872/crashlog/crashlog-pgbench.EXE_0750_2023-02-13_14-07-06-189.txt

Andres

Re: pgbench: using prepared BEGIN statement in a pipeline could cause an error

From

Alvaro Herrera

Date:

17 February 2023, 20:35:12

On 2023-Feb-13, Andres Freund wrote:

> There's something wrong with the patch, it reliably fails with core dumps:
> https://cirrus-ci.com/github/postgresql-cfbot/postgresql/commitfest%2F42%2F3260

I think this would happen on machines where sizeof(bool) is not 1 (which
mine is evidently not).  Fixed.

In addition, there was the problem that the psprintf() to generate the
command name would race against each other if you had multiple threads.
I changed the code so that the name to prepare each statement under is
generated when the Command struct is first initialized, which occurs
before the threads are started.  One small issue is that now we use a
single counter for all commands of all scripts, rather than a
script-local counter.  This doesn't seem at all important.

I did realize that Nagata-san was right that we've always prepared the
whole script in advance; that behavior was there already in commit
49639a7b2c52 that introduced -Mprepared.  We've never done each command
just before executing it.

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/
Y una voz del caos me habló y me dijo
"Sonríe y sé feliz, podría ser peor".
Y sonreí. Y fui feliz.
Y fue peor.

Attachment

v8-0001-pgbench-Prepare-commands-in-pipelines-in-advance.patch

Re: pgbench: using prepared BEGIN statement in a pipeline could cause an error

From

Alvaro Herrera

Date:

20 February 2023, 20:17:47

Found one last problem: if you do "-f foobar.sql -M prepared" in that
order, then the prepare fails because the statement names would not be
assigned when the file is parsed.  This coding only supported doing
"-M prepared -f foobar.sql", which funnily enough is the only one that
PostgreSQL/Cluster.pm->pgbench() supports.  So I moved the prepared
statement name generation to the postprocess step.

-- 
Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"La rebeldía es la virtud original del hombre" (Arthur Schopenhauer)

Attachment

v9-0001-pgbench-Prepare-commands-in-pipelines-in-advance.patch

Re: pgbench: using prepared BEGIN statement in a pipeline could cause an error

From

Alvaro Herrera

Date:

21 February 2023, 16:32:49

On 2023-Feb-20, Alvaro Herrera wrote:

> Found one last problem: if you do "-f foobar.sql -M prepared" in that
> order, then the prepare fails because the statement names would not be
> assigned when the file is parsed.  This coding only supported doing
> "-M prepared -f foobar.sql", which funnily enough is the only one that
> PostgreSQL/Cluster.pm->pgbench() supports.  So I moved the prepared
> statement name generation to the postprocess step.

Pushed to all three branches -- thanks, Nagata-san, for diagnosing the
issue.

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"How strange it is to find the words "Perl" and "saner" in such close
proximity, with no apparent sense of irony. I doubt that Larry himself
could have managed it."         (ncm, http://lwn.net/Articles/174769/)

Re: pgbench: using prepared BEGIN statement in a pipeline could cause an error

From

Alexander Lakhin

Date:

20 May 2023, 16:00:01

Hello Alvaro,

21.02.2023 19:32, Alvaro Herrera wrote:
> On 2023-Feb-20, Alvaro Herrera wrote:
>
>> Found one last problem: if you do "-f foobar.sql -M prepared" in that
>> order, then the prepare fails because the statement names would not be
>> assigned when the file is parsed.  This coding only supported doing
>> "-M prepared -f foobar.sql", which funnily enough is the only one that
>> PostgreSQL/Cluster.pm->pgbench() supports.  So I moved the prepared
>> statement name generation to the postprocess step.
> Pushed to all three branches -- thanks, Nagata-san, for diagnosing the
> issue.

Starting from 038f586d5, the following script:
echo "
\startpipeline
\endpipeline
" >test.sql
pgbench -n -M prepared -f test.sql

leads to the pgbench's segfault:
Core was generated by `pgbench -n -M prepared -f test.sql'.
Program terminated with signal SIGSEGV, Segmentation fault.

warning: Section `.reg-xstate/2327306' in core file too small.
#0  0x0000555a402546b4 in prepareCommandsInPipeline (st=st@entry=0x555a409d62e0) at pgbench.c:3130
3130            st->prepared[st->use_file][st->command] = true;
(gdb) bt
#0  0x0000555a402546b4 in prepareCommandsInPipeline (st=st@entry=0x555a409d62e0) at pgbench.c:3130
#1  0x0000555a40257fca in executeMetaCommand (st=st@entry=0x555a409d62e0, now=now@entry=0x7ffdd46eff58)
     at pgbench.c:4413
#2  0x0000555a402585ce in advanceConnectionState (thread=thread@entry=0x555a409d6580, st=st@entry=0x555a409d62e0,
     agg=agg@entry=0x7ffdd46f0090) at pgbench.c:3807
#3  0x0000555a40259564 in threadRun (arg=arg@entry=0x555a409d6580) at pgbench.c:7535
#4  0x0000555a4025ca40 in main (argc=<optimized out>, argv=<optimized out>) at pgbench.c:7253

Best regards,
Alexander

Re: pgbench: using prepared BEGIN statement in a pipeline could cause an error

From

Alvaro Herrera

Date:

22 May 2023, 11:49:47

On 2023-May-20, Alexander Lakhin wrote:

> Starting from 038f586d5, the following script:
> echo "
> \startpipeline
> \endpipeline
> " >test.sql
> pgbench -n -M prepared -f test.sql
> 
> leads to the pgbench's segfault:

Hah, yeah, that's because an empty pipeline never calls the code to
allocate the flag array.  Here's the trivial fix.

-- 
Álvaro Herrera        Breisgau, Deutschland  —  https://www.EnterpriseDB.com/
"El hombre nunca sabe de lo que es capaz hasta que lo intenta" (C. Dickens)

Attachment

0001-Fix-pgbench-in-prepared-mode-with-empty-pipeline.patch

Re: pgbench: using prepared BEGIN statement in a pipeline could cause an error

From

Alvaro Herrera

Date:

25 May 2023, 10:39:19

On 2023-May-22, Alvaro Herrera wrote:

> Hah, yeah, that's because an empty pipeline never calls the code to
> allocate the flag array.  Here's the trivial fix.

Pushed to both branches, thanks for the report.

-- 
Álvaro Herrera        Breisgau, Deutschland  —  https://www.EnterpriseDB.com/