Thread: pgbench - minor fix for meta command only scripts

pgbench - minor fix for meta command only scripts

From

Fabien COELHO

Date:

09 July 2016, 07:09:20

While testing meta-command-only pgbench scripts, I noticed that there is
an infinite loop in threadRun, which means that other tasks such as
reporting progress never get a chance to run.

The attached patch breaks this loop by always returning at the end of a
script.

On "pgbench -T 3 -P 1 -f noop.sql", before this patch, the progress is not
shown, after it is.
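
As an illustration of the problem described above, here is a toy model in C. It is not pgbench code: ToyClient, toy_do_custom and the budget parameter are hypothetical names made up for this sketch, and the budget only exists so that the non-returning variant can be observed without hanging.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for a pgbench client. */
typedef struct
{
    int         state;          /* index of the current command */
    int         executed;       /* number of meta commands run so far */
} ToyClient;

/*
 * Run the meta commands of a NULL-terminated script.
 *
 * With return_at_script_end set, control goes back to the caller once the
 * last command has run.  Without it, the function keeps looping and only
 * gives up when 'budget' commands have been executed.
 *
 * Returns true if it yielded voluntarily at the end of the script.
 */
static bool
toy_do_custom(ToyClient *st, const char *const *commands,
              bool return_at_script_end, int budget)
{
    assert(commands[0] != NULL);

top:
    if (st->executed >= budget)
        return false;           /* would have spun forever */

    st->executed++;             /* "execute" the meta command */

    if (commands[st->state + 1] != NULL)
    {
        st->state++;
        goto top;               /* not the last command: proceed at once */
    }

    st->state = 0;              /* script finished, start it again */
    if (return_at_script_end)
        return true;            /* the caller can now report progress */
    goto top;
}
```

With a two-command script, the returning variant hands control back after two commands, while the other one never does until the artificial budget runs out.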

--
Fabien.

Attachment

Re: pgbench - minor fix for meta command only scripts

From

Michael Paquier

Date:

11 July 2016, 04:41:08

On Sat, Jul 9, 2016 at 4:09 PM, Fabien COELHO <coelho@cri.ensmp.fr> wrote:
>
> While testing meta-command pgbench only scripts, I noticed that there is an
> infinite loop in threadRun, which means that other tasks such as reporting
> progress do not get a chance.
>
> The attached patch breaks this loop by always returning at the end of a
> script.
>
> On "pgbench -T 3 -P 1 -f noop.sql", before this patch, the progress is not
> shown, after it is.

You may want to name your patches with .patch or .diff. Using .sql is
disturbing style :)

Indeed, not reporting the progress back to the client in the case of a
script with only meta commands is non-intuitive.

-       /* after a meta command, immediately proceed with next command */
-       goto top;
+       /*
+        * After a meta command, immediately proceed with next command...
+        * although not if last. This exception ensures that a meta command
+        * only script does not always loop in doCustom, so that other tasks
+        * in threadRun, eg progress reporting or switching client,
+        * get a chance.
+        */
+       if (commands[st->state + 1] != NULL)
+           goto top;

This looks good to me. I'd just rewrite the comment block with
something like that, more simplified:
+       /*
+        * After a meta command, immediately proceed with next command.
+        * But if this is the last command, just leave.
+        */
-- 
Michael

Re: pgbench - minor fix for meta command only scripts

From

Fabien COELHO

Date:

11 July 2016, 05:10:24

Hello Michaël,

> You may want to name your patches with .patch or .diff. Using .sql is
> disturbing style :)

Indeed! :-)

> Indeed, not reporting the progress back to the client in the case of a
> script with only meta commands is non-intuitive.
>
> This looks good to me. I'd just rewrite the comment block with
> something like that, more simplified:

Ok. Here is an updated version, with a better suffix and a simplified
comment.

Thanks,

--
Fabien.

Attachment

pgbench-no-sql-fix-2.patch

Re: pgbench - minor fix for meta command only scripts

From

Tom Lane

Date:

12 July 2016, 20:59:34

Fabien COELHO <coelho@cri.ensmp.fr> writes:
> Ok. Here is an updated version, with a better suffix and a simplified 
> comment.

Doesn't this break the handling of latency calculations, or at least make
the results completely different for the last metacommand than what they
would be for a non-last command?  It looks like it needs to loop back so
that the latency calculation is completed for the metacommand before it
can exit.  Seems to me it would probably make more sense to fall out at
the end of the "transaction finished" if-block, around line 1923 in HEAD.
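
To make the concern concrete, here is a small model in C of what happens when the function returns before the latency of the last meta command is recorded. All names (ToyStats, toy_latency_of_last_meta) and the time values are assumptions for this sketch, not pgbench internals.

```c
#include <assert.h>
#include <stdint.h>

typedef struct
{
    int64_t     stmt_begin;     /* when the current command started */
    int64_t     recorded;       /* latency recorded for it, -1 if none */
} ToyStats;

/* Record the latency of the current command as seen at time 'now'. */
static void
toy_record_latency(ToyStats *s, int64_t now)
{
    assert(now >= s->stmt_begin);
    s->recorded = now - s->stmt_begin;
}

/*
 * Latency attributed to the last meta command of a script.
 *
 * cmd_cost is what the command really took; caller_cost is the time the
 * caller spends on other work (progress report, other clients) before
 * the function is entered again.
 */
static int64_t
toy_latency_of_last_meta(int64_t begin, int64_t cmd_cost,
                         int64_t caller_cost, int record_before_return)
{
    ToyStats    s = {begin, -1};
    int64_t     now = begin + cmd_cost;     /* command is done */

    if (record_before_return)
        toy_record_latency(&s, now);

    now += caller_cost;                     /* control is with the caller */

    if (!record_before_return)
        toy_record_latency(&s, now);        /* recorded only on re-entry */

    return s.recorded;
}
```

In this model, a command that took 5 time units is reported as 5 when the latency is recorded before returning, but as 5 plus the caller's time when recording is deferred to the next entry.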

(The code structure in here seems like a complete mess to me, but probably
now is not the time to refactor it.)
        regards, tom lane

Re: pgbench - minor fix for meta command only scripts

From

Fabien COELHO

Date:

13 July 2016, 08:14:49

Hello Tom,

>> Ok. Here is an updated version, with a better suffix and a simplified
>> comment.
>
> Doesn't this break the handling of latency calculations, or at least make
> the results completely different for the last metacommand than what they
> would be for a non-last command?  It looks like it needs to loop back so
> that the latency calculation is completed for the metacommand before it
> can exit.  Seems to me it would probably make more sense to fall out at
> the end of the "transaction finished" if-block, around line 1923 in HEAD.

Indeed, it would slightly disturb the stats computation by delaying
the recording of the end of statement & transaction.

However line 1923 is a shortcut for ending pgbench, but at the end of a
transaction more stuff must be done, eg choosing the next script and
reconnecting, before exiting. The solution is more contrived.

The attached patch provides a solution which ensures the return in the
right condition and after the stat collection. The code structure requires
another ugly boolean to proceed so as to preserve doing the reconnection
between the decision that the return must be done and the place where it
can be done, after reconnecting.

> (The code structure in here seems like a complete mess to me, but probably
> now is not the time to refactor it.)

I fully agree that the code structure is a total mess:-( Maybe I'll try to
submit a simpler one some day.

Basically the doCustom function is not resilient, you cannot exit from
anywhere and hope that re-entering would achieve a consistent behavior.

While reading the code to find a better place for a return, I noted some
possible inconsistencies in recording stats, which are noted as comments
in the attached patch.

Calling chooseScript is done both from outside for initialization and from
inside doCustom, where it could be done once and more clearly in doCustom.

Boolean listen is not reset because the script is expected to execute
directly the start of the next statement. I succeeded in convincing myself
that it actually works, but it is unobvious to spot why. I think that a
simpler pattern would be welcome. Also, some other things (eg prepared)
are not reset in all cases, not sure why.

The goto should probably be replaced by a while.

...

--
Fabien.

Attachment

pgbench-latency-t-2.patch

Re: pgbench - minor fix for meta command only scripts

From

Fabien COELHO

Date:

13 July 2016, 08:21:15

> The attached patch provides a solution which ensures the return in the right
> condition and after the stat collection. The code structure requires another
> ugly boolean to proceed so as to preserve doing the reconnection between the
> decision that the return must be done and the place where it can be done,
> after reconnecting.

Ooops, the attached patch had the right content but was wrongly named:-(

Here it is again with a consistent name.

Sorry for the noise.

--
Fabien.

Attachment

pgbench-no-sql-fix-3.patch

Re: pgbench - minor fix for meta command only scripts

From

Heikki Linnakangas

Date:

19 September 2016, 18:48:31

On 07/13/2016 11:14 AM, Fabien COELHO wrote:
>> (The code structure in here seems like a complete mess to me, but probably
>> now is not the time to refactor it.)
>
> I fully agree that the code structure is a total mess:-( Maybe I'll try to
> submit a simpler one some day.
>
> Basically the doCustom function is not resilient, you cannot exit from
> anywhere and hope that re-entering would achieve a consistent behavior.
>
> While reading the code to find a better place for a return, I noted some
> possible inconsistencies in recording stats, which are noted as comments
> in the attached patch.
>
> Calling chooseScript is done both from outside for initialization and from
> inside doCustom, where it could be done once and more clearly in doCustom.
>
> Boolean listen is not reset because the script is expected to execute
> directly the start of the next statement. I succeeded in convincing myself
> that it actually works, but it is unobvious to spot why. I think that a
> simpler pattern would be welcome. Also, some other things (eg prepared)
> are not reset in all cases, not sure why.
>
> The goto should probably be replaced by a while.
>
> ...

Yeah, it really is quite a mess. I tried to review your patch, and I
think it's correct, but I couldn't totally convince myself, because of
the existing messiness of the logic. So I bit the bullet and started
refactoring.

I came up with the attached. It refactors the logic in doCustom() into a
state machine. I think this is much clearer, what do you think?
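
For readers who have not seen the patch, the general shape of such a state machine can be sketched as follows. This is a generic illustration only: the TOY_ states and ToyMachine are hypothetical and are not the states used in the attached patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical states, chosen for the sketch only. */
typedef enum
{
    TOY_START_COMMAND,
    TOY_WAIT_RESULT,
    TOY_END_COMMAND,
    TOY_END_SCRIPT
} ToyState;

typedef struct
{
    ToyState    state;
    int         command;        /* index of the current command */
    int         ncommands;      /* number of commands in the script */
} ToyMachine;

/*
 * Advance the machine as far as possible.  Returns false when it has to
 * wait for a result, true when the script is finished.  Because all the
 * progress is kept in *m, returning and re-entering is always safe.
 */
static bool
toy_advance(ToyMachine *m, bool result_ready)
{
    assert(m->ncommands > 0);

    for (;;)
    {
        switch (m->state)
        {
            case TOY_START_COMMAND:
                m->state = TOY_WAIT_RESULT;     /* "send" the command */
                break;

            case TOY_WAIT_RESULT:
                if (!result_ready)
                    return false;               /* yield to the caller */
                result_ready = false;           /* consume the result */
                m->state = TOY_END_COMMAND;
                break;

            case TOY_END_COMMAND:
                m->command++;
                m->state = (m->command < m->ncommands) ?
                    TOY_START_COMMAND : TOY_END_SCRIPT;
                break;

            case TOY_END_SCRIPT:
                return true;
        }
    }
}
```

The point of the pattern is that every exit happens at a well-defined state, so the caller can leave and come back without any hidden flags.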

> @@ -1892,6 +1895,7 @@ top:
>              /*
>               * Read and discard the query result; note this is not included in
>               * the statement latency numbers.
> +             * Should this be done before recording the statement stats?
>               */
>              res = PQgetResult(st->con);
>              switch (PQresultStatus(res))

Well, the comment right there says "note this is not included in the
statement latency numbers", so apparently it's intentional. Whether it's
a good idea or not, I don't know :-). It does seem a bit surprising.

But what seems more bogus to me is that we do that after recording the
*transaction* stats, if this was the last command. So the PQgetResult()
of the last command in the transaction is not included in the
transaction stats, even though the PQgetResult() calls for any previous
commands are. (Perhaps that's what you meant too?)

I changed that in my patch, it would've been inconvenient to keep that
old behavior, and it doesn't make any sense to me anyway.

- Heikki

Attachment

refactor-pgbench-doCustom.patch

Re: pgbench - minor fix for meta command only scripts

From

Fabien COELHO

Date:

20 September 2016, 05:10:01

Hello Heikki,

> Yeah, it really is quite a mess. I tried to review your patch, and I think 
> it's correct, but I couldn't totally convince myself, because of the existing 
> messiness of the logic.

Alas:-(

> So I bit the bullet and started refactoring.

Wow!

> I came up with the attached. It refactors the logic in doCustom() into a 
> state machine.

Sounds good! This can only help.

> I think this is much clearer, what do you think?

I think that something was really needed. I'm going to review and test 
this patch very carefully, probably over next week-end, and report.

-- 
Fabien.

Re: pgbench - minor fix for meta command only scripts

From

Fabien COELHO

Date:

24 September 2016, 09:45:31

Hello Heikki,

> Yeah, it really is quite a mess. I tried to review your patch, and I think
> it's correct, but I couldn't totally convince myself, because of the existing
> messiness of the logic. So I bit the bullet and started refactoring.
>
> I came up with the attached. It refactors the logic in doCustom() into a
> state machine. I think this is much clearer, what do you think?

The patch did not apply to master because you committed the sleep fix
in between. I updated the patch so that the fix is included as well.

I think that this is really needed. The code is much clearer and simpler to
understand with the state machines & additional functions. This is a
definite improvement to the code base.

I've done quite some testing with various options (-r, --rate,
--latency-limit, -C...) and got pretty reasonable results.

Although I cannot be absolutely sure that the refactoring does not
introduce any new bug, I'm convinced that it will be much easier to find
them:-)


Attached are some small changes to your version:

I have added the sleep_until fix.

I have fixed a bug introduced in the patch by changing && by || in the
(min_sec > 0 && maxsock != -1) condition which was inducing errors with
multi-threads & clients...

I have factored out several error messages in "commandFailed", in place of
the "metaCommandFailed", and added the script number as well in the error
messages. All messages are now specific to the failed command.

I have added two states to the machine:

  - CSTATE_CHOOSE_SCRIPT which simplifies threadRun, there is now one call
    to chooseScript instead of two before.

  - CSTATE_END_COMMAND which manages is_latencies and proceeding to the
    next command, thus merging the three instances of updating the stats
    that were in the first version.

The latter state means that processing query results is included in the per
statement latency, which is an improvement because before I was getting
some transaction latency significantly larger than the apparent sum of the
per-statement latencies, which did not make much sense...
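
The arithmetic behind this can be sketched in C. The model is deliberately simplified and every name in it (ToyCommand, exec_usec, result_usec and the two functions) is an assumption for the example, not something taken from pgbench.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct
{
    int64_t     exec_usec;      /* time until the result is available */
    int64_t     result_usec;    /* time spent processing the result */
} ToyCommand;

/* Sum of per-statement latencies, with or without result processing. */
static int64_t
toy_sum_statement_latencies(const ToyCommand *cmds, size_t n,
                            int include_result)
{
    int64_t     sum = 0;

    assert(cmds != NULL || n == 0);

    for (size_t i = 0; i < n; i++)
    {
        sum += cmds[i].exec_usec;
        if (include_result)
            sum += cmds[i].result_usec;
    }
    return sum;
}

/* In this model the transaction spans everything the commands did. */
static int64_t
toy_transaction_latency(const ToyCommand *cmds, size_t n)
{
    return toy_sum_statement_latencies(cmds, n, 1);
}
```

When result processing is left out of the per-statement figures, their sum falls short of the transaction latency by exactly the time spent on results; once it is included, the two agree in this model.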

I have added & updated a few comments. There are some places where the
break could be a pass through instead, not sure how desirable it is, I'm
fine with break.


> Well, the comment right there says "note this is not included in the
> statement latency numbers", so apparently it's intentional. Whether it's a
> good idea or not, I don't know :-). It does seem a bit surprising.

Indeed, it also results in apparently inconsistent numbers, and it creates
a mess for recording the statement latency because it meant that in some
cases the latency was collected before the actual end of the command, see
the discussion about CSTATE_END_COMMAND above.

> But what seems more bogus to me is that we do that after recording the
> *transaction* stats, if this was the last command. So the PQgetResult() of
> the last command in the transaction is not included in the transaction stats,
> even though the PQgetResult() calls for any previous commands are. (Perhaps
> that's what you meant too?)
>
> I changed that in my patch, it would've been inconvenient to keep that old
> behavior, and it doesn't make any sense to me anyway.

Fine with me.

--
Fabien.

Attachment

pgbench-refactor-2.patch

Re: pgbench - minor fix for meta command only scripts

From

Heikki Linnakangas

Date:

26 September 2016, 08:01:09

On 09/24/2016 12:45 PM, Fabien COELHO wrote:
> Although I cannot be absolutely sure that the refactoring does not
> introduce any new bug, I'm convinced that it will be much easier to find
> them:-)

:-)

> Attached are some small changes to your version:
>
> I have added the sleep_until fix.
>
> I have fixed a bug introduced in the patch by changing && by || in the
> (min_sec > 0 && maxsock != -1) condition which was inducing errors with
> multi-threads & clients...
>
> I have factored out several error messages in "commandFailed", in place of
> the "metaCommandFailed", and added the script number as well in the error
> messages. All messages are now specific to the failed command.
>
> I have added two states to the machine:
>
>   - CSTATE_CHOOSE_SCRIPT which simplifies threadRun, there is now one call
>     to chooseScript instead of two before.
>
>   - CSTATE_END_COMMAND which manages is_latencies and proceeding to the
>     next command, thus merging the three instances of updating the stats
>     that were in the first version.
>
> The latter state means that processing query results is included in the per
> statement latency, which is an improvement because before I was getting
> some transaction latency significantly larger than the apparent sum of the
> per-statement latencies, which did not make much sense...

Ok. I agree that makes more sense.

> I have added & updated a few comments.

Thanks! Committed.

> There are some places where the break could be a pass through
> instead, not sure how desirable it is, I'm fine with break.

I left them as "break". Pass-throughs are error-prone, and make it more 
difficult to read, IMHO. The compiler will optimize it into a 
pass-through anyway, if possible and worthwhile, so there should be no 
performance difference.
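
A small C illustration of the two styles, using hypothetical states (STEP_A, STEP_B, STEP_DONE) that have nothing to do with the actual pgbench states:

```c
#include <assert.h>

typedef enum
{
    STEP_A,
    STEP_B,
    STEP_DONE
} Step;

/* "break" style: each state sets the next one and goes around the loop. */
static int
run_with_break(Step s)
{
    int         work = 0;

    assert(s <= STEP_DONE);

    while (s != STEP_DONE)
    {
        switch (s)
        {
            case STEP_A:
                work += 1;
                s = STEP_B;
                break;
            case STEP_B:
                work += 10;
                s = STEP_DONE;
                break;
            case STEP_DONE:
                break;
        }
    }
    return work;
}

/* Fall-through style: relies on the textual order of the cases. */
static int
run_with_fallthrough(Step s)
{
    int         work = 0;

    assert(s <= STEP_DONE);

    switch (s)
    {
        case STEP_A:
            work += 1;
            /* FALLTHROUGH */
        case STEP_B:
            work += 10;
            /* FALLTHROUGH */
        case STEP_DONE:
            break;
    }
    return work;
}
```

Both versions do the same work here, but the second one silently changes behavior if someone reorders the cases or inserts a new state between them, which is the kind of mistake the "break" style avoids.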

- Heikki

Re: [HACKERS] pgbench - minor fix for meta command only scripts

From

Jeff Janes

Date:

04 September 2017, 23:39:10

On Mon, Sep 26, 2016 at 1:01 AM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:

> On 09/24/2016 12:45 PM, Fabien COELHO wrote:
>> Attached are some small changes to your version:
>>
>> I have added the sleep_until fix.
>>
>> I have fixed a bug introduced in the patch by changing && by || in the
>> (min_sec > 0 && maxsock != -1) condition which was inducing errors with
>> multi-threads & clients...
>>
>> I have factored out several error messages in "commandFailed", in place of
>> the "metaCommandFailed", and added the script number as well in the error
>> messages. All messages are now specific to the failed command.
>>
>> I have added two states to the machine:
>>
>>   - CSTATE_CHOOSE_SCRIPT which simplifies threadRun, there is now one call
>>     to chooseScript instead of two before.
>>
>>   - CSTATE_END_COMMAND which manages is_latencies and proceeding to the
>>     next command, thus merging the three instances of updating the stats
>>     that were in the first version.
>>
>> The latter state means that processing query results is included in the per
>> statement latency, which is an improvement because before I was getting
>> some transaction latency significantly larger than the apparent sum of the
>> per-statement latencies, which did not make much sense...
>
> Ok. I agree that makes more sense.
>
>> I have added & updated a few comments.
>
> Thanks! Committed.
>
>> There are some places where the break could be a pass through
>> instead, not sure how desirable it is, I'm fine with break.
>
> I left them as "break". Pass-throughs are error-prone, and make it more
> difficult to read, IMHO. The compiler will optimize it into a
> pass-through anyway, if possible and worthwhile, so there should be no
> performance difference.

Since this commit (12788ae49e1933f463bc5), if I use --rate to throttle the transaction rate, it does get throttled to about the indicated speed, but pgbench consumes the entire CPU.

At the block of code starting

if (min_usec > 0 && maxsock != -1)

If maxsock == -1, then there is no sleep happening.

Cheers,

Jeff

Re: [HACKERS] pgbench - minor fix for meta command only scripts

From

Fabien COELHO

Date:

04 September 2017, 23:56:32

Hello Jeff,

>>> I have fixed a bug introduced in the patch by changing && by || in the
>>> (min_sec > 0 && maxsock != -1) condition which was inducing errors with
>>> multi-threads & clients...

> Since this commit (12788ae49e1933f463bc5), if I use --rate to throttle
> the transaction rate, it does get throttled to about the indicated speed,
> but pgbench consumes the entire CPU.
>
>
> At the block of code starting
>        if (min_usec > 0 && maxsock != -1)
>
> If maxsock == -1, then there is no sleep happening.

Argh, shame on me:-(

I cannot find the "induced errors" I was referring to in the message...
Sleeping is definitely needed to avoid a hard loop.

Patch attached fixes it and does not seem to introduce any special issue...

Should probably be backpatched.

Thanks for the debug!

-- 
Fabien.
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Attachment

pgbench-rate-bug-1.patch

Re: [HACKERS] pgbench - minor fix for meta command only scripts

From

Jeff Janes

Date:

05 September 2017, 00:21:08

On Mon, Sep 4, 2017 at 1:56 PM, Fabien COELHO <coelho@cri.ensmp.fr> wrote:

> Hello Jeff,
>
>>>> I have fixed a bug introduced in the patch by changing && by || in the
>>>> (min_sec > 0 && maxsock != -1) condition which was inducing errors with
>>>> multi-threads & clients...
>
>> Since this commit (12788ae49e1933f463bc5), if I use --rate to throttle
>> the transaction rate, it does get throttled to about the indicated speed,
>> but pgbench consumes the entire CPU.
>>
>> At the block of code starting
>>        if (min_usec > 0 && maxsock != -1)
>>
>> If maxsock == -1, then there is no sleep happening.
>
> Argh, shame on me:-(
>
> I cannot find the "induced errors" I was referring to in the message...
> Sleeping is definitely needed to avoid a hard loop.
>
> Patch attached fixes it and does not seem to introduce any special issue...
>
> Should probably be backpatched.
>
> Thanks for the debug!

Thanks Fabien, that works for me.

But if min_usec <= 0, do we want to do whatever it is that we already know is overdue, before stopping to do the select? If it is safe to go through this code path when maxsock == -1, then should we just change it to this?

if (min_usec > 0)

Cheers,

Jeff

Re: [HACKERS] pgbench - minor fix for meta command only scripts

From

Fabien COELHO

Date:

11 September 2017, 11:49:42

Hello Jeff,

Ok, the problem was a little bit more trivial than I thought.

The issue is that under a low rate there may be no transaction in 
progress, however the wait procedure was relying on select's timeout. If 
nothing is active there is nothing to wait for, thus it was an active loop 
in this case...

I've introduced a usleep call in place of select for this particular 
case. Hopefully this is portable.
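
The decision being discussed can be modelled in a few lines of C. This is a sketch under assumptions, not the attached patch: WaitKind and both function names are made up for the example, and only the condition in toy_wait_as_reported is taken from the code quoted earlier in the thread.

```c
#include <assert.h>
#include <stdint.h>

/* What the thread does before looking at its clients again. */
typedef enum
{
    WAIT_NONE,                  /* go around the loop at once */
    WAIT_SELECT,                /* select() on the sockets, with a timeout */
    WAIT_SLEEP                  /* plain sleep, there is no socket to watch */
} WaitKind;

/* Models the reported condition: (min_usec > 0 && maxsock != -1). */
static WaitKind
toy_wait_as_reported(int64_t min_usec, int maxsock)
{
    assert(maxsock >= -1);

    if (min_usec > 0 && maxsock != -1)
        return WAIT_SELECT;
    return WAIT_NONE;           /* also taken when nothing is active */
}

/* Models the idea of the fix: sleep explicitly when no socket is active. */
static WaitKind
toy_wait_with_sleep(int64_t min_usec, int maxsock)
{
    assert(maxsock >= -1);

    if (min_usec <= 0)
        return WAIT_NONE;       /* something is already due */
    if (maxsock != -1)
        return WAIT_SELECT;
    return WAIT_SLEEP;
}
```

With a throttling delay pending and no active socket (maxsock == -1), the first variant does not wait at all, which matches the busy loop Jeff observed, while the second one sleeps.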

ISTM that this bug has existed since rate was introduced, so shame on me and
back-patching is probably needed.

-- 
Fabien.

Attachment

pgbench-rate-bug-2.patch

Re: [HACKERS] pgbench - minor fix for meta command only scripts

From

Jeff Janes

Date:

11 September 2017, 22:14:38

On Mon, Sep 11, 2017 at 1:49 AM, Fabien COELHO <coelho@cri.ensmp.fr> wrote:

> Hello Jeff,
>
> Ok, the problem was a little bit more trivial than I thought.
>
> The issue is that under a low rate there may be no transaction in
> progress, however the wait procedure was relying on select's timeout. If
> nothing is active there is nothing to wait for, thus it was an active loop
> in this case...
>
> I've introduced a usleep call in place of select for this particular
> case. Hopefully this is portable.

Shouldn't we use pg_usleep to ensure portability? It is defined for front-end code. But it returns void, so the error check will have to be changed.

I didn't see the problem before the commit I originally indicated, so I don't think it has to be back-patched to before v10.

Cheers,

Jeff

Re: [HACKERS] pgbench - minor fix for meta command only scripts

From

Fabien COELHO

Date:

12 September 2017, 04:27:13

Hello Jeff,

> Shouldn't we use pg_usleep to ensure portability?  it is defined for
> front-end code.  But it returns void, so the error check will have to be
> changed.

Attached v3 with pg_usleep called instead.

> I didn't see the problem before the commit I originally indicated, so I
> don't think it has to be back-patched to before v10.

Hmmm.... you've got a point, although I'm not sure how it could work
without sleeping explicitly. Maybe the path was calling select with an
empty wait list plus timeout, and select is kind enough to just sleep on
an empty list, or some other miracle. ISTM clearer to explicitly sleep in
that case.

-- 
Fabien.

Attachment

pgbench-rate-bug-3.patch

Re: [HACKERS] pgbench - minor fix for meta command only scripts

From

Noah Misch

Date:

28 September 2017, 08:23:26

On Tue, Sep 12, 2017 at 03:27:13AM +0200, Fabien COELHO wrote:
> >Shouldn't we use pg_usleep to ensure portability?  it is defined for
> >front-end code.  But it returns void, so the error check will have to be
> >changed.
> 
> Attached v3 with pg_usleep called instead.
> 
> >I didn't see the problem before the commit I originally indicated, so I
> >don't think it has to be back-patched to before v10.
> 
> Hmmm.... you've got a point, although I'm not sure how it could work without
> sleeping explicitly. Maybe the path was calling select with an empty wait
> list plus timeout, and select is kind enough to just sleep on an empty list,
> or some other miracle. ISTM clearer to explicitly sleep in that case.

[Action required within three days.  This is a generic notification.]

The above-described topic is currently a PostgreSQL 10 open item.  Heikki,
since you committed the patch believed to have created it, you own this open
item.  If some other commit is more relevant or if this does not belong as a
v10 open item, please let us know.  Otherwise, please observe the policy on
open item ownership[1] and send a status update within three calendar days of
this message.  Include a date for your subsequent status update.  Testers may
discover new open items at any time, and I want to plan to get them all fixed
well in advance of shipping v10.  Consequently, I will appreciate your efforts
toward speedy resolution.  Thanks.

[1] https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com


Re: [HACKERS] pgbench - minor fix for meta command only scripts

From

Jeff Janes

Date:

29 September 2017, 19:39:43

On Mon, Sep 11, 2017 at 6:27 PM, Fabien COELHO <coelho@cri.ensmp.fr> wrote:

> Hello Jeff,
>
>> Shouldn't we use pg_usleep to ensure portability? it is defined for
>> front-end code. But it returns void, so the error check will have to be
>> changed.
>
> Attached v3 with pg_usleep called instead.
>
>> I didn't see the problem before the commit I originally indicated, so I
>> don't think it has to be back-patched to before v10.
>
> Hmmm.... you've got a point, although I'm not sure how it could work
> without sleeping explicitly. Maybe the path was calling select with an
> empty wait list plus timeout, and select is kind enough to just sleep on
> an empty list, or some other miracle.

Not really a miracle, calling select with an empty list of file handles is a standard way to sleep on Unix-like platforms. (Indeed, that is how pg_usleep is implemented on non-Windows platforms, see "src/port/pgsleep.c"). The problem is that it is reportedly not portable to Windows. But I tested pgbench.exe for 9.6.5-1 from EDB installer, and I don't see excessive CPU usage for a throttled run, and it throttles to about the correct speed. So maybe the non-portability is more rumor than reality. So I don't know if this needs backpatching or not. But it should be fixed for v10, as there it becomes a demonstrably live issue.
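
For reference, the idiom mentioned above looks roughly like this on a POSIX system. It is a minimal sketch, not a copy of src/port/pgsleep.c; toy_select_sleep and toy_now_usec are names invented for the example, and the early return that select can make when a signal arrives is simply ignored here.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/select.h>
#include <sys/time.h>
#include <time.h>

/* Sleep for about 'usec' microseconds using select() with no descriptors. */
static void
toy_select_sleep(long usec)
{
    struct timeval delay;

    assert(usec >= 0);

    delay.tv_sec = usec / 1000000L;
    delay.tv_usec = usec % 1000000L;

    /* No read, write or exception sets: only the timeout is in effect. */
    (void) select(0, NULL, NULL, NULL, &delay);
}

/* Monotonic clock in microseconds, used only to observe the sleep. */
static int64_t
toy_now_usec(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t) ts.tv_sec * 1000000 + ts.tv_nsec / 1000;
}
```

Measuring toy_now_usec() before and after toy_select_sleep(20000) should show roughly 20 ms elapsed on Unix-like platforms. Whether the same call sleeps on Windows is exactly the open question in this message.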

> ISTM clearer to explicitly sleep in that case.

Yes.

Cheers,

Jeff

Re: [HACKERS] pgbench - minor fix for meta command only scripts

From

Fabien COELHO

Date:

29 September 2017, 20:43:48


> reality.  So I don't know if this needs backpatching or not.  But it 
> should be fixed for v10, as there it becomes a demonstrably live issue.

Yes.

-- 
Fabien.



Re: [HACKERS] pgbench - minor fix for meta command only scripts

From

Robert Haas

Date:

30 September 2017, 00:31:17

On Mon, Sep 11, 2017 at 4:49 AM, Fabien COELHO <coelho@cri.ensmp.fr> wrote:
> Ok, the problem was a little bit more trivial than I thought.
>
> The issue is that under a low rate there may be no transaction in progress,
> however the wait procedure was relying on select's timeout. If nothing is
> active there is nothing to wait for, thus it was an active loop in this
> case...
>
> I've introduced a usleep call in place of select for this particular case.
> Hopefully this is portable.
>
> ISTM that this bug exists since rate was introduced, so shame on me and
> back-patching should be needed.

I took a look at this and found that the proposed patch applies
cleanly all the way back to 9.5, but the regression is reported to
have begun with a commit that starts in v10.  I haven't probed into
this in any depth, but are we sure that
12788ae49e1933f463bc59a6efe46c4a01701b76 is in fact where this problem
originated?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] pgbench - minor fix for meta command only scripts

From

Fabien COELHO

Date:

30 September 2017, 01:21:49

Hello Robert,

>> ISTM that this bug exists since rate was introduced, so shame on me and
>> back-patching should be needed.
>
> I took a look at this and found that the proposed patch applies
> cleanly all the way back to 9.5, but the regression is reported to
> have begun with a commit that starts in v10.  I haven't probed into
> this in any depth, but are we sure that
> 12788ae49e1933f463bc59a6efe46c4a01701b76 is in fact where this problem
> originated?

Yes.

I just rechecked that the problem occurs at 12788ae but not at the 
preceding da6c4f6ca8.

Now the situation before the restructuring is that it worked, but given the
spaghetti code it was very hard to guess why, let alone to fix issues when
it did not...

My late at night fuzzy interpretation is as follows:

The issue is in the code above the fix I submitted, which checks what has
to be selected. In the previous version ISTM that the condition was lax,
so it filled the input_mask even if the client was not waiting for
anything, so it was calling select later, which was really just
implementing the timeout. With the updated version the input mask and
maxsock are only set if there is really something to wait for, and if not
then it falls through and busy-loops instead of doing a simple
sleep/timeout.

So I would say that the previous version worked because of a side effect 
which may or may not have been intentional at the time, and was revealed 
by checking the condition better.

Basically I'd say that the restructuring patch fixed a defect which 
triggered the bug. Programming is fun:-)

-- 
Fabien.



Re: [HACKERS] pgbench - minor fix for meta command only scripts

From

Heikki Linnakangas

Date:

01 October 2017, 09:38:34

On 09/29/2017 08:43 PM, Fabien COELHO wrote:
>> reality.  So I don't know if this needs backpatching or not.  But it
>> should be fixed for v10, as there it becomes a demonstrably live issue.
> 
> Yes.

Patch looks good to me, so committed to master and v10. Thanks!

- Heikki

