Hi Fabien,
On Fri, Mar 15, 2019 at 4:17 PM, Fabien COELHO wrote:
> >> echo 'select 1' > select.sql
> >>
> >> while /bin/true; do
> >> pgbench -n -f select.sql -R 1000 -j 8 -c 8 -T 1 > /dev/null 2>&1;
> >> date;
> >> done;
> >
> > Indeed. I'll look at it over the weekend.
> >
> >> So I guess this is a bug in 12788ae49e1933f463bc59a6efe46c4a01701b76, or
> >> one of the other commits touching this part of the code.
>
> I could not reproduce this issue on head, but I confirm on 11.2.
I could reproduce the hang on 11.4.
On Sat, Mar 16, 2019 at 10:14 AM, Fabien COELHO wrote:
> Attached is a fix to apply on pg11.
I confirm the hang no longer happens after applying your patch.
It also passes make check-world.
The change does not look like it could affect performance, so I did not run
any performance tests.
> + /* under throttling we may have finished the last client above */
> + if (remains == 0)
> + break;
If the only remaining clients are in CSTATE_WAIT_RESULT, CSTATE_SLEEP or
CSTATE_THROTTLE, the thread has to either wait for results or sleep. In that
logic there is a case where a thread tries to wait for results even though no
client is waiting for any, and that causes the hang. It happens when only
CSTATE_THROTTLE clients are left and the pgbench timeout occurs: those clients
are finished and "remains" becomes 0.
I confirmed that the code above prevents that case.
I think this is almost ready for committer, but I have one question.
Would it be better to add a check such as if (maxsock != -1) before the select?
    else /* no explicit delay, select without timeout */
    {
        nsocks = select(maxsock + 1, &input_mask, NULL, NULL, NULL);
    }
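To make the question more concrete, here is a rough standalone sketch of the
kind of guard I have in mind. It is not pgbench code: DemoState, DemoClient,
collect_waiting_sockets() and can_block_in_select() are names I made up for
illustration, and the states are a simplified stand-in for the real CSTATE_*
values. It only shows that maxsock stays -1 when no client is waiting for a
result, which is the situation where a select() without timeout would never
return.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>

/* Hypothetical, simplified client states; not pgbench's real enum. */
typedef enum
{
    DEMO_WAIT_RESULT,   /* query sent, waiting for the server's answer */
    DEMO_THROTTLE,      /* delayed by rate limiting, nothing in flight */
    DEMO_FINISHED       /* client is done */
} DemoState;

typedef struct
{
    DemoState   state;
    int         sock;   /* socket fd, only meaningful in DEMO_WAIT_RESULT */
} DemoClient;

/*
 * Put the sockets of all clients waiting for a result into input_mask.
 * Returns the highest such fd, or -1 if no client is waiting.
 */
static int
collect_waiting_sockets(const DemoClient *clients, size_t nclients,
                        fd_set *input_mask)
{
    int         maxsock = -1;

    FD_ZERO(input_mask);
    for (size_t i = 0; i < nclients; i++)
    {
        if (clients[i].state != DEMO_WAIT_RESULT)
            continue;

        /* FD_SET is only defined for fds in [0, FD_SETSIZE) */
        assert(clients[i].sock >= 0 && clients[i].sock < FD_SETSIZE);

        FD_SET(clients[i].sock, input_mask);
        if (clients[i].sock > maxsock)
            maxsock = clients[i].sock;
    }
    return maxsock;
}

/*
 * The proposed guard: a select() without timeout is only reasonable when
 * at least one socket is in the set, otherwise it blocks forever.
 */
static int
can_block_in_select(int maxsock)
{
    return maxsock != -1;
}
```

With such a guard the thread would skip the select() when the set is empty.
I am not sure it is needed once the "remains == 0" check is in place, so
please consider it only as an extra safety net.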
--
Yoshikazu Imai