Thread: shm_mq fix for non-blocking mode

shm_mq fix for non-blocking mode

From: Robert Haas
Date: Fri, Oct 16, 2015 at 5:08 PM
The shm_mq code handles blocking mode and non-blocking mode
asymmetrically in a couple of places, with the unfortunate result that
if you are using non-blocking mode, and your counterparty dies before
attaching the queue, operations on the queue continue to return
SHM_MQ_WOULD_BLOCK instead of, as they should, returning
SHM_MQ_DETACHED.  The attached patch fixes the problem.  Thanks to my
colleague Rushabh Lathia for helping track this down.
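
[Editor's illustration, not taken from the patch.] The failure mode described above can be modelled with a toy queue. This is only a sketch under stated assumptions: the toy_* names, the struct fields, and the polling helper are all hypothetical, and the real shm_mq implementation tracks counterparty state quite differently. The model only shows why a non-blocking caller that keeps seeing a "would block" result can never notice that its counterparty died before attaching.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins that mirror the three shm_mq result codes.
 * This is a toy model, not PostgreSQL code. */
typedef enum
{
    TOY_MQ_SUCCESS,
    TOY_MQ_WOULD_BLOCK,
    TOY_MQ_DETACHED
} toy_mq_result;

/* Assumed, simplified view of the queue from one side. */
typedef struct
{
    bool counterparty_attached; /* has the other side attached yet? */
    bool counterparty_alive;    /* is the process expected to attach still running? */
    bool data_available;        /* is there a message ready to read? */
} toy_queue;

/* Models the reported behaviour: while nobody has attached, non-blocking
 * mode reports "would block" without checking whether the counterparty
 * is still alive. */
static toy_mq_result
toy_receive_nowait_buggy(const toy_queue *q)
{
    assert(q != NULL);
    if (!q->counterparty_attached)
        return TOY_MQ_WOULD_BLOCK;
    return q->data_available ? TOY_MQ_SUCCESS : TOY_MQ_WOULD_BLOCK;
}

/* Models the intended behaviour: a counterparty that died before
 * attaching is reported as detached. */
static toy_mq_result
toy_receive_nowait_fixed(const toy_queue *q)
{
    assert(q != NULL);
    if (!q->counterparty_attached)
        return q->counterparty_alive ? TOY_MQ_WOULD_BLOCK : TOY_MQ_DETACHED;
    return q->data_available ? TOY_MQ_SUCCESS : TOY_MQ_WOULD_BLOCK;
}

typedef toy_mq_result (*toy_receive_fn) (const toy_queue *);

/* Polls until the result is something other than "would block", giving up
 * after max_polls attempts. Returns true if the loop ended on its own. */
static bool
toy_poll_terminates(toy_receive_fn receive, const toy_queue *q, int max_polls)
{
    for (int i = 0; i < max_polls; i++)
    {
        if (receive(q) != TOY_MQ_WOULD_BLOCK)
            return true;
    }
    return false;
}
```

With a queue whose counterparty is neither attached nor alive, polling through the first function never terminates by itself, while the second reports the detached state on the first call. A caller that waits only for a non-"would block" result depends entirely on that distinction.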

(There are some further bugs in this area outside the shm_mq code
... but I'm still trying to figure out exactly what they are and what
we should do about them.  This much, however, seems clear-cut.)

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: shm_mq fix for non-blocking mode

From: Robert Haas
Date: Thu, Oct 22, 2015 at 4:45 PM
On Fri, Oct 16, 2015 at 5:08 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> The shm_mq code handles blocking mode and non-blocking mode
> asymmetrically in a couple of places, with the unfortunate result that
> if you are using non-blocking mode, and your counterparty dies before
> attaching the queue, operations on the queue continue to return
> SHM_MQ_WOULD_BLOCK instead of, as they should, returning
> SHM_MQ_DETACHED.  The attached patch fixes the problem.  Thanks to my
> colleague Rushabh Lathia for helping track this down.
>
> (There are some further bugs in this area outside the shm_mq code
> ... but I'm still trying to figure out exactly what they are and what
> we should do about them.  This much, however, seems clear-cut.)

...and so I've committed it and back-patched to 9.4.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: shm_mq fix for non-blocking mode

From: Robert Haas
Date: Thu, Oct 22, 2015 at 10:00 PM
On Thu, Oct 22, 2015 at 4:45 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Fri, Oct 16, 2015 at 5:08 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> The shm_mq code handles blocking mode and non-blocking mode
>> asymmetrically in a couple of places, with the unfortunate result that
>> if you are using non-blocking mode, and your counterparty dies before
>> attaching the queue, operations on the queue continue to return
>> SHM_MQ_WOULD_BLOCK instead of, as they should, returning
>> SHM_MQ_DETACHED.  The attached patch fixes the problem.  Thanks to my
>> colleague Rushabh Lathia for helping track this down.
>>
>> (There are some further bugs in this area outside the shm_mq code
>> ... but I'm still trying to figure out exactly what they are and what
>> we should do about them.  This much, however, seems clear-cut.)
>
> ...and so I've committed it and back-patched to 9.4.

Sigh.  This was buggy; I have no idea how it survived my earlier testing.

I will go fix it.  Sorry.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: shm_mq fix for non-blocking mode

From: Robert Haas
On Thu, Oct 22, 2015 at 10:00 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> ...and so I've committed it and back-patched to 9.4.
>
> Sigh.  This was buggy; I have no idea how it survived my earlier testing.
>
> I will go fix it.  Sorry.

Gah!  That, too, turned out to be buggy, although in a considerably
more subtle way.  I've pushed another fix with a detailed comment and
an explanatory commit message that hopefully squashes this problem for
good.  Combined with the fix at
http://www.postgresql.org/message-id/CA+TgmoZzv3u9trsvcAO+-OtXbsz_u+A5Q8X-_B+VZceHhtzTmA@mail.gmail.com
this seems to squash occasional complaints about workers "dying
unexpectedly" when they really had done no such thing.

The test code I used to find these problems is attached.  I compiled
and installed the parallel_dummy extension, did pgbench -i -s 100, and
then ran this:

while psql -c "select parallel_count('pgbench_accounts', 4)"; do sleep 1; done

Without these fixes, this can hang or error out, but with these fixes,
it works fine.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
