Re: shm_mq inconsistent behavior of SHM_MQ_DETACHED - Mailing list pgsql-hackers

From Robert Haas
Subject Re: shm_mq inconsistent behavior of SHM_MQ_DETACHED
Date
Msg-id CA+TgmoY3SF4d3tPa5wix1O4X-9viJ+fGz0ZpJX1=jYCW_2TQOg@mail.gmail.com
In response to shm_mq inconsistent behavior of SHM_MQ_DETACHED  (Petr Jelinek <petr@2ndquadrant.com>)
Responses Re: shm_mq inconsistent behavior of SHM_MQ_DETACHED  (Petr Jelinek <petr@2ndquadrant.com>)
List pgsql-hackers
On Tue, Apr 22, 2014 at 9:55 AM, Petr Jelinek <petr@2ndquadrant.com> wrote:
> I was playing with shm_mq and found a little odd behavior with detaching
> after sending messages.
>
> Following sequence behaves as expected (receiver gets 2 messages):
> P1 -> set_sender
> P1 -> attach
> P2 -> set_receiver
> P2 -> attach
> P1 -> send
> P2 -> receive
> P1 -> send
> P1 -> detach
> P2 -> receive
> P2 -> detach
>
> But if I do first receive after detach like in this sequence:
> P1 -> set_sender
> P1 -> attach
> P2 -> set_receiver
> P2 -> attach
> P1 -> send
> P1 -> send
> P1 -> detach
> P2 -> receive
>
> I get SHM_MQ_DETACHED on the receiver even though there are messages in the
> ring buffer.

That's a bug.

> The reason for this behavior is that mqh_counterparty_attached is only set
> by shm_mq_receive. This does not seem to be consistent - I would either
> expect to get SHM_MQ_DETACHED always when other party has detached or always
> get all remaining messages that are in queue (and I would strongly prefer
> the latter).
>
> Maybe the shm_mq_get_bytes_written should be used to determine if there is
> something left for us to read in the receiver if we hit the
> !mqh_counterparty_attached code path with detached sender?

That's probably not a good idea, because there could be just a partial
message left in the buffer, if the sender died midway through writing
it.  I suspect that attacking the problem that way will lead to a
bunch of nasty edge cases.
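
To make the partial-message concern concrete, here is a toy sketch, not the
real shm_mq layout. It assumes each message is framed as a length word
followed by the payload; ToyQueue and the toy_* functions are hypothetical
names invented for this illustration. A bytes-written counter ahead of the
bytes-read counter only says that some bytes are unread, not that a whole
message is there.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical toy queue: a flat buffer with no wraparound handling. */
typedef struct ToyQueue
{
    uint64_t      bytes_written;
    uint64_t      bytes_read;
    unsigned char buf[64];
} ToyQueue;

/* Append raw bytes, as a sender would while writing a message. */
static void
toy_write(ToyQueue *q, const void *data, size_t len)
{
    memcpy(q->buf + q->bytes_written, data, len);
    q->bytes_written += len;
}

/* The tempting check: is anything at all left unread? */
static int
toy_has_unread_bytes(const ToyQueue *q)
{
    return q->bytes_written > q->bytes_read;
}

/* The check a receiver actually needs: is a whole message present? */
static int
toy_has_complete_message(const ToyQueue *q)
{
    uint64_t avail = q->bytes_written - q->bytes_read;
    size_t   len;

    if (avail < sizeof(len))
        return 0;               /* not even a full length word yet */
    memcpy(&len, q->buf + q->bytes_read, sizeof(len));
    return avail - sizeof(len) >= len;
}
```

If a sender writes a length word announcing 8 bytes and then dies after
writing only 3 of them, toy_has_unread_bytes() reports true while
toy_has_complete_message() reports false.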

I'm thinking that the problem really revolves around
shm_mq_wait_internal().  It returns true if the queue is attached but
not detached, and false if either the detach has already happened, or
if we establish via the background worker handle that it will never
come.  But in the case of receiving, we want to treat
attached-then-detached as a success case, not a failure case.
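
As a rough model of that distinction (a sketch only, not the shm_mq code
and not the patch): ToyMq, ToyHandle and the toy_* functions are
hypothetical, the queue never blocks, and "detached" here means the sender
has either detached or will never attach. The "lenient" flag stands for
treating attached-then-detached as success on the receive side.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { TOY_SUCCESS, TOY_WOULD_BLOCK, TOY_DETACHED } ToyResult;

typedef struct ToyMq
{
    bool sender_attached;   /* sender attached at some point */
    bool detached;          /* sender detached, or will never attach */
    int  pending;           /* complete, unread messages */
} ToyMq;

typedef struct ToyHandle
{
    ToyMq *mq;
    bool   counterparty_attached;   /* cached once the check succeeds */
} ToyHandle;

/* Models the wait: true only if attached and not (yet) detached. */
static bool
toy_wait(const ToyMq *mq)
{
    return mq->sender_attached && !mq->detached;
}

static ToyResult
toy_receive(ToyHandle *h, bool lenient)
{
    if (!h->counterparty_attached)
    {
        bool ok = toy_wait(h->mq);

        /* Lenient: a sender that attached and then left still counts. */
        if (!ok && lenient && h->mq->sender_attached)
            ok = true;
        if (!ok)
            return h->mq->detached ? TOY_DETACHED : TOY_WOULD_BLOCK;
        h->counterparty_attached = true;
    }

    /* Drain what is queued; report detach only once the queue is empty. */
    if (h->mq->pending > 0)
    {
        h->mq->pending--;
        return TOY_SUCCESS;
    }
    return h->mq->detached ? TOY_DETACHED : TOY_WOULD_BLOCK;
}
```

With two messages queued and the sender already gone, the strict variant
reports TOY_DETACHED on the first receive, which matches the second
sequence above. The lenient variant returns both messages and only then
TOY_DETACHED. A sender that never attached yields TOY_DETACHED either way.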

Can you see if the attached patch fixes it?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
