On Thu, Feb 8, 2018 at 7:06 AM, David Kohn <djk447@gmail.com> wrote:
> I'm not clear on when we do a SetLatch on those message queues during a
> cancel of parallel workers, and a number of other things that could
> definitely invalidate this analysis, but I think there could be a plausible
> explanation in there somewhere.
shm_mq_detach_internal() does SetLatch(&victim->procLatch) ("victim"
being the counterparty process) after setting mq_detached. So ideally
no one should ever be able to wait forever on a queue from which the
other end has detached, but perhaps there is some race-condition-style
bug lurking in here. I'm going to do some testing and see if I can
break this...
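
To make the ordering argument concrete, here is a toy single-threaded
model of it. This is only a sketch, not the code in shm_mq.c: every
toy_* name, receiver_sees_detach(), and the wake_peer switch are made up
for illustration, and the receiver loop (check the flag, wait on the
latch, reset the latch) is an assumed simplification. It replays three
points at which the sender could detach and reports whether the
receiver would notice or sleep forever.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins; none of these are PostgreSQL's real types or functions. */
typedef struct { bool is_set; } toy_latch;
typedef struct { bool detached; toy_latch *peer_latch; } toy_mq;

/* Sender side: publish the detached flag first, then (optionally) wake the peer. */
void toy_mq_detach(toy_mq *mq, bool wake_peer)
{
    mq->detached = true;
    if (wake_peer)
        mq->peer_latch->is_set = true;
}

/*
 * Replay one interleaving of the receiver loop against a sender that
 * detaches at detach_point:
 *   0 = before the receiver checks the flag
 *   1 = after the check, but before the receiver starts waiting
 *   2 = while the receiver is asleep in the wait
 * Returns true if the receiver notices the detach, false if it would
 * sleep forever.
 */
bool receiver_sees_detach(int detach_point, bool wake_peer)
{
    toy_latch latch = { false };
    toy_mq mq = { false, &latch };
    bool sender_done = false;

    assert(detach_point >= 0 && detach_point <= 2);

    for (;;)
    {
        if (detach_point == 0 && !sender_done)
        {
            toy_mq_detach(&mq, wake_peer);
            sender_done = true;
        }

        /* Receiver step 1: check the condition it is waiting for. */
        if (mq.detached)
            return true;

        if (detach_point == 1 && !sender_done)
        {
            toy_mq_detach(&mq, wake_peer);
            sender_done = true;
        }

        /* Receiver step 2: wait. If the latch is unset the receiver
         * sleeps, and a sender that has not run yet runs now. */
        if (!latch.is_set && !sender_done)
        {
            toy_mq_detach(&mq, wake_peer);
            sender_done = true;
        }
        if (!latch.is_set)
            return false;   /* asleep, and nobody is left to wake it */

        /* Receiver step 3: reset the latch and go back to recheck. */
        latch.is_set = false;
    }
}
```

In this model, with wake_peer = true the receiver notices the detach at
all three points. With wake_peer = false it gets stuck at points 1 and
2, which is the kind of lost wakeup I would be looking for if some real
code path managed to skip or reorder the SetLatch.
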
> Thanks for the help on this, I hope this is helpful and do let me know if a
> stacktrace or anything else would be helpful on my end.
Yeah, stack traces would be great, if you can get them.
--
Thomas Munro
http://www.enterprisedb.com