On Thu, Feb 8, 2018 at 7:06 AM, David Kohn <djk447@gmail.com> wrote:
> I'm not clear on when we do a SetLatch on those message queues during a
> cancel of parallel workers, and a number of other things that could
> definitely invalidate this analysis, but I think there could be a plausible
> explanation in there somewhere.

shm_mq_detach_internal() does SetLatch(&victim->procLatch) ("victim" being the counterparty process) after setting mq_detached. So ideally no one should ever be able to wait forever on a queue from which the other end has detached, but perhaps there is a race-condition-style bug lurking in here. I'm going to do some testing and see if I can break this...
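
To make the ordering argument concrete, here is a minimal sketch in plain C with pthreads. It is not the PostgreSQL code: DemoLatch, DemoQueue and all of the demo_* functions are hypothetical stand-ins I made up for illustration, and a mutex plus condition variable only approximates what a real latch does. The point is just the protocol: the detaching side sets the flag first and wakes the peer second, and the waiting side rechecks the flag after every wakeup, so a wakeup that races with the check should not be lost.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a process latch: a sticky flag plus a condvar. */
typedef struct DemoLatch
{
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    bool            is_set;
} DemoLatch;

/* Hypothetical queue header: only the detached flag and the peer's latch. */
typedef struct DemoQueue
{
    pthread_mutex_t lock;
    bool            detached;
    DemoLatch      *receiver_latch;
} DemoQueue;

/* Mark the latch as set and wake a waiter, if there is one. */
static void demo_set_latch(DemoLatch *l)
{
    pthread_mutex_lock(&l->lock);
    l->is_set = true;
    pthread_cond_signal(&l->cond);
    pthread_mutex_unlock(&l->lock);
}

/* Block until the latch is set; returns at once if it already is. */
static void demo_wait_latch(DemoLatch *l)
{
    pthread_mutex_lock(&l->lock);
    while (!l->is_set)
        pthread_cond_wait(&l->cond, &l->lock);
    pthread_mutex_unlock(&l->lock);
}

/* Clear the latch so that the next wait blocks again. */
static void demo_reset_latch(DemoLatch *l)
{
    pthread_mutex_lock(&l->lock);
    l->is_set = false;
    pthread_mutex_unlock(&l->lock);
}

static bool demo_is_detached(DemoQueue *q)
{
    bool result;

    pthread_mutex_lock(&q->lock);
    result = q->detached;
    pthread_mutex_unlock(&q->lock);
    return result;
}

/* Detach: publish the flag FIRST, then wake the counterparty. */
static void demo_detach(DemoQueue *q)
{
    pthread_mutex_lock(&q->lock);
    q->detached = true;
    pthread_mutex_unlock(&q->lock);
    demo_set_latch(q->receiver_latch);
}

/* Receiver: recheck the flag after every wakeup before sleeping again. */
static void *demo_receiver(void *arg)
{
    DemoQueue *q = arg;

    while (!demo_is_detached(q))
    {
        demo_wait_latch(q->receiver_latch);
        demo_reset_latch(q->receiver_latch);
    }
    return NULL;
}

/* Start a receiver, detach, and join it; returns 1 if it saw the detach. */
int demo_run(void)
{
    DemoLatch latch;
    DemoQueue q;
    pthread_t receiver;
    int       rc;

    pthread_mutex_init(&latch.lock, NULL);
    pthread_cond_init(&latch.cond, NULL);
    latch.is_set = false;
    pthread_mutex_init(&q.lock, NULL);
    q.detached = false;
    q.receiver_latch = &latch;

    rc = pthread_create(&receiver, NULL, demo_receiver, &q);
    assert(rc == 0);
    demo_detach(&q);
    rc = pthread_join(receiver, NULL);  /* would hang if the wakeup were lost */
    assert(rc == 0);
    (void) rc;
    return demo_is_detached(&q) ? 1 : 0;
}
```

Calling demo_run() from a small main() (built with -pthread) should return rather than hang. If demo_detach() instead set the latch before the flag, the receiver could wake, see the flag still clear, reset the latch and go back to sleep with nobody left to wake it, which is the kind of ordering bug I want to rule out in the real code.
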

> Thanks for the help on this, I hope this is helpful and do let me know if a
> stacktrace or anything else would be helpful on my end.