Re: Backends stunk in wait event IPC/MessageQueueInternal - Mailing list pgsql-hackers

From Thomas Munro
Subject Re: Backends stunk in wait event IPC/MessageQueueInternal
Msg-id CA+hUKGKGqjM1H8T7fNqmKUgmifDDyPEHRT7FdpxFLVOMyOKa0g@mail.gmail.com
In response to Re: Backends stunk in wait event IPC/MessageQueueInternal  (Japin Li <japinli@hotmail.com>)
Responses Re: Backends stunk in wait event IPC/MessageQueueInternal
List pgsql-hackers
On Mon, May 16, 2022 at 3:45 PM Japin Li <japinli@hotmail.com> wrote:
> Maybe use the __illumos__ macro, to be more accurate.
>
> +#elif defined(WAIT_USE_EPOLL) && defined(HAVE_SYS_SIGNALFD_H) && \
> +       !defined(__sun__)

Thanks, updated, and with a new commit message.

I don't know much about these OSes (though I used lots of Sun machines
during the Jurassic period).  I know that there are three
distributions of illumos: OmniOS, SmartOS and OpenIndiana, and they
share the same kernel and base system.  The off-list reports I
received about hangs and kernel panics were from the OpenIndiana
animals hake and haddock, which are not currently reporting (I'll ask
why).  Their owner defined -DWAIT_USE_POLL to clear that up while we
waited for progress on his kernel panic bug report.  I see that the
OmniOS animal pollock is currently reporting and also uses
-DWAIT_USE_POLL, but I couldn't find any discussion about that.

Of course, you might be hitting some completely different problem,
given the lack of information.  I'd be interested in the output of "p
*MyLatch" (to see whether the latch has already been set), and in
whether "kill -URG PID" dislodges the stuck process.  But given the
open kernel bug report that I've now been reminded of, I'm thinking
about pushing this anyway.  Then we could ask the animal owners to
remove -DWAIT_USE_POLL, so that they'd effectively be running with
-DWAIT_USE_EPOLL and -DWAIT_USE_SELF_PIPE, which would be more like
PostgreSQL 13.  People who want to reproduce the problem on the
illumos side could still build with -DWAIT_USE_SIGNALFD.
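
To make the intended defaults concrete, here is a rough sketch of the
selection logic as a plain C function.  It is only a model of the
compile-time #if chain, not the code in the patch: BuildConfig,
WakeupMethod and choose_wakeup() are hypothetical names invented for
illustration, and the rule that an explicit -DWAIT_USE_SELF_PIPE or
-DWAIT_USE_SIGNALFD overrides the default is an assumption drawn from
the paragraph above.

```c
#include <stdbool.h>

/* Hypothetical names for illustration; not PostgreSQL's actual API. */
typedef enum
{
    WAKEUP_SELF_PIPE,
    WAKEUP_SIGNALFD
} WakeupMethod;

typedef struct
{
    bool use_epoll;        /* models WAIT_USE_EPOLL being defined */
    bool have_signalfd_h;  /* models HAVE_SYS_SIGNALFD_H */
    bool is_illumos;       /* models __illumos__ */
    bool force_self_pipe;  /* models -DWAIT_USE_SELF_PIPE */
    bool force_signalfd;   /* models -DWAIT_USE_SIGNALFD */
} BuildConfig;

static WakeupMethod
choose_wakeup(BuildConfig c)
{
    /* Assumption: a manual choice on the command line is not overwritten. */
    if (c.force_self_pipe)
        return WAKEUP_SELF_PIPE;
    if (c.force_signalfd)
        return WAKEUP_SIGNALFD;

    /* Default to signalfd only with epoll, the header, and not on illumos. */
    if (c.use_epoll && c.have_signalfd_h && !c.is_illumos)
        return WAKEUP_SIGNALFD;

    /* Everything else falls back to the self-pipe. */
    return WAKEUP_SELF_PIPE;
}
```

Under this model, an illumos build with epoll and <sys/signalfd.h>
available ends up on the self-pipe by default, and setting
force_signalfd selects signalfd again for anyone trying to reproduce
the problem.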
