Re: Missed condition-variable wakeups on FreeBSD - Mailing list pgsql-hackers

From Justin Pryzby
Subject Re: Missed condition-variable wakeups on FreeBSD
Date
Msg-id 20220226210625.GK9008@telsasoft.com
In response to Missed condition-variable wakeups on FreeBSD  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Sat, Feb 26, 2022 at 02:07:05PM -0500, Tom Lane wrote:
> I don't know much about how gdb interacts with kernel calls on
> FreeBSD, but I speculate that the poll(2) call returns with EINTR
> after gdb releases the process, and then things resume fine,
> suggesting that we lost an interrupt somewhere.

I've seen similar interactions with strace under Linux, where attaching it
causes a process to be woken up or otherwise behave differently (not
necessarily postgres).

> Thoughts?  Ideas on debugging this?

Before attaching a debugger, figure out what syscall each process is in.
On Linux, that's: ps O wchan PID
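
For several backends at once, the same information can be read from /proc
without attaching anything to the process. This is only a rough sketch, and it
assumes Linux: /proc/PID/wchan and /proc/PID/syscall do not exist in this form
on FreeBSD. The helper name wait_state is made up for illustration.

```python
import os
from pathlib import Path


def wait_state(pid):
    """Report where a Linux process is waiting, using /proc only.

    Hypothetical helper: returns the kernel wait channel and the first
    field of /proc/PID/syscall (a syscall number, "running", or "-1").
    Any file that cannot be read (no such process, no permission,
    non-Linux system) is reported as None.
    """
    proc = Path("/proc") / str(pid)

    def read(name):
        try:
            return (proc / name).read_text().strip()
        except OSError:
            return None

    wchan = read("wchan")
    syscall = read("syscall")
    # Keep only the first field; the rest are raw argument registers.
    syscall_field = syscall.split()[0] if syscall else None
    return {"pid": pid, "wchan": wchan, "syscall": syscall_field}


if __name__ == "__main__":
    # Replace os.getpid() with the PIDs of the stuck backends.
    print(wait_state(os.getpid()))
```

Reading these files does not stop or signal the target, so it should not
perturb the stuck state the way attaching strace or gdb might.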

>> Besides trying to make the issue more likely as suggested above, it might be
>> worth checking if signalling the stuck processes with SIGUSR1 gets them
>> unstuck.

And SIGCONT.
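
A small sketch of sending both signals to one stuck process, so that the
result of each attempt is recorded. It assumes a Unix-like system and enough
privilege to signal the target; the function name nudge is hypothetical.

```python
import os
import signal


def nudge(pid, signals=(signal.SIGUSR1, signal.SIGCONT)):
    """Send each signal to pid and record what happened.

    Hypothetical helper for the experiment above: it only delivers the
    signals, it does not check whether the process actually got unstuck.
    """
    results = {}
    for sig in signals:
        try:
            os.kill(pid, sig)
            results[sig.name] = "sent"
        except ProcessLookupError:
            results[sig.name] = "no such process"
        except PermissionError:
            results[sig.name] = "not permitted"
    return results
```

Sending one signal at a time, and checking the process state in between, makes
it easier to tell which signal (if any) made the difference.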

You may have already done this, but you can dump core files of the running
processes to allow later inspection.
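
One way to do that, sketched under the assumption that the gcore script
shipped with gdb is installed (its "-o prefix" option writes prefix.PID;
FreeBSD's native gcore takes different options). The function names are made
up for illustration. Since gcore attaches to the process, it is probably best
done after the wchan check above.

```python
import shutil
import subprocess


def gcore_command(pid, prefix="core"):
    """Build the argv for gdb's gcore; the output file is prefix.PID."""
    return ["gcore", "-o", prefix, str(pid)]


def dump_core(pid, prefix="core"):
    """Dump a core file of a running process and return its expected path.

    Assumes gdb's gcore is on PATH; raises if it is missing or fails.
    """
    if shutil.which("gcore") is None:
        raise FileNotFoundError("gcore not found on PATH")
    subprocess.run(gcore_command(pid, prefix), check=True)
    return f"{prefix}.{pid}"
```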

-- 
Justin


