Re: recovery is stuck when children are not processing SIGQUIT from previous crash - Mailing list pgsql-admin

From Peter Eisentraut
Subject Re: recovery is stuck when children are not processing SIGQUIT from previous crash
Msg-id 1253893585.26523.15.camel@fsopti579.F-Secure.com
In response to Re: recovery is stuck when children are not processing SIGQUIT from previous crash  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-admin
On Wed, 2009-09-23 at 10:04 -0400, Tom Lane wrote:
> I'd prefer not to go there, at least not without a demonstration that
> this will solve a bug that's unsolvable otherwise.  If a child is
> really stuck in a state that doesn't accept SIGQUIT, it probably
> won't accept SIGKILL either (e.g., uninterruptible disk wait).  Or maybe
> we just have some errant code that is blocking SIGQUIT; but that's
> a garden variety bug IMO, not something that needs major new postmaster
> logic to work around.

strace on the backend processes all showed them waiting at

futex(0x7f1ee5e21c90, FUTEX_WAIT_PRIVATE, 2, NULL

Notably, the first argument was the same for all of them.
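
For anyone wanting to repeat the comparison across many backends, here is a minimal sketch in Python that groups strace lines by futex address. The PIDs and the select() line are made up for illustration; only the futex line is the one quoted above. The regex assumes strace's usual formatting and tolerates the missing closing parenthesis of a call that is still blocked.

```python
import re
from collections import defaultdict

# Matches the start of a futex() call as strace prints it. The call may be
# unfinished (no closing parenthesis) because the process is still blocked.
FUTEX_RE = re.compile(r"futex\((0x[0-9a-fA-F]+),\s*(\w+),\s*(\d+)")


def group_by_futex(lines_by_pid):
    """Map (address, operation, expected value) to a sorted list of PIDs."""
    groups = defaultdict(list)
    for pid, line in lines_by_pid.items():
        match = FUTEX_RE.search(line)
        if match:
            addr, op, val = match.groups()
            groups[(addr, op, int(val))].append(pid)
    return {key: sorted(pids) for key, pids in groups.items()}


# Hypothetical PIDs; the futex line itself is taken from the report above.
sample = {
    101: "futex(0x7f1ee5e21c90, FUTEX_WAIT_PRIVATE, 2, NULL",
    102: "futex(0x7f1ee5e21c90, FUTEX_WAIT_PRIVATE, 2, NULL",
    103: "select(0, NULL, NULL, NULL, {1, 0}",
}
result = group_by_futex(sample)
for (addr, op, val), pids in result.items():
    print(addr, op, val, pids)
```

One caveat when reading the result: the _PRIVATE suffix marks the futex as process-private, so an identical address in separate processes most likely means the same virtual address in each process's own address space (plausible for backends forked from one postmaster), not necessarily a single lock shared between them.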

I gather that a futex is a Linux kernel thing, which is probably then
used by glibc to implement some pthreads stuff.  Anyone know more?

But yes, using SIGKILL on these processes works without problem.
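
To make the difference concrete, here is a small Python sketch of a child that ignores SIGQUIT but still dies on SIGKILL. It models only the "errant code that is blocking SIGQUIT" possibility from the quoted text, by having the child block the signal deliberately; it is not a claim about what the backends in this report were doing, and it is not postmaster logic. All function names and the 0.5 second grace period are invented for the example.

```python
import os
import signal
import time


def spawn_child_blocking_sigquit():
    """Fork a child that blocks SIGQUIT and then sleeps indefinitely."""
    ready_r, ready_w = os.pipe()
    pid = os.fork()
    if pid == 0:
        try:
            os.close(ready_r)
            signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGQUIT})
            os.write(ready_w, b"x")  # tell the parent the mask is in place
            while True:
                time.sleep(60)
        finally:
            os._exit(1)  # never fall through into the parent's code
    os.close(ready_w)
    os.read(ready_r, 1)  # wait until the child has blocked SIGQUIT
    os.close(ready_r)
    return pid


def wait_for_exit(pid, timeout):
    """Poll waitpid until the child exits or the timeout passes."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        done, status = os.waitpid(pid, os.WNOHANG)
        if done == pid:
            return status
        time.sleep(0.05)
    return None


def stop_with_escalation(pid, grace=0.5):
    """Send SIGQUIT first; fall back to SIGKILL if the child is still alive."""
    os.kill(pid, signal.SIGQUIT)
    status = wait_for_exit(pid, grace)
    if status is not None:
        return "SIGQUIT", status
    os.kill(pid, signal.SIGKILL)  # SIGKILL cannot be blocked or caught
    _, status = os.waitpid(pid, 0)
    return "SIGKILL", status


child = spawn_child_blocking_sigquit()
used, status = stop_with_escalation(child)
print(used, os.WTERMSIG(status) if os.WIFSIGNALED(status) else status)
```

The SIGQUIT stays pending in the child for as long as it is blocked, which is why only the SIGKILL ends the process here.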

