Re: [ADMIN] recovery is stuck when children are not processing SIGQUIT from previous crash - Mailing list pgsql-hackers

From Peter Eisentraut
Subject Re: [ADMIN] recovery is stuck when children are not processing SIGQUIT from previous crash
Msg-id 1260366654.8753.2.camel@fsopti579.F-Secure.com
List pgsql-hackers
[moved to -hackers]

On Thu, 2009-11-12 at 09:35 -0500, Tom Lane wrote:
> Peter Eisentraut <peter_e@gmx.net> writes:
> >>> strace on the backend processes all showed them waiting at
> >>> futex(0x7f1ee5e21c90, FUTEX_WAIT_PRIVATE, 2, NULL
> >>> Notably, the first argument was the same for all of them.
> 
> > Looks like a race condition or lockup in the syslog code.
> 
> Hm, why are there two <signal handler> calls in the stack?
> The only thing I can think of is that we sent SIGQUIT twice.
> That's probably bad --- is there any obvious path through
> the postmaster that would do that?
> 
> The other thought is that quickdie should block signals before
> starting to do anything.

Right.  This would actually already work because a signal is blocked
while its handler runs, except that we start quickdie() with

PG_SETMASK(&BlockSig);

which blocks everything except SIGQUIT.  That should probably be fixed
in any case.


