recovery is stuck when children are not processing SIGQUIT from previous crash - Mailing list pgsql-admin

From Peter Eisentraut
Subject recovery is stuck when children are not processing SIGQUIT from previous crash
Msg-id 1253704891.20834.8.camel@fsopti579.F-Secure.com
List pgsql-admin
I have observed the following situation a few times now (weeks or months
apart), most recently with 8.3.7.  Some postgres child process crashes.
The postmaster notices and sends SIGQUIT to all other children.  Once
all other children have exited, it would enter recovery.  But for some
reason, some children do not process the SIGQUIT signal and simply
hang.  Because the postmaster keeps waiting for them to exit, the whole
database system is then stuck and won't continue without manual
intervention.  If I go in manually and SIGKILL the offending processes,
everything proceeds normally, recovery finishes, and the system is up
again.

I haven't had the chance yet to analyze why those children fail to act
on SIGQUIT.  Be that as it may, there appear to be no provisions for
this case.  I couldn't find any documentation or previous reports on
this sort of thing.  One might imagine a feature where the postmaster
resorts to throwing SIGKILLs around after a while, similar to how init
scripts are sometimes set up.  But perhaps manual intervention is the
way to go.
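
The escalation idea above can be sketched roughly as follows.  This is
only an illustration in Python of the "SIGQUIT first, SIGKILL after a
grace period" pattern that init scripts often use; it is not postmaster
code, and the function name stop_children and the grace_seconds
parameter are made up for the example.  It assumes the caller is the
parent of the processes and holds them as subprocess.Popen objects.

```python
import signal
import subprocess
import time


def stop_children(children, grace_seconds=5.0):
    """Send SIGQUIT to every child, then SIGKILL any that outlive the grace period.

    `children` is a list of subprocess.Popen objects (hypothetical stand-ins
    for server child processes).  Returns the PIDs that needed SIGKILL.
    """
    # First pass: ask every child that is still running to quit.
    for child in children:
        if child.poll() is None:
            child.send_signal(signal.SIGQUIT)

    # Second pass: all children share one deadline, not one timeout each.
    deadline = time.monotonic() + grace_seconds
    escalated = []
    for child in children:
        remaining = max(0.0, deadline - time.monotonic())
        try:
            child.wait(timeout=remaining)
        except subprocess.TimeoutExpired:
            # The child did not act on SIGQUIT in time.  SIGKILL cannot be
            # caught, blocked, or ignored by the target process.
            child.kill()
            child.wait()  # reap the child so it does not linger as a zombie
            escalated.append(child.pid)
    return escalated
```

Whether a timeout like this belongs in the postmaster itself, and what
a sensible grace period would be, is exactly the open question; the
sketch only shows that the mechanism is small.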

Comments?

