Unkillable Backend Processes - Mailing list pgsql-admin

From Thomas F. O'Connell
Subject Unkillable Backend Processes
Msg-id 3BB218ED-FD62-42BF-8BDD-066954F961AC@sitening.com
List pgsql-admin
I've encountered an oddity on a postgres cluster that results in an
unresponsive postmaster and, frequently, unkillable backend
processes. I'm having a difficult time isolating the queries that are
related to this scenario because by the time the scenario occurs,
max_connections has been reached, and no superuser connections are
available. Because the query doesn't finish, I don't think it's
getting logged (since logging is only done at the query level on a
duration or error basis). In the current iteration, I can tell that
it's an INSERT that's causing the problem, and the INSERT is coming
from an Apache process on a machine on the same network. In recent
occurrences, though, I'm almost positive I've seen a SELECT.

But as troubled as I am by the cause, I'm similarly troubled by my
inability to treat the symptoms effectively. When this occurs, I have
tried shutting down the pgpools and postmaster (using pg_ctl).
Unfortunately, pgpool frequently hangs during the shutdown attempt.
When I kill these off individually using kill and then shut down the
postmaster with pg_ctl in immediate mode, I will occasionally find a
backend process that cannot be killed, even with a KILL (-9) signal.

Is this likely to be caused by something at a lower level than postgres?

Here are the specs:

PostgreSQL 8.1.3
pgpool 3.0.1
Debian GNU/Linux 3.1
Linux 2.6.10 #8 SMP
system: ext3 RAID 1
WAL: jfs RAID 10
data: jfs RAID 10

There's also an NFS mount point.

I'm still trying to do the forensics on the root cause (a related
oddity: the system can run in production for days or weeks without
any issues), but I'm just as interested in why I can't kill postgres
backend processes that have no postmaster. If I can provide more
information related to recovery, please let me know.
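
For anyone who wants to compare notes on the unkillable backends: on
Linux, a process that does not respond to KILL is usually sitting in
uninterruptible sleep (state D), blocked inside the kernel, often on
I/O. Below is a minimal sketch of how that state could be read from
/proc/<pid>/stat. The function names are my own for illustration, and
the PID in the usage note is hypothetical; this only reports the state
and does not explain what the process is waiting on.

```python
from pathlib import Path


def parse_stat_state(stat_text: str) -> str:
    """Return the one-letter state field from the text of /proc/<pid>/stat."""
    # The second field (the command name) is wrapped in parentheses and may
    # itself contain spaces or ')', so split on the LAST ')' rather than on
    # whitespace. The state letter is the first field after it.
    after_comm = stat_text.rsplit(")", 1)[1]
    return after_comm.split()[0]


def process_state(pid: int) -> str:
    """Read the current state letter of a live process (Linux only)."""
    return parse_stat_state(Path(f"/proc/{pid}/stat").read_text())


def is_uninterruptible(pid: int) -> bool:
    """True if the process is in state 'D' (uninterruptible sleep)."""
    # A process in state D does not act on signals, including SIGKILL,
    # until the kernel call it is blocked in returns.
    return process_state(pid) == "D"
```

Calling is_uninterruptible(12345) on a stuck backend (12345 being a
placeholder PID) would return True if the process is in state D, which
would point at something below postgres, such as a filesystem or
storage wait.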

--
Thomas F. O'Connell
Database Architecture and Programming
Sitening, LLC

http://www.sitening.com/
3004 B Poston Avenue
Nashville, TN 37203-1314
615-260-0005 (cell)
615-469-5150 (office)
615-469-5151 (fax)

