Re: kill -KILL: What happens? - Mailing list pgsql-hackers

From Tom Lane
Subject Re: kill -KILL: What happens?
Msg-id 18314.1294933288@sss.pgh.pa.us
In response to kill -KILL: What happens?  (David Fetter <david@fetter.org>)
Responses Re: kill -KILL: What happens?  (David Fetter <david@fetter.org>)
List pgsql-hackers
David Fetter <david@fetter.org> writes:
> I've noticed over the years that we give people dire warnings never to
> send a KILL signal to the postmaster, but I'm unsure as to what are
> potential consequences of this, as in just exactly how this can result
> in problems.  Is there some reference I can look to for explanations
> of the mechanism(s) whereby the damage occurs?

There's no risk of data corruption, if that's what you're thinking of.
It's just that you're then looking at having to manually clean up the
child processes and then restart the postmaster, a process that is not
only tedious but also offers the possibility of screwing yourself.

In particular the risk is that someone clueless enough to do this would
next decide that removing $PGDATA/postmaster.pid, rather than killing
all the existing children, is the quickest way to get the postmaster
restarted.  Once he's done that, his data will shortly be hosed beyond
recovery, because now he has two noncommunicating sets of backends
massaging the same files via separate sets of shared buffers.

The reason this sequence of events doesn't seem improbable is that the
error you get when you try to start a new postmaster, if there are still
old backends running, is

FATAL:  pre-existing shared memory block (key 5490001, ID 15609) is still in use
HINT:  If you're sure there are no old server processes still running, remove the shared memory block or just delete the file "postmaster.pid".

Maybe we should rewrite that HINT --- while it's *possible* that
removing the shmem block or deleting postmaster.pid is the right thing
to do, it's not exactly *likely*.  I think we need to put a bit more
emphasis on the "If ..." part.  Like "If you are prepared to swear on
your mother's grave that there are no old server processes still
running, consider removing postmaster.pid.  But first check for existing
processes again."
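
As an illustration of "check for existing processes again", here is a minimal Python sketch, not anything PostgreSQL ships. It pulls the shared memory ID out of the FATAL line quoted above and then looks up how many processes are still attached to that block in the text printed by `ipcs -m`. The function names are hypothetical, the regular expression assumes the exact wording shown in this message, and the column order (key, shmid, owner, perms, bytes, nattch) is assumed to be the Linux one. A nonzero attach count suggests old backends are still running, in which case postmaster.pid should be left alone.

```python
import re

# Matches the FATAL line quoted in this message; other PostgreSQL
# versions may word it differently (assumption).
_SHMEM_RE = re.compile(
    r"pre-existing shared memory block \(key (\d+), ID (\d+)\) is still in use"
)


def parse_shmem_error(log_line):
    """Return (key, shmid) from the FATAL line, or None if it does not match."""
    m = _SHMEM_RE.search(log_line)
    if m is None:
        return None
    return int(m.group(1)), int(m.group(2))


def attached_count(ipcs_output, shmid):
    """Return the attach count for shmid from `ipcs -m` text, or None.

    Assumes the Linux column order:
    key shmid owner perms bytes nattch [status]
    """
    for line in ipcs_output.splitlines():
        fields = line.split()
        # Header and separator lines never have the numeric ID in column 2.
        if len(fields) >= 6 and fields[1] == str(shmid):
            return int(fields[5])
    return None


# Made-up sample of `ipcs -m` output, for illustration only.
SAMPLE_IPCS = """\
------ Shared Memory Segments --------
key        shmid      owner      perms      bytes      nattch     status
0x0053c551 15609      postgres   600        37879808   4
"""

if __name__ == "__main__":
    line = ("FATAL:  pre-existing shared memory block "
            "(key 5490001, ID 15609) is still in use")
    key, shmid = parse_shmem_error(line)
    n = attached_count(SAMPLE_IPCS, shmid)
    if n:
        print(f"{n} process(es) still attached to block {shmid}; "
              "do not remove postmaster.pid")
```

In real use the `ipcs -m` text would come from running the command on the server host; the sample string here only stands in for it.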

(BTW, I notice that this interlock against starting a new postmaster
appears to be broken in HEAD, which is likely not unrelated to the fact
that the contents of postmaster.pid seem to be totally bollixed :-()
        regards, tom lane

