Re: Function to kill backend - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Function to kill backend
Msg-id 15157.1081442845@sss.pgh.pa.us
In response to Re: Function to kill backend  (Bruce Momjian <pgman@candle.pha.pa.us>)
List pgsql-hackers
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> On first glance, I don't see anything dangerous about SIGTERM.

You haven't thought about it very hard :-(

The major difference I see is that elog(FATAL) will call proc_exit
directly from elog, rather than longjmp'ing back to PostgresMain.
The case that we have confidence in involves elog(ERROR) returning to
PostgresMain and then calling proc_exit from there (in the path where
we get EOF from the client).
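
A minimal sketch of the two exit paths just described, written as a toy model rather than PostgreSQL source. Every name here (toy_elog, toy_proc_exit, toy_backend, and so on) is a hypothetical stand-in, plain setjmp/longjmp is used in place of sigsetjmp/siglongjmp for portability, and the "read EOF from the client" step is collapsed into an immediate exit after recovery. A second jump buffer lets the simulated "process exit" return to the caller so the result can be inspected.

```c
#include <setjmp.h>

/* Toy severity levels; stand-ins for elog(ERROR) and elog(FATAL). */
typedef enum { TOY_ERROR, TOY_FATAL } ToyLevel;

static jmp_buf toy_main_loop;    /* plays the role of Warn_restart */
static jmp_buf toy_process_end;  /* simulation only: where "exit" lands */
static int toy_cleanup_ran;      /* set by the recovery block */
static int toy_cleanup_seen;     /* what proc_exit observed on entry */

/* Stand-in for proc_exit: never returns to its caller. */
static void toy_proc_exit(void)
{
    toy_cleanup_seen = toy_cleanup_ran;
    longjmp(toy_process_end, 1);
}

/* Stand-in for elog: FATAL exits directly, ERROR jumps to the main loop. */
static void toy_elog(ToyLevel level)
{
    if (level == TOY_FATAL)
        toy_proc_exit();
    longjmp(toy_main_loop, 1);
}

/* Returns 1 if the recovery block ran before exit, 0 if it was skipped. */
static int toy_backend(ToyLevel level)
{
    toy_cleanup_ran = 0;
    toy_cleanup_seen = -1;

    if (setjmp(toy_process_end) != 0)
        return toy_cleanup_seen;     /* the simulated process has exited */

    if (setjmp(toy_main_loop) != 0)
    {
        toy_cleanup_ran = 1;         /* the cleanup/reset work */
        toy_proc_exit();             /* simplified: exit right after recovery */
    }

    toy_elog(level);                 /* a command fails at this level */
    return -1;                       /* not reached */
}
```

In this model, toy_backend(TOY_ERROR) returns 1 and toy_backend(TOY_FATAL) returns 0: only the ERROR path passes through the recovery block before the exit routine is entered.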

This leaves me with a couple of concerns:

* Notice all that cleanup/reset stuff in the "if (sigsetjmp())" block
in PostgresMain.  SIGTERM will cause proc_exit to be entered without
any of that being done first.  Does it work reliably?  Shouldn't this be
refactored to ensure the same things happen in both cases?

* There are various places, especially in the PLs, that try to hook into
error recovery by manipulating Warn_restart.  Will any of them have
problems if their error recovery code doesn't get called during SIGTERM
exit?
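
The second concern can be sketched the same way. The following is a toy model, assuming a procedural-language handler that hooks error recovery by saving the restart buffer, installing its own, and chaining back to the saved one after its own cleanup. All names (pl_warn_restart, pl_call_handler, pl_run, and so on) are hypothetical stand-ins, plain setjmp/longjmp replaces sigsetjmp/siglongjmp, and copying a jmp_buf with memcpy is assumed to work on the platform.

```c
#include <setjmp.h>
#include <stdbool.h>
#include <string.h>

typedef enum { PL_ERROR, PL_FATAL } PlLevel;

static jmp_buf pl_warn_restart;     /* toy stand-in for Warn_restart */
static jmp_buf pl_process_end;      /* simulation only: where "exit" lands */
static bool pl_handler_cleanup_ran; /* did the PL's own recovery code run? */

/* Stand-in for proc_exit: never returns to its caller. */
static void pl_proc_exit(void)
{
    longjmp(pl_process_end, 1);
}

/* Stand-in for elog: FATAL exits directly, ERROR jumps to the restart point. */
static void pl_elog(PlLevel level)
{
    if (level == PL_FATAL)
        pl_proc_exit();
    longjmp(pl_warn_restart, 1);
}

/* Toy PL call handler that hooks error recovery. */
static void pl_call_handler(PlLevel level)
{
    jmp_buf save_restart;

    memcpy(&save_restart, &pl_warn_restart, sizeof(save_restart));
    if (setjmp(pl_warn_restart) != 0)
    {
        /* An error arrived: restore the outer buffer, clean up, re-throw. */
        memcpy(&pl_warn_restart, &save_restart, sizeof(pl_warn_restart));
        pl_handler_cleanup_ran = true;
        longjmp(pl_warn_restart, 1);
    }

    pl_elog(level);                 /* the user function fails */

    /* Normal-return path; not reached here because pl_elog never returns. */
    memcpy(&pl_warn_restart, &save_restart, sizeof(pl_warn_restart));
}

/* Returns true if the PL's recovery code ran before the process exited. */
static bool pl_run(PlLevel level)
{
    pl_handler_cleanup_ran = false;

    if (setjmp(pl_process_end) != 0)
        return pl_handler_cleanup_ran;

    if (setjmp(pl_warn_restart) != 0)
        pl_proc_exit();             /* outer recovery, then exit (simplified) */

    pl_call_handler(level);
    return pl_handler_cleanup_ran;  /* not reached */
}
```

Here pl_run(PL_ERROR) returns true and pl_run(PL_FATAL) returns false: on the direct-exit path the handler's recovery code is never invoked. Whether that matters for any real PL depends on what its recovery code does, which this sketch does not model.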

One possible refactoring is for elog(FATAL) to go ahead and longjmp back
to PostgresMain, and at the end of the error recovery block check a flag
and do proc_exit() if we're fataling.  However, I am not sure that this
doesn't break the design for coping with elog calls during proc_exit.
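
One way the proposed refactoring might look, again as a toy model and not as PostgreSQL code. The names (sk_elog, sk_exit_pending, sk_backend, and so on) are hypothetical, plain setjmp/longjmp stands in for sigsetjmp/siglongjmp, and the sketch deliberately ignores the open question of errors raised while the exit routine itself is running.

```c
#include <setjmp.h>
#include <stdbool.h>

typedef enum { SK_ERROR, SK_FATAL } SkLevel;

typedef enum {
    SK_STILL_RUNNING,   /* recovered from an ERROR, backend keeps going */
    SK_EXITED_CLEAN,    /* exited after the recovery block ran */
    SK_EXITED_DIRTY     /* exited without the recovery block */
} SkOutcome;

static jmp_buf sk_restart;       /* plays the role of Warn_restart */
static jmp_buf sk_process_end;   /* simulation only: where "exit" lands */
static bool sk_exit_pending;     /* the "we're fataling" flag */
static bool sk_cleanup_ran;
static SkOutcome sk_result;

/* Stand-in for proc_exit: records state, never returns to its caller. */
static void sk_proc_exit(void)
{
    sk_result = sk_cleanup_ran ? SK_EXITED_CLEAN : SK_EXITED_DIRTY;
    longjmp(sk_process_end, 1);
}

/* Refactored elog: FATAL only sets the flag, then takes the ERROR path. */
static void sk_elog(SkLevel level)
{
    if (level == SK_FATAL)
        sk_exit_pending = true;
    longjmp(sk_restart, 1);
}

static SkOutcome sk_backend(SkLevel level)
{
    sk_exit_pending = false;
    sk_cleanup_ran = false;
    sk_result = SK_EXITED_DIRTY;

    if (setjmp(sk_process_end) != 0)
        return sk_result;            /* the simulated process has exited */

    if (setjmp(sk_restart) != 0)
    {
        sk_cleanup_ran = true;       /* shared cleanup/reset work */
        if (sk_exit_pending)
            sk_proc_exit();          /* end of recovery block: now exit */
        return SK_STILL_RUNNING;     /* plain ERROR: keep serving commands */
    }

    sk_elog(level);                  /* a command fails at this level */
    return SK_EXITED_DIRTY;          /* not reached */
}
```

With this arrangement, sk_backend(SK_ERROR) yields SK_STILL_RUNNING and sk_backend(SK_FATAL) yields SK_EXITED_CLEAN, so both levels share the same recovery block. The cost is that the exit decision now depends on a flag surviving the jump, which is where an error raised during exit could complicate things.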

Alvaro's nested-transaction work is another thing that's got to be
thought about before touching this code.  I have not yet seen any design
for error recovery in the nested xact case, but I am sure it's going to
need some changes right around here.
        regards, tom lane

