Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > On first glance, I don't see anything dangerous about SIGTERM.
>
> You haven't thought about it very hard :-(
Yea, that's why I said "on first glance".
> The major difference I see is that elog(FATAL) will call proc_exit
> directly from elog, rather than longjmp'ing back to PostgresMain.
> The case that we have confidence in involves elog(ERROR) returning to
> PostgresMain and then calling proc_exit from there (in the path where
> we get EOF from the client).
>
> This leaves me with a couple of concerns:
>
> * Notice all that cleanup/reset stuff in the "if (sigsetjmp())" block
> in PostgresMain. SIGTERM will cause proc_exit to be entered without
> any of that being done first. Does it work reliably? Shouldn't this be
> refactored to ensure the same things happen in both cases?
>
> * There are various places, especially in the PLs, that try to hook into
> error recovery by manipulating Warn_restart. Will any of them have
> problems if their error recovery code doesn't get called during SIGTERM
> exit?
>
> One possible refactoring is for elog(FATAL) to go ahead and longjmp back
> to PostgresMain, and at the end of the error recovery block check a flag
> and do proc_exit() if we're fataling. However I am not sure that this
> doesn't break the design for coping with elog's during proc_exit.
>
> Alvaro's nested-transaction work is another thing that's got to be
> thought about before touching this code. I have not yet seen any design
> for error recovery in the nested xact case, but I am sure it's going to
> need some changes right around here.
OK, the attached patch refactors the elog(FATAL)/SIGTERM exit to behave
just like a EOF from the client, with the exception of sending a proper
exit code.
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
Index: src/backend/tcop/postgres.c
===================================================================
RCS file: /cvsroot/pgsql-server/src/backend/tcop/postgres.c,v
retrieving revision 1.398
diff -c -c -r1.398 postgres.c
*** src/backend/tcop/postgres.c 7 Apr 2004 05:05:49 -0000 1.398
--- src/backend/tcop/postgres.c 8 Apr 2004 18:59:33 -0000
***************
*** 2938,2944 ****
/*
* (3) read a command (loop blocks here)
*/
! firstchar = ReadCommand(&input_message);
/*
* (4) disable async signal conditions again.
--- 2938,2947 ----
/*
* (3) read a command (loop blocks here)
*/
! if (!in_fatal_exit)
! firstchar = ReadCommand(&input_message);
! else
! firstchar = EOF;
/*
* (4) disable async signal conditions again.
***************
*** 3170,3176 ****
* Otherwise it will fail to be called during other
* backend-shutdown scenarios.
*/
! proc_exit(0);
case 'd': /* copy data */
case 'c': /* copy done */
--- 3173,3180 ----
* Otherwise it will fail to be called during other
* backend-shutdown scenarios.
*/
! proc_exit(!in_fatal_exit ? 0 : proc_exit_inprogress ||
! !IsUnderPostmaster);
case 'd': /* copy data */
case 'c': /* copy done */
Index: src/backend/utils/error/elog.c
===================================================================
RCS file: /cvsroot/pgsql-server/src/backend/utils/error/elog.c,v
retrieving revision 1.132
diff -c -c -r1.132 elog.c
*** src/backend/utils/error/elog.c 5 Apr 2004 03:02:06 -0000 1.132
--- src/backend/utils/error/elog.c 8 Apr 2004 18:59:34 -0000
***************
*** 72,77 ****
--- 72,79 ----
char *Log_line_prefix = NULL; /* format for extra log line info */
unsigned int Log_destination;
+ bool in_fatal_exit = false;
+
#ifdef HAVE_SYSLOG
char *Syslog_facility; /* openlog() parameters */
char *Syslog_ident;
***************
*** 442,448 ****
*/
fflush(stdout);
fflush(stderr);
! proc_exit(proc_exit_inprogress || !IsUnderPostmaster);
}
/*
--- 444,453 ----
*/
fflush(stdout);
fflush(stderr);
!
! if (in_fatal_exit)
! ereport(PANIC, (errmsg("fatal error during fatal exit, giving up")));
! in_fatal_exit = true;
}
/*
Index: src/include/tcop/tcopprot.h
===================================================================
RCS file: /cvsroot/pgsql-server/src/include/tcop/tcopprot.h,v
retrieving revision 1.64
diff -c -c -r1.64 tcopprot.h
*** src/include/tcop/tcopprot.h 7 Apr 2004 05:05:50 -0000 1.64
--- src/include/tcop/tcopprot.h 8 Apr 2004 18:59:36 -0000
***************
*** 34,39 ****
--- 34,40 ----
extern DLLIMPORT const char *debug_query_string;
extern char *rendezvous_name;
extern int max_stack_depth;
+ extern bool in_fatal_exit;
/* GUC-configurable parameters */