Thread: Re: [HACKERS] Function to kill backend

From: Bruce Momjian

Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > On first glance, I don't see anything dangerous about SIGTERM.
>
> You haven't thought about it very hard :-(

Yeah, that's why I said "on first glance".

> The major difference I see is that elog(FATAL) will call proc_exit
> directly from elog, rather than longjmp'ing back to PostgresMain.
> The case that we have confidence in involves elog(ERROR) returning to
> PostgresMain and then calling proc_exit from there (in the path where
> we get EOF from the client).
>
> This leaves me with a couple of concerns:
>
> * Notice all that cleanup/reset stuff in the "if (sigsetjmp())" block
> in PostgresMain.  SIGTERM will cause proc_exit to be entered without
> any of that being done first.  Does it work reliably?  Shouldn't this be
> refactored to ensure the same things happen in both cases?
>
> * There are various places, especially in the PLs, that try to hook into
> error recovery by manipulating Warn_restart.  Will any of them have
> problems if their error recovery code doesn't get called during SIGTERM
> exit?
>
> One possible refactoring is for elog(FATAL) to go ahead and longjmp back
> to PostgresMain, and at the end of the error recovery block check a flag
> and do proc_exit() if we're fataling.  However I am not sure that this
> doesn't break the design for coping with elog's during proc_exit.
>
> Alvaro's nested-transaction work is another thing that's got to be
> thought about before touching this code.  I have not yet seen any design
> for error recovery in the nested xact case, but I am sure it's going to
> need some changes right around here.

OK, the attached patch refactors the elog(FATAL)/SIGTERM exit to behave
just like an EOF from the client, except that it sends a proper
exit code.
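
For anyone who wants to see the control flow without reading the diff,
here is a rough standalone sketch of the idea. It is not the patch.
Every name in it (restart_point, fatal_pending, cleanup_runs,
raise_error, backend_loop) is a hypothetical stand-in, the "script"
string of commands replaces the client connection, and the exit status
for the fatal case is simplified to 1. A fatal error sets a flag and
unwinds to the main loop the same way an ordinary error does, so the
recovery block runs in both cases, and the loop then treats the flag as
EOF.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <setjmp.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins, not the real PostgreSQL symbols. */
sigjmp_buf restart_point;             /* plays the role of Warn_restart */
volatile bool fatal_pending = false;  /* plays the role of in_fatal_exit */
volatile int cleanup_runs = 0;        /* passes through the recovery block */

enum { CMD_EOF = -1 };

/* Report an error. Never returns: it always unwinds to the main loop. */
void raise_error(bool fatal)
{
    if (fatal)
    {
        /* A second fatal error while already exiting: give up. */
        if (fatal_pending)
            abort();
        fatal_pending = true;
    }
    siglongjmp(restart_point, 1);
}

/*
 * Run the commands in "script": 'E' raises an ordinary error, 'F' raises
 * a fatal error, anything else is a query that succeeds.  Returns the
 * exit status: 0 for a clean EOF, 1 after a fatal error (simplified).
 */
int backend_loop(const char *script)
{
    /* volatile: modified between sigsetjmp and siglongjmp */
    volatile size_t pos = 0;

    assert(script != NULL);
    fatal_pending = false;
    cleanup_runs = 0;

    if (sigsetjmp(restart_point, 1) != 0)
    {
        /* Shared recovery block: reached for ERROR and FATAL alike. */
        cleanup_runs++;
    }

    for (;;)
    {
        int cmd;

        /* Once a fatal error is pending, stop reading and fake an EOF. */
        if (!fatal_pending && script[pos] != '\0')
            cmd = script[pos++];
        else
            cmd = CMD_EOF;

        switch (cmd)
        {
            case CMD_EOF:
                return fatal_pending ? 1 : 0;
            case 'E':
                raise_error(false);
                break;
            case 'F':
                raise_error(true);
                break;
            default:
                break;          /* ordinary query, nothing to do */
        }
    }
}
```

With this sketch, backend_loop("QFQ") returns 1 after one pass through
the recovery block and never reads the trailing Q, while
backend_loop("QEQ") recovers from the error, keeps reading, and returns
0 at end of input.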

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
Index: src/backend/tcop/postgres.c
===================================================================
RCS file: /cvsroot/pgsql-server/src/backend/tcop/postgres.c,v
retrieving revision 1.398
diff -c -c -r1.398 postgres.c
*** src/backend/tcop/postgres.c    7 Apr 2004 05:05:49 -0000    1.398
--- src/backend/tcop/postgres.c    8 Apr 2004 18:59:33 -0000
***************
*** 2938,2944 ****
          /*
           * (3) read a command (loop blocks here)
           */
!         firstchar = ReadCommand(&input_message);

          /*
           * (4) disable async signal conditions again.
--- 2938,2947 ----
          /*
           * (3) read a command (loop blocks here)
           */
!          if (!in_fatal_exit)
!             firstchar = ReadCommand(&input_message);
!         else
!             firstchar = EOF;

          /*
           * (4) disable async signal conditions again.
***************
*** 3170,3176 ****
                   * Otherwise it will fail to be called during other
                   * backend-shutdown scenarios.
                   */
!                 proc_exit(0);

              case 'd':            /* copy data */
              case 'c':            /* copy done */
--- 3173,3180 ----
                   * Otherwise it will fail to be called during other
                   * backend-shutdown scenarios.
                   */
!                 proc_exit(!in_fatal_exit ? 0 : proc_exit_inprogress ||
!                                                 !IsUnderPostmaster);

              case 'd':            /* copy data */
              case 'c':            /* copy done */
Index: src/backend/utils/error/elog.c
===================================================================
RCS file: /cvsroot/pgsql-server/src/backend/utils/error/elog.c,v
retrieving revision 1.132
diff -c -c -r1.132 elog.c
*** src/backend/utils/error/elog.c    5 Apr 2004 03:02:06 -0000    1.132
--- src/backend/utils/error/elog.c    8 Apr 2004 18:59:34 -0000
***************
*** 72,77 ****
--- 72,79 ----
  char       *Log_line_prefix = NULL; /* format for extra log line info */
  unsigned int Log_destination;

+ bool in_fatal_exit = false;
+
  #ifdef HAVE_SYSLOG
  char       *Syslog_facility;    /* openlog() parameters */
  char       *Syslog_ident;
***************
*** 442,448 ****
               */
              fflush(stdout);
              fflush(stderr);
!             proc_exit(proc_exit_inprogress || !IsUnderPostmaster);
          }

          /*
--- 444,453 ----
               */
              fflush(stdout);
              fflush(stderr);
!
!             if (in_fatal_exit)
!                 ereport(PANIC, (errmsg("fatal error during fatal exit, giving up")));
!             in_fatal_exit = true;
          }

          /*
Index: src/include/tcop/tcopprot.h
===================================================================
RCS file: /cvsroot/pgsql-server/src/include/tcop/tcopprot.h,v
retrieving revision 1.64
diff -c -c -r1.64 tcopprot.h
*** src/include/tcop/tcopprot.h    7 Apr 2004 05:05:50 -0000    1.64
--- src/include/tcop/tcopprot.h    8 Apr 2004 18:59:36 -0000
***************
*** 34,39 ****
--- 34,40 ----
  extern DLLIMPORT const char *debug_query_string;
  extern char *rendezvous_name;
  extern int    max_stack_depth;
+ extern bool in_fatal_exit;

  /* GUC-configurable parameters */