Thread: Re: [HACKERS] Terminating a backend
Bruce Momjian wrote: > I have an idea for this TODO item: > > * Allow administrators to safely terminate individual sessions either > via an SQL function or SIGTERM > > Lock table corruption following SIGTERM of an individual backend > has been reported in 8.0. A possible cause was fixed in 8.1, but > it is unknown whether other problems exist. This item mostly > requires additional testing rather than of writing any new code. > > http://archives.postgresql.org/pgsql-hackers/2006-08/msg00174.php > > When we get the termination signal, why can't we just set a global > boolean, do a query cancel, and in the setjmp() code block check the > global and exit --- at that stage we know we have released all locks and > can exit cleanly. I have implemented this idea with the attached patch. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + Index: doc/src/sgml/func.sgml =================================================================== RCS file: /cvsroot/pgsql/doc/src/sgml/func.sgml,v retrieving revision 1.429 diff -c -c -r1.429 func.sgml *** doc/src/sgml/func.sgml 10 Apr 2008 13:34:33 -0000 1.429 --- doc/src/sgml/func.sgml 12 Apr 2008 13:48:33 -0000 *************** *** 11849,11854 **** --- 11849,11857 ---- <primary>pg_cancel_backend</primary> </indexterm> <indexterm> + <primary>pg_terminate_backend</primary> + </indexterm> + <indexterm> <primary>pg_reload_conf</primary> </indexterm> <indexterm> *************** *** 11885,11890 **** --- 11888,11900 ---- </row> <row> <entry> + <literal><function>pg_terminate_backend</function>(<parameter>pid</parameter> <type>int</>)</literal> + </entry> + <entry><type>boolean</type></entry> + <entry>Terminate a backend</entry> + </row> + <row> + <entry> <literal><function>pg_reload_conf</function>()</literal> </entry> <entry><type>boolean</type></entry> *************** *** 11909,11915 **** <para> <function>pg_cancel_backend</> sends a query cancel (<systemitem>SIGINT</>) signal to a backend process identified by ! process ID. The process ID of an active backend can be found from the <structfield>procpid</structfield> column in the <structname>pg_stat_activity</structname> view, or by listing the <command>postgres</command> processes on the server with --- 11919,11927 ---- <para> <function>pg_cancel_backend</> sends a query cancel (<systemitem>SIGINT</>) signal to a backend process identified by ! process ID. <function>pg_terminate_backend</> sends a terminate ! (<systemitem>SIGTERM</>) signal to the specified process ID. The ! process ID of an active backend can be found from the <structfield>procpid</structfield> column in the <structname>pg_stat_activity</structname> view, or by listing the <command>postgres</command> processes on the server with Index: src/backend/port/ipc_test.c =================================================================== RCS file: /cvsroot/pgsql/src/backend/port/ipc_test.c,v retrieving revision 1.24 diff -c -c -r1.24 ipc_test.c *** src/backend/port/ipc_test.c 24 Mar 2008 18:08:47 -0000 1.24 --- src/backend/port/ipc_test.c 12 Apr 2008 13:48:33 -0000 *************** *** 40,45 **** --- 40,46 ---- volatile bool InterruptPending = false; volatile bool QueryCancelPending = false; + volatile bool BackendTerminatePending = false; volatile bool ProcDiePending = false; volatile bool ImmediateInterruptOK = false; volatile uint32 InterruptHoldoffCount = 0; Index: src/backend/postmaster/autovacuum.c =================================================================== RCS file: /cvsroot/pgsql/src/backend/postmaster/autovacuum.c,v retrieving revision 1.76 diff -c -c -r1.76 autovacuum.c *** src/backend/postmaster/autovacuum.c 26 Mar 2008 21:10:38 -0000 1.76 --- src/backend/postmaster/autovacuum.c 12 Apr 2008 13:48:33 -0000 *************** *** 1475,1481 **** * means abort and exit cleanly, and SIGQUIT means abandon ship. */ pqsignal(SIGINT, StatementCancelHandler); ! pqsignal(SIGTERM, die); pqsignal(SIGQUIT, quickdie); pqsignal(SIGALRM, handle_sig_alarm); --- 1475,1481 ---- * means abort and exit cleanly, and SIGQUIT means abandon ship. */ pqsignal(SIGINT, StatementCancelHandler); ! pqsignal(SIGTERM, BackendTerminateHandler); pqsignal(SIGQUIT, quickdie); pqsignal(SIGALRM, handle_sig_alarm); Index: src/backend/tcop/postgres.c =================================================================== RCS file: /cvsroot/pgsql/src/backend/tcop/postgres.c,v retrieving revision 1.548 diff -c -c -r1.548 postgres.c *** src/backend/tcop/postgres.c 2 Apr 2008 18:31:50 -0000 1.548 --- src/backend/tcop/postgres.c 12 Apr 2008 13:48:33 -0000 *************** *** 2541,2547 **** * waiting for input, however. */ if (ImmediateInterruptOK && InterruptHoldoffCount == 0 && ! CritSectionCount == 0 && !DoingCommandRead) { /* bump holdoff count to make ProcessInterrupts() a no-op */ /* until we are done getting ready for it */ --- 2541,2548 ---- * waiting for input, however. */ if (ImmediateInterruptOK && InterruptHoldoffCount == 0 && ! CritSectionCount == 0 && ! (!DoingCommandRead || BackendTerminatePending)) { /* bump holdoff count to make ProcessInterrupts() a no-op */ /* until we are done getting ready for it */ *************** *** 2557,2562 **** --- 2558,2575 ---- errno = save_errno; } + /* + * Backend terminate happens by canceling the current query, then + * exiting after the longjump() returns to the main query loop. + * This is done so that all resources are properly reset before exit. + */ + void + BackendTerminateHandler(SIGNAL_ARGS) + { + BackendTerminatePending = true; + StatementCancelHandler(postgres_signal_arg); + } + /* signal handler for floating point exception */ void FloatExceptionHandler(SIGNAL_ARGS) *************** *** 2621,2626 **** --- 2634,2643 ---- ereport(ERROR, (errcode(ERRCODE_QUERY_CANCELED), errmsg("canceling autovacuum task"))); + else if (BackendTerminatePending) + ereport(ERROR, + (errcode(ERRCODE_QUERY_CANCELED), + errmsg("terminating backend due to user request"))); else ereport(ERROR, (errcode(ERRCODE_QUERY_CANCELED), *************** *** 3154,3160 **** */ pqsignal(SIGHUP, SigHupHandler); /* set flag to read config file */ pqsignal(SIGINT, StatementCancelHandler); /* cancel current query */ ! pqsignal(SIGTERM, die); /* cancel current query and exit */ /* * In a standalone backend, SIGQUIT can be generated from the keyboard --- 3171,3177 ---- */ pqsignal(SIGHUP, SigHupHandler); /* set flag to read config file */ pqsignal(SIGINT, StatementCancelHandler); /* cancel current query */ ! pqsignal(SIGTERM, BackendTerminateHandler); /* * In a standalone backend, SIGQUIT can be generated from the keyboard *************** *** 3459,3464 **** --- 3476,3484 ---- /* We don't have a transaction command open anymore */ xact_started = false; + if (BackendTerminatePending) + die(SIGTERM); + /* Now we can allow interrupts again */ RESUME_INTERRUPTS(); } Index: src/backend/utils/adt/misc.c =================================================================== RCS file: /cvsroot/pgsql/src/backend/utils/adt/misc.c,v retrieving revision 1.59 diff -c -c -r1.59 misc.c *** src/backend/utils/adt/misc.c 4 Apr 2008 16:57:21 -0000 1.59 --- src/backend/utils/adt/misc.c 12 Apr 2008 13:48:33 -0000 *************** *** 129,134 **** --- 129,140 ---- } Datum + pg_terminate_backend(PG_FUNCTION_ARGS) + { + PG_RETURN_BOOL(pg_signal_backend(PG_GETARG_INT32(0), SIGTERM)); + } + + Datum pg_reload_conf(PG_FUNCTION_ARGS) { if (!superuser()) Index: src/backend/utils/init/globals.c =================================================================== RCS file: /cvsroot/pgsql/src/backend/utils/init/globals.c,v retrieving revision 1.105 diff -c -c -r1.105 globals.c *** src/backend/utils/init/globals.c 17 Feb 2008 02:09:29 -0000 1.105 --- src/backend/utils/init/globals.c 12 Apr 2008 13:48:33 -0000 *************** *** 27,32 **** --- 27,33 ---- volatile bool InterruptPending = false; volatile bool QueryCancelPending = false; + volatile bool BackendTerminatePending = false; volatile bool ProcDiePending = false; volatile bool ImmediateInterruptOK = false; volatile uint32 InterruptHoldoffCount = 0; Index: src/include/miscadmin.h =================================================================== RCS file: /cvsroot/pgsql/src/include/miscadmin.h,v retrieving revision 1.201 diff -c -c -r1.201 miscadmin.h *** src/include/miscadmin.h 20 Feb 2008 22:46:24 -0000 1.201 --- src/include/miscadmin.h 12 Apr 2008 13:48:33 -0000 *************** *** 68,73 **** --- 68,74 ---- /* these are marked volatile because they are set by signal handlers: */ extern PGDLLIMPORT volatile bool InterruptPending; extern volatile bool QueryCancelPending; + extern volatile bool BackendTerminatePending; extern volatile bool ProcDiePending; /* these are marked volatile because they are examined by signal handlers: */ Index: src/include/catalog/pg_proc.h =================================================================== RCS file: /cvsroot/pgsql/src/include/catalog/pg_proc.h,v retrieving revision 1.488 diff -c -c -r1.488 pg_proc.h *** src/include/catalog/pg_proc.h 10 Apr 2008 22:25:25 -0000 1.488 --- src/include/catalog/pg_proc.h 12 Apr 2008 13:48:33 -0000 *************** *** 3157,3162 **** --- 3157,3164 ---- DATA(insert OID = 2171 ( pg_cancel_backend PGNSP PGUID 12 1 0 f f t f v 1 16 "23" _null_ _null_ _null_ pg_cancel_backend- _null_ _null_ )); DESCR("cancel a server process' current query"); + DATA(insert OID = 2096 ( pg_terminate_backend PGNSP PGUID 12 1 0 f f t f v 1 16 "23" _null_ _null_ _null_ pg_terminate_backend- _null_ _null_ )); + DESCR("terminate a server process"); DATA(insert OID = 2172 ( pg_start_backup PGNSP PGUID 12 1 0 f f t f v 1 25 "25" _null_ _null_ _null_ pg_start_backup- _null_ _null_ )); DESCR("prepare for taking an online backup"); DATA(insert OID = 2173 ( pg_stop_backup PGNSP PGUID 12 1 0 f f t f v 0 25 "" _null_ _null_ _null_ pg_stop_backup- _null_ _null_ )); Index: src/include/tcop/tcopprot.h =================================================================== RCS file: /cvsroot/pgsql/src/include/tcop/tcopprot.h,v retrieving revision 1.93 diff -c -c -r1.93 tcopprot.h *** src/include/tcop/tcopprot.h 10 Mar 2008 12:55:13 -0000 1.93 --- src/include/tcop/tcopprot.h 12 Apr 2008 13:48:33 -0000 *************** *** 60,65 **** --- 60,66 ---- extern void quickdie(SIGNAL_ARGS); extern void authdie(SIGNAL_ARGS); extern void StatementCancelHandler(SIGNAL_ARGS); + extern void BackendTerminateHandler(SIGNAL_ARGS); extern void FloatExceptionHandler(SIGNAL_ARGS); extern void prepare_for_client_read(void); extern void client_read_ended(void); Index: src/include/utils/builtins.h =================================================================== RCS file: /cvsroot/pgsql/src/include/utils/builtins.h,v retrieving revision 1.312 diff -c -c -r1.312 builtins.h *** src/include/utils/builtins.h 4 Apr 2008 18:45:36 -0000 1.312 --- src/include/utils/builtins.h 12 Apr 2008 13:48:33 -0000 *************** *** 416,421 **** --- 416,422 ---- extern Datum current_database(PG_FUNCTION_ARGS); extern Datum current_query(PG_FUNCTION_ARGS); extern Datum pg_cancel_backend(PG_FUNCTION_ARGS); + extern Datum pg_terminate_backend(PG_FUNCTION_ARGS); extern Datum pg_reload_conf(PG_FUNCTION_ARGS); extern Datum pg_tablespace_databases(PG_FUNCTION_ARGS); extern Datum pg_rotate_logfile(PG_FUNCTION_ARGS);
Bruce Momjian wrote: > Bruce Momjian wrote: > > I have an idea for this TODO item: > > > > * Allow administrators to safely terminate individual sessions either > > via an SQL function or SIGTERM > > > > Lock table corruption following SIGTERM of an individual backend > > has been reported in 8.0. A possible cause was fixed in 8.1, but > > it is unknown whether other problems exist. This item mostly > > requires additional testing rather than of writing any new code. > > > > http://archives.postgresql.org/pgsql-hackers/2006-08/msg00174.php > > > > When we get the termination signal, why can't we just set a global > > boolean, do a query cancel, and in the setjmp() code block check the > > global and exit --- at that stage we know we have released all locks and > > can exit cleanly. > > I have implemented this idea with the attached patch. One problem I have with my patch is that SIGTERM is currently used by the postmaster to shut down backends. Now because the postmaster knows that all backend are terminating, it can accept a dirtier shutdown than one where we are terminating just one backend and the rest are going to keep running. The new SIGTERM coding is going to exit a backend only in a place where cancel is checked. I need some thoughts in this area. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Bruce Momjian <bruce@momjian.us> writes: >> When we get the termination signal, why can't we just set a global >> boolean, do a query cancel, and in the setjmp() code block check the >> global and exit --- at that stage we know we have released all locks and >> can exit cleanly. > I have implemented this idea with the attached patch. It was already explained to you why this is a bad idea. regards, tom lane
Tom Lane wrote: > Bruce Momjian <bruce@momjian.us> writes: > >> When we get the termination signal, why can't we just set a global > >> boolean, do a query cancel, and in the setjmp() code block check the > >> global and exit --- at that stage we know we have released all locks and > >> can exit cleanly. > > > I have implemented this idea with the attached patch. > > It was already explained to you why this is a bad idea. Yes, I remember your comments: http://archives.postgresql.org/pgsql-hackers/2008-03/msg00145.php Keep in mind that 99% of the excuse for people to want to use SIGTERM is that the backend isn't responding to SIGINT. If you've fixed things so that SIGTERM cannot get them out of any situation that SIGINT doesn't get them out of, I don't think it's a step forward. ... [shrug...] They can do that now, most of the time. What this is about is dealing with corner cases, and in that respect what your proposal will do is replace soluble problems with insoluble ones. But I suppose I can't stop you if you're insistent. I need more than one person to tell me it is a bad idea because I think it is a good idea. If we tell people SIGTERM is not safe to use then why is making it safe worse. And if it is safe, we can just add a function to enable SIGTERM to the backend without doing the query cancel longjump(). -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Bruce Momjian wrote: > > > When we get the termination signal, why can't we just set a global > > > boolean, do a query cancel, and in the setjmp() code block check the > > > global and exit --- at that stage we know we have released all locks and > > > can exit cleanly. > > > > I have implemented this idea with the attached patch. > > One problem I have with my patch is that SIGTERM is currently used by > the postmaster to shut down backends. Now because the postmaster knows > that all backend are terminating, it can accept a dirtier shutdown than > one where we are terminating just one backend and the rest are going to > keep running. The new SIGTERM coding is going to exit a backend only in > a place where cancel is checked. I have a idea --- to have pg_terminate_backend() set a PGPROC boolean and then send a query cancel signal to the backend --- the backend can then check the boolean and exit if required. I will work on a new version of this patch tomorrow/Monday. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Bruce Momjian wrote: > I have a idea --- to have pg_terminate_backend() set a PGPROC boolean > and then send a query cancel signal to the backend --- the backend can > then check the boolean and exit if required. I will work on a new > version of this patch tomorrow/Monday. That's fine, even if it fails on 0.1% of the cases, but you can't use SIGTERM to implement that because it already has a different meaning. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
oikBruce Momjian wrote: > Bruce Momjian wrote: > > > > When we get the termination signal, why can't we just set a global > > > > boolean, do a query cancel, and in the setjmp() code block check the > > > > global and exit --- at that stage we know we have released all locks and > > > > can exit cleanly. > > > > > > I have implemented this idea with the attached patch. > > > > One problem I have with my patch is that SIGTERM is currently used by > > the postmaster to shut down backends. Now because the postmaster knows > > that all backend are terminating, it can accept a dirtier shutdown than > > one where we are terminating just one backend and the rest are going to > > keep running. The new SIGTERM coding is going to exit a backend only in > > a place where cancel is checked. > > I have a idea --- to have pg_terminate_backend() set a PGPROC boolean > and then send a query cancel signal to the backend --- the backend can > then check the boolean and exit if required. I will work on a new > version of this patch tomorrow/Monday. Updated patch attached. I didn't modify SIGTERM at all but set a PRPROC boolean and piggybacked on SIGINT. I think I got the PGPROC locking right. I had to split apart pg_signal_backend() so I could do the permission and pid checks independent of the signalling, because I pg_terminate_backend() needs to check, then set the PGPROC variable, then send the signal. I also added an administration doc mention about when to use pg_terminate_backend(). -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + Index: doc/src/sgml/func.sgml =================================================================== RCS file: /cvsroot/pgsql/doc/src/sgml/func.sgml,v retrieving revision 1.429 diff -c -c -r1.429 func.sgml *** doc/src/sgml/func.sgml 10 Apr 2008 13:34:33 -0000 1.429 --- doc/src/sgml/func.sgml 14 Apr 2008 17:19:10 -0000 *************** *** 11849,11854 **** --- 11849,11857 ---- <primary>pg_cancel_backend</primary> </indexterm> <indexterm> + <primary>pg_terminate_backend</primary> + </indexterm> + <indexterm> <primary>pg_reload_conf</primary> </indexterm> <indexterm> *************** *** 11885,11890 **** --- 11888,11900 ---- </row> <row> <entry> + <literal><function>pg_terminate_backend</function>(<parameter>pid</parameter> <type>int</>)</literal> + </entry> + <entry><type>boolean</type></entry> + <entry>Terminate a backend</entry> + </row> + <row> + <entry> <literal><function>pg_reload_conf</function>()</literal> </entry> <entry><type>boolean</type></entry> *************** *** 11907,11915 **** </para> <para> ! <function>pg_cancel_backend</> sends a query cancel ! (<systemitem>SIGINT</>) signal to a backend process identified by ! process ID. The process ID of an active backend can be found from the <structfield>procpid</structfield> column in the <structname>pg_stat_activity</structname> view, or by listing the <command>postgres</command> processes on the server with --- 11917,11926 ---- </para> <para> ! <function>pg_cancel_backend</> and <function>pg_terminate_backend</> ! send a query cancel (<systemitem>SIGINT</>) signal to a backend process ! identified by process ID. The ! process ID of an active backend can be found from the <structfield>procpid</structfield> column in the <structname>pg_stat_activity</structname> view, or by listing the <command>postgres</command> processes on the server with Index: doc/src/sgml/runtime.sgml =================================================================== RCS file: /cvsroot/pgsql/doc/src/sgml/runtime.sgml,v retrieving revision 1.411 diff -c -c -r1.411 runtime.sgml *** doc/src/sgml/runtime.sgml 31 Mar 2008 02:43:14 -0000 1.411 --- doc/src/sgml/runtime.sgml 14 Apr 2008 17:19:10 -0000 *************** *** 1372,1377 **** --- 1372,1384 ---- well. </para> </important> + + <para> + To terminate a session while allowing other sessions to continue, use + <function>pg_terminate_backend()</> (<xref + linkend="functions-admin-signal-table">) rather than sending a signal + to the child process. + </para> </sect1> <sect1 id="preventing-server-spoofing"> Index: src/backend/tcop/postgres.c =================================================================== RCS file: /cvsroot/pgsql/src/backend/tcop/postgres.c,v retrieving revision 1.548 diff -c -c -r1.548 postgres.c *** src/backend/tcop/postgres.c 2 Apr 2008 18:31:50 -0000 1.548 --- src/backend/tcop/postgres.c 14 Apr 2008 17:19:10 -0000 *************** *** 2541,2547 **** * waiting for input, however. */ if (ImmediateInterruptOK && InterruptHoldoffCount == 0 && ! CritSectionCount == 0 && !DoingCommandRead) { /* bump holdoff count to make ProcessInterrupts() a no-op */ /* until we are done getting ready for it */ --- 2541,2548 ---- * waiting for input, however. */ if (ImmediateInterruptOK && InterruptHoldoffCount == 0 && ! CritSectionCount == 0 && ! (!DoingCommandRead || MyProc->terminate)) { /* bump holdoff count to make ProcessInterrupts() a no-op */ /* until we are done getting ready for it */ *************** *** 2621,2626 **** --- 2622,2631 ---- ereport(ERROR, (errcode(ERRCODE_QUERY_CANCELED), errmsg("canceling autovacuum task"))); + else if (MyProc->terminate) + ereport(ERROR, + (errcode(ERRCODE_ADMIN_SHUTDOWN), + errmsg("terminating backend due to administrator command"))); else ereport(ERROR, (errcode(ERRCODE_QUERY_CANCELED), *************** *** 3459,3464 **** --- 3464,3472 ---- /* We don't have a transaction command open anymore */ xact_started = false; + if (MyProc->terminate) + die(SIGINT); + /* Now we can allow interrupts again */ RESUME_INTERRUPTS(); } Index: src/backend/utils/adt/misc.c =================================================================== RCS file: /cvsroot/pgsql/src/backend/utils/adt/misc.c,v retrieving revision 1.59 diff -c -c -r1.59 misc.c *** src/backend/utils/adt/misc.c 4 Apr 2008 16:57:21 -0000 1.59 --- src/backend/utils/adt/misc.c 14 Apr 2008 17:19:11 -0000 *************** *** 27,32 **** --- 27,33 ---- #include "postmaster/syslogger.h" #include "storage/fd.h" #include "storage/pmsignal.h" + #include "storage/proc.h" #include "storage/procarray.h" #include "utils/builtins.h" #include "tcop/tcopprot.h" *************** *** 89,95 **** * Functions to send signals to other backends. */ static bool ! pg_signal_backend(int pid, int sig) { if (!superuser()) ereport(ERROR, --- 90,96 ---- * Functions to send signals to other backends. */ static bool ! pg_signal_check(int pid) { if (!superuser()) ereport(ERROR, *************** *** 106,112 **** --- 107,122 ---- (errmsg("PID %d is not a PostgreSQL server process", pid))); return false; } + else + return true; + } + /* + * Functions to send signals to other backends. + */ + static bool + pg_signal_backend(int pid, int sig) + { /* If we have setsid(), signal the backend's whole process group */ #ifdef HAVE_SETSID if (kill(-pid, sig)) *************** *** 125,131 **** Datum pg_cancel_backend(PG_FUNCTION_ARGS) { ! PG_RETURN_BOOL(pg_signal_backend(PG_GETARG_INT32(0), SIGINT)); } Datum --- 135,177 ---- Datum pg_cancel_backend(PG_FUNCTION_ARGS) { ! int pid = PG_GETARG_INT32(0); ! ! if (pg_signal_check(pid)) ! PG_RETURN_BOOL(pg_signal_backend(pid, SIGINT)); ! else ! PG_RETURN_BOOL(false); ! } ! ! /* ! * To cleanly terminate a backend, we set PGPROC(pid)->terminate ! * then send a cancel signal. We get ProcArrayLock only when ! * setting PGPROC->terminate so the function might fail in ! * several places, but that is fine because in those cases the ! * backend is already gone. ! */ ! Datum ! pg_terminate_backend(PG_FUNCTION_ARGS) ! { ! int pid = PG_GETARG_INT32(0); ! volatile PGPROC *term_proc; ! ! /* Is this the super-user, and can we find the PGPROC entry for the pid? */ ! if (pg_signal_check(pid) && (term_proc = BackendPidGetProc(pid)) != NULL) ! { ! LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE); ! /* Recheck now that we have the ProcArray lock. */ ! if (term_proc->pid == pid) ! { ! term_proc->terminate = true; ! LWLockRelease(ProcArrayLock); ! PG_RETURN_BOOL(pg_signal_backend(pid, SIGINT)); ! } ! else ! LWLockRelease(ProcArrayLock); ! } ! ! PG_RETURN_BOOL(false); } Datum *************** *** 169,185 **** PG_RETURN_BOOL(true); } - #ifdef NOT_USED - - /* Disabled in 8.0 due to reliability concerns; FIXME someday */ - Datum - pg_terminate_backend(PG_FUNCTION_ARGS) - { - PG_RETURN_INT32(pg_signal_backend(PG_GETARG_INT32(0), SIGTERM)); - } - #endif - - /* Function to find out which databases make use of a tablespace */ typedef struct --- 215,220 ---- Index: src/include/catalog/pg_proc.h =================================================================== RCS file: /cvsroot/pgsql/src/include/catalog/pg_proc.h,v retrieving revision 1.488 diff -c -c -r1.488 pg_proc.h *** src/include/catalog/pg_proc.h 10 Apr 2008 22:25:25 -0000 1.488 --- src/include/catalog/pg_proc.h 14 Apr 2008 17:19:11 -0000 *************** *** 3157,3162 **** --- 3157,3164 ---- DATA(insert OID = 2171 ( pg_cancel_backend PGNSP PGUID 12 1 0 f f t f v 1 16 "23" _null_ _null_ _null_ pg_cancel_backend- _null_ _null_ )); DESCR("cancel a server process' current query"); + DATA(insert OID = 2096 ( pg_terminate_backend PGNSP PGUID 12 1 0 f f t f v 1 16 "23" _null_ _null_ _null_ pg_terminate_backend- _null_ _null_ )); + DESCR("terminate a server process"); DATA(insert OID = 2172 ( pg_start_backup PGNSP PGUID 12 1 0 f f t f v 1 25 "25" _null_ _null_ _null_ pg_start_backup- _null_ _null_ )); DESCR("prepare for taking an online backup"); DATA(insert OID = 2173 ( pg_stop_backup PGNSP PGUID 12 1 0 f f t f v 0 25 "" _null_ _null_ _null_ pg_stop_backup- _null_ _null_ )); Index: src/include/storage/proc.h =================================================================== RCS file: /cvsroot/pgsql/src/include/storage/proc.h,v retrieving revision 1.104 diff -c -c -r1.104 proc.h *** src/include/storage/proc.h 26 Jan 2008 19:55:08 -0000 1.104 --- src/include/storage/proc.h 14 Apr 2008 17:19:11 -0000 *************** *** 91,96 **** --- 91,98 ---- bool inCommit; /* true if within commit critical section */ + bool terminate; /* admin requested termination */ + uint8 vacuumFlags; /* vacuum-related flags, see above */ /* Info about LWLock the process is currently waiting for, if any. */ Index: src/include/utils/builtins.h =================================================================== RCS file: /cvsroot/pgsql/src/include/utils/builtins.h,v retrieving revision 1.312 diff -c -c -r1.312 builtins.h *** src/include/utils/builtins.h 4 Apr 2008 18:45:36 -0000 1.312 --- src/include/utils/builtins.h 14 Apr 2008 17:19:11 -0000 *************** *** 416,421 **** --- 416,422 ---- extern Datum current_database(PG_FUNCTION_ARGS); extern Datum current_query(PG_FUNCTION_ARGS); extern Datum pg_cancel_backend(PG_FUNCTION_ARGS); + extern Datum pg_terminate_backend(PG_FUNCTION_ARGS); extern Datum pg_reload_conf(PG_FUNCTION_ARGS); extern Datum pg_tablespace_databases(PG_FUNCTION_ARGS); extern Datum pg_rotate_logfile(PG_FUNCTION_ARGS);
Applied. --------------------------------------------------------------------------- Bruce Momjian wrote: > oikBruce Momjian wrote: > > Bruce Momjian wrote: > > > > > When we get the termination signal, why can't we just set a global > > > > > boolean, do a query cancel, and in the setjmp() code block check the > > > > > global and exit --- at that stage we know we have released all locks and > > > > > can exit cleanly. > > > > > > > > I have implemented this idea with the attached patch. > > > > > > One problem I have with my patch is that SIGTERM is currently used by > > > the postmaster to shut down backends. Now because the postmaster knows > > > that all backend are terminating, it can accept a dirtier shutdown than > > > one where we are terminating just one backend and the rest are going to > > > keep running. The new SIGTERM coding is going to exit a backend only in > > > a place where cancel is checked. > > > > I have a idea --- to have pg_terminate_backend() set a PGPROC boolean > > and then send a query cancel signal to the backend --- the backend can > > then check the boolean and exit if required. I will work on a new > > version of this patch tomorrow/Monday. > > Updated patch attached. I didn't modify SIGTERM at all but set a PRPROC > boolean and piggybacked on SIGINT. I think I got the PGPROC locking > right. I had to split apart pg_signal_backend() so I could do the > permission and pid checks independent of the signalling, because I > pg_terminate_backend() needs to check, then set the PGPROC variable, > then send the signal. > > I also added an administration doc mention about when to use > pg_terminate_backend(). > > -- > Bruce Momjian <bruce@momjian.us> http://momjian.us > EnterpriseDB http://enterprisedb.com > > + If your life is a hard drive, Christ can be your backup. + > > -- > Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-patches -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. +