Thread: How to know killed by pg_terminate_backend

How to know killed by pg_terminate_backend

From
Tatsuo Ishii
Date:
Hi,

If a backend killed by pg_terminate_backend(), the backend returns
57P01 which is identical to the one when it's killed by postmaster.

Problem is, pgpool-II needs to trigger failover if postmaster goes
down because apparently pgpool-II cannot use the PostgreSQL server
anymore.

On the otherhand, pg_terminate_backend() just terminates a backend. So
triggering failover is overkill.

Maybe we could make PostgreSQL a little bit smarter so that it returns
a different code than 57P01 when killed by pg_terminate_backend().

Comments?
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp


Re: How to know killed by pg_terminate_backend

From
Heikki Linnakangas
Date:
Tatsuo Ishii wrote:
> If a backend killed by pg_terminate_backend(), the backend returns
> 57P01 which is identical to the one when it's killed by postmaster.
> 
> Problem is, pgpool-II needs to trigger failover if postmaster goes
> down because apparently pgpool-II cannot use the PostgreSQL server
> anymore.
> 
> On the otherhand, pg_terminate_backend() just terminates a backend. So
> triggering failover is overkill.
> 
> Maybe we could make PostgreSQL a little bit smarter so that it returns
> a different code than 57P01 when killed by pg_terminate_backend().

Seems reasonable. Does the victim backend currently know why it has been
killed?

--  Heikki Linnakangas EnterpriseDB   http://www.enterprisedb.com


Re: How to know killed by pg_terminate_backend

From
Tatsuo Ishii
Date:
> > Maybe we could make PostgreSQL a little bit smarter so that it returns
> > a different code than 57P01 when killed by pg_terminate_backend().
> 
> Seems reasonable. Does the victim backend currently know why it has been
> killed?

I don't think so.

One idea is postmaster sets a flag in the shared memory area
indicating it rceived SIGTERM before forwarding the signal to
backends.

Backend check the flag and if it's not set, it knows that the signal
has been sent by pg_terminate_backend(), not postmaster.

What about new error code:

#define ERRCODE_BACKEND_STOP_REQUEST                MAKE_SQLSTATE('5','7', 'P','0','4')
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp


Re: How to know killed by pg_terminate_backend

From
Robert Haas
Date:
On Thu, May 13, 2010 at 8:20 PM, Tatsuo Ishii <ishii@postgresql.org> wrote:
>> > Maybe we could make PostgreSQL a little bit smarter so that it returns
>> > a different code than 57P01 when killed by pg_terminate_backend().
>>
>> Seems reasonable. Does the victim backend currently know why it has been
>> killed?
>
> I don't think so.
>
> One idea is postmaster sets a flag in the shared memory area
> indicating it rceived SIGTERM before forwarding the signal to
> backends.
>
> Backend check the flag and if it's not set, it knows that the signal
> has been sent by pg_terminate_backend(), not postmaster.

Or it could also be sent by some other user process, like the user
running "kill" from the shell.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


Re: How to know killed by pg_terminate_backend

From
Tatsuo Ishii
Date:
> >> Seems reasonable. Does the victim backend currently know why it has been
> >> killed?
> >
> > I don't think so.
> >
> > One idea is postmaster sets a flag in the shared memory area
> > indicating it rceived SIGTERM before forwarding the signal to
> > backends.
> >
> > Backend check the flag and if it's not set, it knows that the signal
> > has been sent by pg_terminate_backend(), not postmaster.
> 
> Or it could also be sent by some other user process, like the user
> running "kill" from the shell.

No problem (at least for pgpool-II).

If the flag is not set, postgres returns the same code as the one
killed by pg_terminate_backend(). The point is, backend is killed by
postmaster or not. Because if backend was killed by postmaster,
pgpool-II should not expect the PostgreSQL server is usable since
postmaster decided to shutdown.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp


Re: How to know killed by pg_terminate_backend

From
Tatsuo Ishii
Date:
>> >> Seems reasonable. Does the victim backend currently know why it has been
>> >> killed?
>> >
>> > I don't think so.
>> >
>> > One idea is postmaster sets a flag in the shared memory area
>> > indicating it rceived SIGTERM before forwarding the signal to
>> > backends.
>> >
>> > Backend check the flag and if it's not set, it knows that the signal
>> > has been sent by pg_terminate_backend(), not postmaster.
>> 
>> Or it could also be sent by some other user process, like the user
>> running "kill" from the shell.
> 
> No problem (at least for pgpool-II).
> 
> If the flag is not set, postgres returns the same code as the one
> killed by pg_terminate_backend(). The point is, backend is killed by
> postmaster or not. Because if backend was killed by postmaster,
> pgpool-II should not expect the PostgreSQL server is usable since
> postmaster decided to shutdown.

Here is the patch to implement the feature.

1) pg_terminate_backend() sends SIGUSR1 signal rather than SIGTERM to  the target backend.
2) The infrastructure used for message passing is  storage/ipc/procsignal.c The new message type for ProcSignalReason
is"PROCSIG_TERMNINATE_BACKEND_INTERRUPT"3) I assign new error code 57P04 which is returned from the backend    killed
bypg_terminate_backend().
 

#define ERRCODE_TERMINATE_BACKEND            MAKE_SQLSTATE('5','7', 'P','0','4')

Comments are welcome.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp
*** a/src/backend/storage/ipc/procsignal.c
--- b/src/backend/storage/ipc/procsignal.c
***************
*** 279,284 **** procsignal_sigusr1_handler(SIGNAL_ARGS)
--- 279,287 ----     if (CheckProcSignal(PROCSIG_RECOVERY_CONFLICT_BUFFERPIN))
RecoveryConflictInterrupt(PROCSIG_RECOVERY_CONFLICT_BUFFERPIN);
 
+     if (CheckProcSignal(PROCSIG_TERMNINATE_BACKEND_INTERRUPT))
+         HandleTerminateBackendInterrupt();
+      latch_sigusr1_handler();      errno = save_errno;
*** a/src/backend/tcop/postgres.c
--- b/src/backend/tcop/postgres.c
***************
*** 184,189 **** static bool RecoveryConflictPending = false;
--- 184,195 ---- static bool RecoveryConflictRetryable = true; static ProcSignalReason RecoveryConflictReason; 
+ /*
+  * True if backend is being killed by pg_terminate_backend().
+  * Set by HandleTerminateBackendInterrupt() upon received SIGUSR1.
+  */
+ static bool TerminateBackendRequest = false;
+  /* ----------------------------------------------------------------  *        decls for routines only used in this
file * ----------------------------------------------------------------
 
***************
*** 2875,2880 **** RecoveryConflictInterrupt(ProcSignalReason reason)
--- 2881,2924 ---- }  /*
+  * HandleTerminateBackendInterrupt: out-of-line portion of terminate backend
+  * handling following receipt of SIGUSR1. Designed to be similar to die().
+  * Called only by a normal user backend.
+  */
+ void
+ HandleTerminateBackendInterrupt(void)
+ {
+     int            save_errno = errno;
+ 
+     /* Don't joggle the elbow of proc_exit */
+     if (!proc_exit_inprogress)
+     {
+         InterruptPending = true;
+         ProcDiePending = true;
+         TerminateBackendRequest = true;
+ 
+         /*
+          * If it's safe to interrupt, and we're waiting for input or a lock,
+          * service the interrupt immediately
+          */
+         if (ImmediateInterruptOK && InterruptHoldoffCount == 0 &&
+             CritSectionCount == 0)
+         {
+             /* bump holdoff count to make ProcessInterrupts() a no-op */
+             /* until we are done getting ready for it */
+             InterruptHoldoffCount++;
+             LockWaitCancel();    /* prevent CheckDeadLock from running */
+             DisableNotifyInterrupt();
+             DisableCatchupInterrupt();
+             InterruptHoldoffCount--;
+             ProcessInterrupts();
+         }
+     }
+ 
+     errno = save_errno;
+ }
+ 
+ /*  * ProcessInterrupts: out-of-line portion of CHECK_FOR_INTERRUPTS() macro  *  * If an interrupt condition is
pending,and it's safe to service it,
 
***************
*** 2912,2917 **** ProcessInterrupts(void)
--- 2956,2966 ----                     (errcode(ERRCODE_ADMIN_SHUTDOWN),               errmsg("terminating connection
dueto conflict with recovery"),                      errdetail_recovery_conflict()));
 
+         else if (TerminateBackendRequest)
+             ereport(FATAL,
+                     (errcode(ERRCODE_TERMINATE_BACKEND),
+                      errmsg("terminating connection due to pg_terminate_backend")));
+          else             ereport(FATAL,                     (errcode(ERRCODE_ADMIN_SHUTDOWN),
*** a/src/backend/utils/adt/misc.c
--- b/src/backend/utils/adt/misc.c
***************
*** 114,120 **** pg_cancel_backend(PG_FUNCTION_ARGS) Datum pg_terminate_backend(PG_FUNCTION_ARGS) {
!     PG_RETURN_BOOL(pg_signal_backend(PG_GETARG_INT32(0), SIGTERM)); }  Datum
--- 114,122 ---- Datum pg_terminate_backend(PG_FUNCTION_ARGS) {
!     PG_RETURN_BOOL(
!         SendProcSignal(PG_GETARG_INT32(0), PROCSIG_TERMNINATE_BACKEND_INTERRUPT,
!                        InvalidBackendId) == 0); }  Datum
*** a/src/include/storage/procsignal.h
--- b/src/include/storage/procsignal.h
***************
*** 40,45 **** typedef enum
--- 40,47 ----     PROCSIG_RECOVERY_CONFLICT_BUFFERPIN,     PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK, 
+     PROCSIG_TERMNINATE_BACKEND_INTERRUPT,    /* terminate request from pg_terminate_backend() */
+      NUM_PROCSIGNALS                /* Must be last! */ } ProcSignalReason; 
***************
*** 55,58 **** extern int SendProcSignal(pid_t pid, ProcSignalReason reason,
--- 57,62 ----  extern void procsignal_sigusr1_handler(SIGNAL_ARGS); 
+ extern void HandleTerminateBackendInterrupt(void);
+  #endif   /* PROCSIGNAL_H */
*** a/src/include/utils/errcodes.h
--- b/src/include/utils/errcodes.h
***************
*** 332,337 ****
--- 332,338 ---- #define ERRCODE_ADMIN_SHUTDOWN                MAKE_SQLSTATE('5','7', 'P','0','1') #define
ERRCODE_CRASH_SHUTDOWN               MAKE_SQLSTATE('5','7', 'P','0','2') #define ERRCODE_CANNOT_CONNECT_NOW
MAKE_SQLSTATE('5','7','P','0','3')
 
+ #define ERRCODE_TERMINATE_BACKEND            MAKE_SQLSTATE('5','7', 'P','0','4')  /* Class 58 - System Error (class
borrowedfrom DB2) */ /* (we define this as errors external to PostgreSQL itself) */ 

Re: How to know killed by pg_terminate_backend

From
Tatsuo Ishii
Date:
>> >> Seems reasonable. Does the victim backend currently know why it has been
>> >> killed?
>> >
>> > I don't think so.
>> >
>> > One idea is postmaster sets a flag in the shared memory area
>> > indicating it rceived SIGTERM before forwarding the signal to
>> > backends.
>> >
>> > Backend check the flag and if it's not set, it knows that the signal
>> > has been sent by pg_terminate_backend(), not postmaster.
>> 
>> Or it could also be sent by some other user process, like the user
>> running "kill" from the shell.
> 
> No problem (at least for pgpool-II).
> 
> If the flag is not set, postgres returns the same code as the one
> killed by pg_terminate_backend(). The point is, backend is killed by
> postmaster or not. Because if backend was killed by postmaster,
> pgpool-II should not expect the PostgreSQL server is usable since
> postmaster decided to shutdown.

Here is the patch to implement the feature.

1) pg_terminate_backend() sends SIGUSR1 signal rather than SIGTERM to  the target backend.
2) The infrastructure used for message passing is  storage/ipc/procsignal.c The new message type for ProcSignalReason
is"PROCSIG_TERMNINATE_BACKEND_INTERRUPT"
 
3) I assign new error code 57P04 which is returned from the backend  killed by pg_terminate_backend().

#define ERRCODE_TERMINATE_BACKEND            MAKE_SQLSTATE('5','7', 'P','0','4')

Comments are welcome.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp
*** a/src/backend/storage/ipc/procsignal.c
--- b/src/backend/storage/ipc/procsignal.c
***************
*** 279,284 **** procsignal_sigusr1_handler(SIGNAL_ARGS)
--- 279,287 ----     if (CheckProcSignal(PROCSIG_RECOVERY_CONFLICT_BUFFERPIN))
RecoveryConflictInterrupt(PROCSIG_RECOVERY_CONFLICT_BUFFERPIN);
 
+     if (CheckProcSignal(PROCSIG_TERMNINATE_BACKEND_INTERRUPT))
+         HandleTerminateBackendInterrupt();
+      latch_sigusr1_handler();      errno = save_errno;
*** a/src/backend/tcop/postgres.c
--- b/src/backend/tcop/postgres.c
***************
*** 184,189 **** static bool RecoveryConflictPending = false;
--- 184,195 ---- static bool RecoveryConflictRetryable = true; static ProcSignalReason RecoveryConflictReason; 
+ /*
+  * True if backend is being killed by pg_terminate_backend().
+  * Set by HandleTerminateBackendInterrupt() upon received SIGUSR1.
+  */
+ static bool TerminateBackendRequest = false;
+  /* ----------------------------------------------------------------  *        decls for routines only used in this
file * ----------------------------------------------------------------
 
***************
*** 2875,2880 **** RecoveryConflictInterrupt(ProcSignalReason reason)
--- 2881,2924 ---- }  /*
+  * HandleTerminateBackendInterrupt: out-of-line portion of terminate backend
+  * handling following receipt of SIGUSR1. Designed to be similar to die().
+  * Called only by a normal user backend.
+  */
+ void
+ HandleTerminateBackendInterrupt(void)
+ {
+     int            save_errno = errno;
+ 
+     /* Don't joggle the elbow of proc_exit */
+     if (!proc_exit_inprogress)
+     {
+         InterruptPending = true;
+         ProcDiePending = true;
+         TerminateBackendRequest = true;
+ 
+         /*
+          * If it's safe to interrupt, and we're waiting for input or a lock,
+          * service the interrupt immediately
+          */
+         if (ImmediateInterruptOK && InterruptHoldoffCount == 0 &&
+             CritSectionCount == 0)
+         {
+             /* bump holdoff count to make ProcessInterrupts() a no-op */
+             /* until we are done getting ready for it */
+             InterruptHoldoffCount++;
+             LockWaitCancel();    /* prevent CheckDeadLock from running */
+             DisableNotifyInterrupt();
+             DisableCatchupInterrupt();
+             InterruptHoldoffCount--;
+             ProcessInterrupts();
+         }
+     }
+ 
+     errno = save_errno;
+ }
+ 
+ /*  * ProcessInterrupts: out-of-line portion of CHECK_FOR_INTERRUPTS() macro  *  * If an interrupt condition is
pending,and it's safe to service it,
 
***************
*** 2912,2917 **** ProcessInterrupts(void)
--- 2956,2966 ----                     (errcode(ERRCODE_ADMIN_SHUTDOWN),               errmsg("terminating connection
dueto conflict with recovery"),                      errdetail_recovery_conflict()));
 
+         else if (TerminateBackendRequest)
+             ereport(FATAL,
+                     (errcode(ERRCODE_TERMINATE_BACKEND),
+                      errmsg("terminating connection due to pg_terminate_backend")));
+          else             ereport(FATAL,                     (errcode(ERRCODE_ADMIN_SHUTDOWN),
*** a/src/backend/utils/adt/misc.c
--- b/src/backend/utils/adt/misc.c
***************
*** 114,120 **** pg_cancel_backend(PG_FUNCTION_ARGS) Datum pg_terminate_backend(PG_FUNCTION_ARGS) {
!     PG_RETURN_BOOL(pg_signal_backend(PG_GETARG_INT32(0), SIGTERM)); }  Datum
--- 114,122 ---- Datum pg_terminate_backend(PG_FUNCTION_ARGS) {
!     PG_RETURN_BOOL(
!         SendProcSignal(PG_GETARG_INT32(0), PROCSIG_TERMNINATE_BACKEND_INTERRUPT,
!                        InvalidBackendId) == 0); }  Datum
*** a/src/include/storage/procsignal.h
--- b/src/include/storage/procsignal.h
***************
*** 40,45 **** typedef enum
--- 40,47 ----     PROCSIG_RECOVERY_CONFLICT_BUFFERPIN,     PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK, 
+     PROCSIG_TERMNINATE_BACKEND_INTERRUPT,    /* terminate request from pg_terminate_backend() */
+      NUM_PROCSIGNALS                /* Must be last! */ } ProcSignalReason; 
***************
*** 55,58 **** extern int SendProcSignal(pid_t pid, ProcSignalReason reason,
--- 57,62 ----  extern void procsignal_sigusr1_handler(SIGNAL_ARGS); 
+ extern void HandleTerminateBackendInterrupt(void);
+  #endif   /* PROCSIGNAL_H */
*** a/src/include/utils/errcodes.h
--- b/src/include/utils/errcodes.h
***************
*** 332,337 ****
--- 332,338 ---- #define ERRCODE_ADMIN_SHUTDOWN                MAKE_SQLSTATE('5','7', 'P','0','1') #define
ERRCODE_CRASH_SHUTDOWN               MAKE_SQLSTATE('5','7', 'P','0','2') #define ERRCODE_CANNOT_CONNECT_NOW
MAKE_SQLSTATE('5','7','P','0','3')
 
+ #define ERRCODE_TERMINATE_BACKEND            MAKE_SQLSTATE('5','7', 'P','0','4')  /* Class 58 - System Error (class
borrowedfrom DB2) */ /* (we define this as errors external to PostgreSQL itself) */ 

Re: How to know killed by pg_terminate_backend

From
Tom Lane
Date:
Tatsuo Ishii <ishii@postgresql.org> writes:
> Comments are welcome.

This is a bad idea.  It makes an already-poorly-tested code path
significantly more fragile, in return for nothing of value.
        regards, tom lane


Re: How to know killed by pg_terminate_backend

From
Tatsuo Ishii
Date:
> Tatsuo Ishii <ishii@postgresql.org> writes:
>> Comments are welcome.
> 
> This is a bad idea.  It makes an already-poorly-tested code path
> significantly more fragile, in return for nothing of value.

Are you saying that procsignal.c is the already-poorly-tested one? If
so, why?

As for "value", I have already explained why we need this in the
upthread.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp


Re: How to know killed by pg_terminate_backend

From
Tatsuo Ishii
Date:
> Here is the patch to implement the feature.
> 
> 1) pg_terminate_backend() sends SIGUSR1 signal rather than SIGTERM to
>    the target backend.
> 2) The infrastructure used for message passing is
>    storage/ipc/procsignal.c The new message type for ProcSignalReason
>    is "PROCSIG_TERMNINATE_BACKEND_INTERRUPT"
>  3) I assign new error code 57P04 which is returned from the backend
>      killed by pg_terminate_backend().
> 
> #define ERRCODE_TERMINATE_BACKEND            MAKE_SQLSTATE('5','7', 'P','0','4')
> 
> Comments are welcome.

Anyone has better idea? Tom dislikes my patch but I don't know how to
deal with it.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp


Re: How to know killed by pg_terminate_backend

From
Itagaki Takahiro
Date:
On Fri, Jan 21, 2011 at 13:56, Tatsuo Ishii <ishii@postgresql.org> wrote:
>> Here is the patch to implement the feature.
>>
>> 1) pg_terminate_backend() sends SIGUSR1 signal rather than SIGTERM to
>>    the target backend.
>> 2) The infrastructure used for message passing is
>>    storage/ipc/procsignal.c The new message type for ProcSignalReason
>>    is "PROCSIG_TERMNINATE_BACKEND_INTERRUPT"
>>  3) I assign new error code 57P04 which is returned from the backend
>>       killed by pg_terminate_backend().
>>
>> #define ERRCODE_TERMINATE_BACKEND                     MAKE_SQLSTATE('5','7', 'P','0','4')
>
> Anyone has better idea? Tom dislikes my patch but I don't know how to
> deal with it.

There was another design in the past discussion:
>> One idea is postmaster sets a flag in the shared memory area
>> indicating it rceived SIGTERM before forwarding the signal to
>> backends.

Is it enough for your purpose and do we think it is more robust way?

--
Itagaki Takahiro


Re: How to know killed by pg_terminate_backend

From
Tom Lane
Date:
Itagaki Takahiro <itagaki.takahiro@gmail.com> writes:
> On Fri, Jan 21, 2011 at 13:56, Tatsuo Ishii <ishii@postgresql.org> wrote:
>> Anyone has better idea? Tom dislikes my patch but I don't know how to
>> deal with it.

> There was another design in the past discussion:
> One idea is postmaster sets a flag in the shared memory area
> indicating it rceived SIGTERM before forwarding the signal to
> backends.

> Is it enough for your purpose and do we think it is more robust way?

To put this as briefly as possible: I don't want to add even one line of
code to distinguish pg_terminate_backend from database-wide shutdown.
That function should be a last-ditch tool, not something used on a daily
basis.  So I disagree with the premise as much as with any particular
implementation.
        regards, tom lane


Re: How to know killed by pg_terminate_backend

From
Robert Haas
Date:
On Fri, Jan 21, 2011 at 10:35 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Itagaki Takahiro <itagaki.takahiro@gmail.com> writes:
>> On Fri, Jan 21, 2011 at 13:56, Tatsuo Ishii <ishii@postgresql.org> wrote:
>>> Anyone has better idea? Tom dislikes my patch but I don't know how to
>>> deal with it.
>
>> There was another design in the past discussion:
>> One idea is postmaster sets a flag in the shared memory area
>> indicating it rceived SIGTERM before forwarding the signal to
>> backends.
>
>> Is it enough for your purpose and do we think it is more robust way?
>
> To put this as briefly as possible: I don't want to add even one line of
> code to distinguish pg_terminate_backend from database-wide shutdown.
> That function should be a last-ditch tool, not something used on a daily
> basis.  So I disagree with the premise as much as with any particular
> implementation.

Well, that seems awfully unfriendly.

Frequency of use is beside the point - people are trying to write
client applications - like pgpool-II - that understand the behavior of
PG.  If we send the same error code in two different situations with
different behaviors, such applications have to do so silly workarounds
to figure out what really happened.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company