Thread: Multiplexing SUGUSR1

Multiplexing SUGUSR1

From
Heikki Linnakangas
Date:
I've been looking at the signal handling part of the synchronous
replication patch. It looks OK, but one thing makes me worry.

To set or clear the flag from PGPROC, to send or handle a signal, we
have to acquire ProcArrayLock. Is that safe to do in a signal handler?
And is the performance impact of that acceptable?


Another observation is that the patch introduces a new function called
SendProcSignal. Nothing wrong with that, except that there's an existing
function called ProcSendSignal, just above SendProcSignal, so there's
some potential for confusion. The old ProcSendSignal function uses the
per-backend semaphore to wake up a backend. It's only used to wait for
the cleanup lock in bufmgr.c. I'm tempted to remove that altogether, and
use the new signal multiplexing for that too, but OTOH if it works,
maybe I shouldn't touch it.

Attached is a patch with some minor changes I've made. Mostly cosmetic,
but I did modify the sinval code so that ProcState has PGPROC pointer
instead of backend pid, so that we don't need to search the ProcArray to
find the PGPROC struct of the backend we're signaling.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com
*** a/src/backend/access/transam/twophase.c
--- b/src/backend/access/transam/twophase.c
***************
*** 287,292 **** MarkAsPreparing(TransactionId xid, const char *gid,
--- 287,293 ----
      gxact->proc.databaseId = databaseid;
      gxact->proc.roleId = owner;
      gxact->proc.inCommit = false;
+     gxact->proc.signalFlags = 0;
      gxact->proc.vacuumFlags = 0;
      gxact->proc.lwWaiting = false;
      gxact->proc.lwExclusive = false;
*** a/src/backend/commands/async.c
--- b/src/backend/commands/async.c
***************
*** 915,923 **** EnableNotifyInterrupt(void)
   *        a frontend command.  Signal handler execution of inbound notifies
   *        is disabled until the next EnableNotifyInterrupt call.
   *
!  *        The SIGUSR1 signal handler also needs to call this, so as to
!  *        prevent conflicts if one signal interrupts the other.  So we
!  *        must return the previous state of the flag.
   */
  bool
  DisableNotifyInterrupt(void)
--- 915,924 ----
   *        a frontend command.  Signal handler execution of inbound notifies
   *        is disabled until the next EnableNotifyInterrupt call.
   *
!  *        This also needs to be called when SIGUSR1 with
!  *        PROCSIG_CATCHUP_INTERRUPT is received, so as to prevent conflicts
!  *        if one signal interrupts the other.  So we must return the previous
!  *        state of the flag.
   */
  bool
  DisableNotifyInterrupt(void)
***************
*** 954,960 **** ProcessIncomingNotify(void)
                  nulls[Natts_pg_listener];
      bool        catchup_enabled;

!     /* Must prevent SIGUSR1 interrupt while I am running */
      catchup_enabled = DisableCatchupInterrupt();

      if (Trace_notify)
--- 955,961 ----
                  nulls[Natts_pg_listener];
      bool        catchup_enabled;

!     /* Must prevent catchup interrupt while I am running */
      catchup_enabled = DisableCatchupInterrupt();

      if (Trace_notify)
*** a/src/backend/postmaster/autovacuum.c
--- b/src/backend/postmaster/autovacuum.c
***************
*** 1477,1483 **** AutoVacWorkerMain(int argc, char *argv[])
      pqsignal(SIGALRM, handle_sig_alarm);

      pqsignal(SIGPIPE, SIG_IGN);
!     pqsignal(SIGUSR1, CatchupInterruptHandler);
      /* We don't listen for async notifies */
      pqsignal(SIGUSR2, SIG_IGN);
      pqsignal(SIGFPE, FloatExceptionHandler);
--- 1477,1483 ----
      pqsignal(SIGALRM, handle_sig_alarm);

      pqsignal(SIGPIPE, SIG_IGN);
!     pqsignal(SIGUSR1, proc_sigusr1_handler);
      /* We don't listen for async notifies */
      pqsignal(SIGUSR2, SIG_IGN);
      pqsignal(SIGFPE, FloatExceptionHandler);
*** a/src/backend/storage/ipc/sinval.c
--- b/src/backend/storage/ipc/sinval.c
***************
*** 27,33 ****
   * need a way to give an idle backend a swift kick in the rear and make
   * it catch up before the sinval queue overflows and forces it to go
   * through a cache reset exercise.    This is done by sending SIGUSR1
!  * to any backend that gets too far behind.
   *
   * State for catchup events consists of two flags: one saying whether
   * the signal handler is currently allowed to call ProcessCatchupEvent
--- 27,34 ----
   * need a way to give an idle backend a swift kick in the rear and make
   * it catch up before the sinval queue overflows and forces it to go
   * through a cache reset exercise.    This is done by sending SIGUSR1
!  * with PROCSIG_CATCHUP_INTERRUPT to any backend that gets too far
!  * behind.
   *
   * State for catchup events consists of two flags: one saying whether
   * the signal handler is currently allowed to call ProcessCatchupEvent
***************
*** 144,152 **** ReceiveSharedInvalidMessages(


  /*
!  * CatchupInterruptHandler
   *
!  * This is the signal handler for SIGUSR1.
   *
   * If we are idle (catchupInterruptEnabled is set), we can safely
   * invoke ProcessCatchupEvent directly.  Otherwise, just set a flag
--- 145,154 ----


  /*
!  * HandleCatchupInterrupt
   *
!  * This is called when SIGUSR1 with PROCSIG_CATCHUP_INTERRUPT is
!  * received.
   *
   * If we are idle (catchupInterruptEnabled is set), we can safely
   * invoke ProcessCatchupEvent directly.  Otherwise, just set a flag
***************
*** 156,168 **** ReceiveSharedInvalidMessages(
   * since there's no longer any reason to do anything.)
   */
  void
! CatchupInterruptHandler(SIGNAL_ARGS)
  {
-     int            save_errno = errno;
-
      /*
!      * Note: this is a SIGNAL HANDLER.    You must be very wary what you do
!      * here.
       */

      /* Don't joggle the elbow of proc_exit */
--- 158,168 ----
   * since there's no longer any reason to do anything.)
   */
  void
! HandleCatchupInterrupt(void)
  {
      /*
!      * Note: this is called by a SIGNAL HANDLER.
!      * You must be very wary what you do here.
       */

      /* Don't joggle the elbow of proc_exit */
***************
*** 216,223 **** CatchupInterruptHandler(SIGNAL_ARGS)
           */
          catchupInterruptOccurred = 1;
      }
-
-     errno = save_errno;
  }

  /*
--- 216,221 ----
***************
*** 289,295 **** DisableCatchupInterrupt(void)
  /*
   * ProcessCatchupEvent
   *
!  * Respond to a catchup event (SIGUSR1) from another backend.
   *
   * This is called either directly from the SIGUSR1 signal handler,
   * or the next time control reaches the outer idle loop (assuming
--- 287,294 ----
  /*
   * ProcessCatchupEvent
   *
!  * Respond to a catchup event (SIGUSR1 with PROCSIG_CATCHUP_INTERRUPT)
!  * from another backend.
   *
   * This is called either directly from the SIGUSR1 signal handler,
   * or the next time control reaches the outer idle loop (assuming
*** a/src/backend/storage/ipc/sinvaladt.c
--- b/src/backend/storage/ipc/sinvaladt.c
***************
*** 21,26 ****
--- 21,27 ----
  #include "storage/backendid.h"
  #include "storage/ipc.h"
  #include "storage/proc.h"
+ #include "storage/procarray.h"
  #include "storage/shmem.h"
  #include "storage/sinvaladt.h"
  #include "storage/spin.h"
***************
*** 136,144 ****
  /* Per-backend state in shared invalidation structure */
  typedef struct ProcState
  {
!     /* procPid is zero in an inactive ProcState array entry. */
!     pid_t        procPid;        /* PID of backend, for signaling */
!     /* nextMsgNum is meaningless if procPid == 0 or resetState is true. */
      int            nextMsgNum;        /* next message number to read */
      bool        resetState;        /* backend needs to reset its state */
      bool        signaled;        /* backend has been sent catchup signal */
--- 137,145 ----
  /* Per-backend state in shared invalidation structure */
  typedef struct ProcState
  {
!     /* proc is NULL in an inactive ProcState array entry. */
!     PGPROC       *proc;            /* PGPROC entry of backend, for signaling */
!     /* nextMsgNum is meaningless if proc == NULL or resetState is true. */
      int            nextMsgNum;        /* next message number to read */
      bool        resetState;        /* backend needs to reset its state */
      bool        signaled;        /* backend has been sent catchup signal */
***************
*** 235,241 **** CreateSharedInvalidationState(void)
      /* Mark all backends inactive, and initialize nextLXID */
      for (i = 0; i < shmInvalBuffer->maxBackends; i++)
      {
!         shmInvalBuffer->procState[i].procPid = 0;            /* inactive */
          shmInvalBuffer->procState[i].nextMsgNum = 0;        /* meaningless */
          shmInvalBuffer->procState[i].resetState = false;
          shmInvalBuffer->procState[i].signaled = false;
--- 236,242 ----
      /* Mark all backends inactive, and initialize nextLXID */
      for (i = 0; i < shmInvalBuffer->maxBackends; i++)
      {
!         shmInvalBuffer->procState[i].proc = NULL;            /* inactive */
          shmInvalBuffer->procState[i].nextMsgNum = 0;        /* meaningless */
          shmInvalBuffer->procState[i].resetState = false;
          shmInvalBuffer->procState[i].signaled = false;
***************
*** 266,272 **** SharedInvalBackendInit(void)
      /* Look for a free entry in the procState array */
      for (index = 0; index < segP->lastBackend; index++)
      {
!         if (segP->procState[index].procPid == 0)        /* inactive slot? */
          {
              stateP = &segP->procState[index];
              break;
--- 267,273 ----
      /* Look for a free entry in the procState array */
      for (index = 0; index < segP->lastBackend; index++)
      {
!         if (segP->procState[index].proc == NULL)        /* inactive slot? */
          {
              stateP = &segP->procState[index];
              break;
***************
*** 278,284 **** SharedInvalBackendInit(void)
          if (segP->lastBackend < segP->maxBackends)
          {
              stateP = &segP->procState[segP->lastBackend];
!             Assert(stateP->procPid == 0);
              segP->lastBackend++;
          }
          else
--- 279,285 ----
          if (segP->lastBackend < segP->maxBackends)
          {
              stateP = &segP->procState[segP->lastBackend];
!             Assert(stateP->proc == NULL);
              segP->lastBackend++;
          }
          else
***************
*** 303,309 **** SharedInvalBackendInit(void)
      nextLocalTransactionId = stateP->nextLXID;

      /* mark myself active, with all extant messages already read */
!     stateP->procPid = MyProcPid;
      stateP->nextMsgNum = segP->maxMsgNum;
      stateP->resetState = false;
      stateP->signaled = false;
--- 304,310 ----
      nextLocalTransactionId = stateP->nextLXID;

      /* mark myself active, with all extant messages already read */
!     stateP->proc = MyProc;
      stateP->nextMsgNum = segP->maxMsgNum;
      stateP->resetState = false;
      stateP->signaled = false;
***************
*** 341,347 **** CleanupInvalidationState(int status, Datum arg)
      stateP->nextLXID = nextLocalTransactionId;

      /* Mark myself inactive */
!     stateP->procPid = 0;
      stateP->nextMsgNum = 0;
      stateP->resetState = false;
      stateP->signaled = false;
--- 342,348 ----
      stateP->nextLXID = nextLocalTransactionId;

      /* Mark myself inactive */
!     stateP->proc = NULL;
      stateP->nextMsgNum = 0;
      stateP->resetState = false;
      stateP->signaled = false;
***************
*** 349,355 **** CleanupInvalidationState(int status, Datum arg)
      /* Recompute index of last active backend */
      for (i = segP->lastBackend; i > 0; i--)
      {
!         if (segP->procState[i - 1].procPid != 0)
              break;
      }
      segP->lastBackend = i;
--- 350,356 ----
      /* Recompute index of last active backend */
      for (i = segP->lastBackend; i > 0; i--)
      {
!         if (segP->procState[i - 1].proc != NULL)
              break;
      }
      segP->lastBackend = i;
***************
*** 374,380 **** BackendIdIsActive(int backendID)
      {
          ProcState  *stateP = &segP->procState[backendID - 1];

!         result = (stateP->procPid != 0);
      }
      else
          result = false;
--- 375,381 ----
      {
          ProcState  *stateP = &segP->procState[backendID - 1];

!         result = (stateP->proc != NULL);
      }
      else
          result = false;
***************
*** 590,596 **** SICleanupQueue(bool callerHasWriteLock, int minFree)
          int        n = stateP->nextMsgNum;

          /* Ignore if inactive or already in reset state */
!         if (stateP->procPid == 0 || stateP->resetState)
              continue;

          /*
--- 591,597 ----
          int        n = stateP->nextMsgNum;

          /* Ignore if inactive or already in reset state */
!         if (stateP->proc == NULL || stateP->resetState)
              continue;

          /*
***************
*** 644,661 **** SICleanupQueue(bool callerHasWriteLock, int minFree)
          segP->nextThreshold = (numMsgs / CLEANUP_QUANTUM + 1) * CLEANUP_QUANTUM;

      /*
!      * Lastly, signal anyone who needs a catchup interrupt.  Since kill()
!      * might not be fast, we don't want to hold locks while executing it.
       */
      if (needSig)
      {
!         pid_t    his_pid = needSig->procPid;

          needSig->signaled = true;
          LWLockRelease(SInvalReadLock);
          LWLockRelease(SInvalWriteLock);
!         elog(DEBUG4, "sending sinval catchup signal to PID %d", (int) his_pid);
!         kill(his_pid, SIGUSR1);
          if (callerHasWriteLock)
              LWLockAcquire(SInvalWriteLock, LW_EXCLUSIVE);
      }
--- 645,664 ----
          segP->nextThreshold = (numMsgs / CLEANUP_QUANTUM + 1) * CLEANUP_QUANTUM;

      /*
!      * Lastly, signal anyone who needs a catchup interrupt.  Since
!      * SendProcSignal() might not be fast, we don't want to hold locks while
!      * executing it.
       */
      if (needSig)
      {
!         PGPROC *his_proc = needSig->proc;

          needSig->signaled = true;
          LWLockRelease(SInvalReadLock);
          LWLockRelease(SInvalWriteLock);
!         elog(DEBUG4, "sending sinval catchup signal to PID %d",
!              (int) his_proc->pid);
!         SendProcSignal(his_proc, PROCSIG_CATCHUP_INTERRUPT);
          if (callerHasWriteLock)
              LWLockAcquire(SInvalWriteLock, LW_EXCLUSIVE);
      }
*** a/src/backend/storage/lmgr/proc.c
--- b/src/backend/storage/lmgr/proc.c
***************
*** 289,294 **** InitProcess(void)
--- 289,295 ----
      MyProc->databaseId = InvalidOid;
      MyProc->roleId = InvalidOid;
      MyProc->inCommit = false;
+     MyProc->signalFlags = 0;
      MyProc->vacuumFlags = 0;
      if (IsAutoVacuumWorkerProcess())
          MyProc->vacuumFlags |= PROC_IS_AUTOVACUUM;
***************
*** 428,433 **** InitAuxiliaryProcess(void)
--- 429,435 ----
      MyProc->databaseId = InvalidOid;
      MyProc->roleId = InvalidOid;
      MyProc->inCommit = false;
+     MyProc->signalFlags = 0;
      /* we don't set the "is autovacuum" flag in the launcher */
      MyProc->vacuumFlags = 0;
      MyProc->lwWaiting = false;
***************
*** 1277,1282 **** ProcSendSignal(int pid)
--- 1279,1330 ----
          PGSemaphoreUnlock(&proc->sem);
  }

+ /*
+  * SendProcSignal - send the signal with the reason to the process
+  * (such as backend, autovacuum worker and auxiliary process)
+  * identified by proc.
+  */
+ void
+ SendProcSignal(PGPROC *proc, uint8 reason)
+ {
+     int pid;
+
+     if (proc == NULL)
+         return;
+
+     LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
+     pid = proc->pid;
+     if (pid != 0)
+         proc->signalFlags |= reason;
+     LWLockRelease(ProcArrayLock);
+
+     /* Send SIGUSR1 to the process */
+     kill(pid, SIGUSR1);
+ }
+
+ /*
+  * CheckProcSignal - check to see if the particular reason has been
+  * signaled, and clear the signal flag.  Should be called after
+  * receiving SIGUSR1.
+  */
+ bool
+ CheckProcSignal(uint8 reason)
+ {
+     LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
+
+     /* Careful here --- don't clear flag if we haven't seen it set */
+     if (MyProc->signalFlags & reason)
+     {
+         MyProc->signalFlags &= ~reason;
+         LWLockRelease(ProcArrayLock);
+         return true;
+     }
+
+     LWLockRelease(ProcArrayLock);
+
+     return false;
+ }
+

  /*****************************************************************************
   * SIGALRM interrupt support
*** a/src/backend/tcop/postgres.c
--- b/src/backend/tcop/postgres.c
***************
*** 2437,2442 **** drop_unnamed_stmt(void)
--- 2437,2464 ----
   */

  /*
+  * proc_sigusr1_handler - handle SIGUSR1 signal.
+  *
+  * SIGUSR1 is multiplexed to handle multiple different events. The signalFlags
+  * bitmask in PGPROC indicates which events have been signaled.
+  */
+ void
+ proc_sigusr1_handler(SIGNAL_ARGS)
+ {
+     int            save_errno = errno;
+
+     if (CheckProcSignal(PROCSIG_CATCHUP_INTERRUPT))
+     {
+         /*
+          * Catchup interrupt has been sent.
+          */
+         HandleCatchupInterrupt();
+     }
+
+     errno = save_errno;
+ }
+
+ /*
   * quickdie() occurs when signalled SIGQUIT by the postmaster.
   *
   * Some backend has bought the farm,
***************
*** 3180,3186 **** PostgresMain(int argc, char *argv[], const char *username)
       * of output during who-knows-what operation...
       */
      pqsignal(SIGPIPE, SIG_IGN);
!     pqsignal(SIGUSR1, CatchupInterruptHandler);
      pqsignal(SIGUSR2, NotifyInterruptHandler);
      pqsignal(SIGFPE, FloatExceptionHandler);

--- 3202,3208 ----
       * of output during who-knows-what operation...
       */
      pqsignal(SIGPIPE, SIG_IGN);
!     pqsignal(SIGUSR1, proc_sigusr1_handler);
      pqsignal(SIGUSR2, NotifyInterruptHandler);
      pqsignal(SIGFPE, FloatExceptionHandler);

*** a/src/include/storage/proc.h
--- b/src/include/storage/proc.h
***************
*** 38,43 **** struct XidCache
--- 38,46 ----
      TransactionId xids[PGPROC_MAX_CACHED_SUBXIDS];
  };

+ /* Signals for PGPROC->signalFlags */
+ #define        PROCSIG_CATCHUP_INTERRUPT    0x01    /* catchup interrupt */
+
  /* Flags for PGPROC->vacuumFlags */
  #define        PROC_IS_AUTOVACUUM    0x01    /* is it an autovac worker? */
  #define        PROC_IN_VACUUM        0x02    /* currently running lazy vacuum */
***************
*** 91,96 **** struct PGPROC
--- 94,100 ----

      bool        inCommit;        /* true if within commit critical section */

+     uint8        signalFlags;    /* bitmask of signals raised, see above */
      uint8        vacuumFlags;    /* vacuum-related flags, see above */

      /* Info about LWLock the process is currently waiting for, if any. */
***************
*** 171,176 **** extern void LockWaitCancel(void);
--- 175,183 ----
  extern void ProcWaitForSignal(void);
  extern void ProcSendSignal(int pid);

+ extern void SendProcSignal(PGPROC *proc, uint8 reason);
+ extern bool CheckProcSignal(uint8 reason);
+
  extern bool enable_sig_alarm(int delayms, bool is_statement_timeout);
  extern bool disable_sig_alarm(bool is_statement_timeout);
  extern void handle_sig_alarm(SIGNAL_ARGS);
*** a/src/include/storage/sinval.h
--- b/src/include/storage/sinval.h
***************
*** 90,96 **** extern void ReceiveSharedInvalidMessages(
                               void (*resetFunction) (void));

  /* signal handler for catchup events (SIGUSR1) */
! extern void CatchupInterruptHandler(SIGNAL_ARGS);

  /*
   * enable/disable processing of catchup events directly from signal handler.
--- 90,96 ----
                               void (*resetFunction) (void));

  /* signal handler for catchup events (SIGUSR1) */
! extern void HandleCatchupInterrupt(void);

  /*
   * enable/disable processing of catchup events directly from signal handler.
*** a/src/include/tcop/tcopprot.h
--- b/src/include/tcop/tcopprot.h
***************
*** 56,61 **** extern List *pg_plan_queries(List *querytrees, int cursorOptions,
--- 56,62 ----

  extern bool assign_max_stack_depth(int newval, bool doit, GucSource source);

+ extern void proc_sigusr1_handler(SIGNAL_ARGS);
  extern void die(SIGNAL_ARGS);
  extern void quickdie(SIGNAL_ARGS);
  extern void authdie(SIGNAL_ARGS);

Re: Multiplexing SUGUSR1

From
Greg Stark
Date:
Does this signal multiplexing solve the "out of signals" problem we  
have generally? I need another signal for the progress indicator too.  
Or is this only useful for other users which need the same locks or  
other resources?

-- 
Greg


On 8 Dec 2008, at 08:04, Heikki Linnakangas <heikki.linnakangas@enterprisedb.com > wrote:

> I've been looking at the signal handling part of the synchronous  
> replication patch. It looks OK, but one thing makes me worry.
>
> To set or clear the flag from PGPROC, to send or handle a signal, we  
> have to acquire ProcArrayLock. Is that safe to do in a signal  
> handler? And is the performance impact of that acceptable?
>
>
> Another observation is that the patch introduces a new function  
> called SendProcSignal. Nothing wrong with that, except that there's  
> an existing function called ProcSendSignal, just above  
> SendProcSignal, so there's some potential for confusion. The old  
> ProcSendSignal function uses the per-backend semaphore to wake up a  
> backend. It's only used to wait for the cleanup lock in bufmgr.c.  
> I'm tempted to remove that altogether, and use the new signal  
> multiplexing for that too, but OTOH if it works, maybe I shouldn't  
> touch it.
>
> Attached is a patch with some minor changes I've made. Mostly  
> cosmetic, but I did modify the sinval code so that ProcState has  
> PGPROC pointer instead of backend pid, so that we don't need to  
> search the ProcArray to find the PGPROC struct of the backend we're  
> signaling.
>
> -- 
>  Heikki Linnakangas
>  EnterpriseDB   http://www.enterprisedb.com
> *** a/src/backend/access/transam/twophase.c --- b/src/backend/access/ 
> transam/twophase.c *************** *** 287,292 ****  
> MarkAsPreparing(TransactionId xid, const char *gid, --- 287,293 ----  
> gxact->proc.databaseId = databaseid; gxact->proc.roleId = owner;  
> gxact->proc.inCommit = false; + gxact->proc.signalFlags = 0;        
> gxact->proc.vacuumFlags = 0; gxact->proc.lwWaiting = false; gxact- 
> >proc.lwExclusive = false; *** a/src/backend/commands/async.c --- b/ 
> src/backend/commands/async.c *************** *** 915,923 ****  
> EnableNotifyInterrupt(void) *     a frontend command. Signal handler  
> execution of inbound notifies *     is disabled until the next  
> EnableNotifyInterrupt call. * ! *     The SIGUSR1 signal handler also  
> needs to call this, so as to ! *     prevent conflicts if one signal  
> interrupts the other. So we ! *     must return the previous state of  
> the flag. */ bool DisableNotifyInterrupt(void) --- 915,924 ---- *     a  
> frontend command. Signal handler execution of inbound notifies *     is  
> disabled until the next EnableNotifyInterrupt call. * ! *     This also  
> needs to be called when SIGUSR1 with ! * PROCSIG_CATCHUP_INTERRUPT  
> is received, so as to prevent conflicts  ! *     if one signal  
> interrupts the other. So we must return the previous ! * state of  
> the flag. */ bool DisableNotifyInterrupt(void) *************** ***  
> 954,960 **** ProcessIncomingNotify(void) nulls[Natts_pg_listener];  
> bool catchup_enabled; ! /* Must prevent SIGUSR1 interrupt while I am  
> running */ catchup_enabled = DisableCatchupInterrupt(); if  
> (Trace_notify) --- 955,961 ---- nulls[Natts_pg_listener]; bool  
> catchup_enabled; ! /* Must prevent catchup interrupt while I am  
> running */ catchup_enabled = DisableCatchupInterrupt(); if  
> (Trace_notify) *** a/src/backend/postmaster/autovacuum.c --- b/src/ 
> backend/postmaster/autovacuum.c *************** *** 1477,1483 ****  
> AutoVacWorkerMain(int argc, char *argv[]) pqsignal(SIGALRM,  
> handle_sig_alarm); pqsignal(SIGPIPE, SIG_IGN); ! pqsignal(SIGUSR1,  
> CatchupInterruptHandler); /* We don't listen for async notifies */  
> pqsignal(SIGUSR2, SIG_IGN); pqsignal(SIGFPE, FloatExceptionHandler);  
> --- 1477,1483 ---- pqsignal(SIGALRM, handle_sig_alarm);  
> pqsignal(SIGPIPE, SIG_IGN); ! pqsignal(SIGUSR1,  
> proc_sigusr1_handler); /* We don't listen for async notifies */  
> pqsignal(SIGUSR2, SIG_IGN); pqsignal(SIGFPE, FloatExceptionHandler);  
> *** a/src/backend/storage/ipc/sinval.c --- b/src/backend/storage/ipc/ 
> sinval.c *************** *** 27,33 **** * need a way to give an idle  
> backend a swift kick in the rear and make * it catch up before the  
> sinval queue overflows and forces it to go * through a cache reset  
> exercise.    This is done by sending SIGUSR1 ! * to any backend that  
> gets too far behind. * * State for catchup events consists of two  
> flags: one saying whether * the signal handler is currently allowed  
> to call ProcessCatchupEvent --- 27,34 ---- * need a way to give an  
> idle backend a swift kick in the rear and make * it catch up before  
> the sinval queue overflows and forces it to go * through a cache  
> reset exercise.    This is done by sending SIGUSR1 ! * with  
> PROCSIG_CATCHUP_INTERRUPT to any backend that gets too far ! *  
> behind. * * State for catchup events consists of two flags: one  
> saying whether * the signal handler is currently allowed to call  
> ProcessCatchupEvent *************** *** 144,152 ****  
> ReceiveSharedInvalidMessages( /* !  * CatchupInterruptHandler * ! *  
> This is the signal handler for SIGUSR1. *    * If we are idle  
> (catchupInterruptEnabled is set), we can safely * invoke  
> ProcessCatchupEvent directly. Otherwise, just set a flag --- 145,154  
> ---- /* ! * HandleCatchupInterrupt * ! * This is called when SIGUSR1  
> with PROCSIG_CATCHUP_INTERRUPT is ! * received. * * If we are idle  
> (catchupInterruptEnabled is set), we can safely * invoke  
> ProcessCatchupEvent directly. Otherwise, just set a flag  
> *************** *** 156,168 **** ReceiveSharedInvalidMessages( *  
> since there's no longer any reason to do anything.) */ void !  
> CatchupInterruptHandler(SIGNAL_ARGS) { - int     save_errno = errno; - / 
> * !      * Note: this is a SIGNAL HANDLER.    You must be very wary what  
> you do !      * here. */ /* Don't joggle the elbow of proc_exit */ ---  
> 158,168 ---- * since there's no longer any reason to do anything.)  
> */ void ! HandleCatchupInterrupt(void) { /* ! * Note: this is called  
> by a SIGNAL HANDLER.     ! * You must be very wary what you do here.  
> */ /* Don't joggle the elbow of proc_exit */ *************** ***  
> 216,223 **** CatchupInterruptHandler(SIGNAL_ARGS) */  
> catchupInterruptOccurred = 1; } - - errno = save_errno; } /* ---  
> 216,221 ---- *************** *** 289,295 ****  
> DisableCatchupInterrupt(void) /* * ProcessCatchupEvent * ! * Respond  
> to a catchup event (SIGUSR1) from another backend. * * This is  
> called either directly from the SIGUSR1 signal handler, * or the  
> next time control reaches the outer idle loop (assuming --- 287,294  
> ---- /* * ProcessCatchupEvent * ! * Respond to a catchup event  
> (SIGUSR1 with PROCSIG_CATCHUP_INTERRUPT) ! * from another backend. *  
> * This is called either directly from the SIGUSR1 signal handler, *  
> or the next time control reaches the outer idle loop (assuming *** a/ 
> src/backend/storage/ipc/sinvaladt.c --- b/src/backend/storage/ipc/ 
> sinvaladt.c *************** *** 21,26 **** --- 21,27 ---- #include  
> "storage/backendid.h" #include "storage/ipc.h" #include "storage/ 
> proc.h" + #include "storage/procarray.h" #include "storage/shmem.h"  
> #include "storage/sinvaladt.h" #include "storage/spin.h"  
> *************** *** 136,144 **** /* Per-backend state in shared  
> invalidation structure */ typedef struct ProcState { ! /* procPid is  
> zero in an inactive ProcState array entry. */ ! pid_t     procPid;     /*  
> PID of backend, for signaling */ ! /* nextMsgNum is meaningless if  
> procPid == 0 or resetState is true. */       int     nextMsgNum;     /* next  
> message number to read */ bool     resetState;     /* backend needs to  
> reset its state */ bool     signaled;     /* backend has been sent catchup  
> signal */ --- 137,145 ---- /* Per-backend state in shared  
> invalidation structure */ typedef struct ProcState { ! /* proc is  
> NULL in an inactive ProcState array entry. */ ! PGPROC     *proc;     /*  
> PGPROC entry of backend, for signaling */ ! /* nextMsgNum is
> meaningless if proc == NULL or resetState is true. */ int  
> nextMsgNum;     /* next message number to read */ bool     resetState;     /*  
> backend needs to reset its state */ bool     signaled;     /* backend has  
> been sent catchup signal */ *************** *** 235,241 ****  
> CreateSharedInvalidationState(void)       /* Mark all backends  
> inactive, and initialize nextLXID */ for (i = 0; i < shmInvalBuffer- 
> >maxBackends; i++)       { ! shmInvalBuffer->procState[i].procPid =  
> 0;     /* inactive */ shmInvalBuffer->procState[i].nextMsgNum = 0;     /*  
> meaningless */ shmInvalBuffer->procState[i].resetState = false;  
> shmInvalBuffer->procState[i].signaled = false; --- 236,242 ---- /*  
> Mark all backends inactive, and initialize nextLXID */ for (i = 0; i  
> < shmInvalBuffer->maxBackends; i++)       { ! shmInvalBuffer- 
> >procState[i].proc = NULL;     /* inactive */ shmInvalBuffer- 
> >procState[i].nextMsgNum = 0;     /* meaningless */ shmInvalBuffer- 
> >procState[i].resetState = false; shmInvalBuffer- 
> >procState[i].signaled = false; *************** *** 266,272 ****  
> SharedInvalBackendInit(void) /* Look for a free entry in the  
> procState array */ for (index = 0; index < segP->lastBackend; index+ 
> +) { ! if (segP->procState[index].procPid == 0)     /* inactive slot?  
> */ { stateP = &segP->procState[index]; break; --- 267,273 ---- /*  
> Look for a free entry in the procState array */ for (index = 0;  
> index < segP->lastBackend; index++) { ! if (segP- 
> >procState[index].proc == NULL)     /* inactive slot? */ { stateP =  
> &segP->procState[index]; break; *************** *** 278,284 ****  
> SharedInvalBackendInit(void) if (segP->lastBackend < segP- 
> >maxBackends) { stateP = &segP->procState[segP->lastBackend]; !  
> Assert(stateP->procPid == 0); segP->lastBackend++; } else ---  
> 279,285 ---- if (segP->lastBackend < segP->maxBackends) { stateP =  
> &segP->procState[segP->lastBackend]; ! Assert(stateP->proc == NULL);  
> segP->lastBackend++; } else *************** *** 303,309 ****  
> SharedInvalBackendInit(void) nextLocalTransactionId = stateP- 
> >nextLXID; /* mark myself active, with all extant messages already  
> read */ ! stateP->procPid = MyProcPid; stateP->nextMsgNum = segP- 
> >maxMsgNum; stateP->resetState = false; stateP->signaled = false;  
> --- 304,310 ---- nextLocalTransactionId = stateP->nextLXID; /* mark  
> myself active, with all extant messages already read */ ! stateP- 
> >proc = MyProc; stateP->nextMsgNum = segP->maxMsgNum; stateP- 
> >resetState = false; stateP->signaled = false; *************** ***  
> 341,347 **** CleanupInvalidationState(int status, Datum arg) stateP- 
> >nextLXID = nextLocalTransactionId; /* Mark myself inactive */ !  
> stateP->procPid = 0; stateP->nextMsgNum = 0; stateP->resetState =  
> false; stateP->signaled = false; --- 342,348 ---- stateP->nextLXID =  
> nextLocalTransactionId;          /* Mark myself inactive */ ! stateP- 
> >proc = NULL; stateP->nextMsgNum = 0; stateP->resetState = false;  
> stateP->signaled = false; *************** *** 349,355 ****  
> CleanupInvalidationState(int status, Datum arg) /* Recompute index  
> of last active backend */ for (i = segP->lastBackend; i > 0; i--)  
> { ! if (segP->procState[i - 1].procPid != 0) break;       } segP- 
> >lastBackend = i; --- 350,356 ---- /* Recompute index of last active  
> backend */ for (i = segP->lastBackend; i > 0; i--) { ! if (segP- 
> >procState[i - 1].proc != NULL) break; } segP->lastBackend = i;  
> *************** *** 374,380 **** BackendIdIsActive(int backendID)  
> { ProcState *stateP = &segP->procState[backendID - 1]; ! result =  
> (stateP->procPid != 0); } else result = false; --- 375,381 ----  
> { ProcState *stateP = &segP->procState[backendID - 1]; ! result =  
> (stateP->proc != NULL); } else result = false; *************** ***  
> 590,596 **** SICleanupQueue(bool callerHasWriteLock, int minFree)  
> int n = stateP->nextMsgNum; /* Ignore if inactive or already in  
> reset state */ ! if (stateP->procPid == 0 || stateP->resetState)  
> continue; /* --- 591,597 ---- int     n = stateP->nextMsgNum; /* Ignore  
> if inactive or already in reset state */ ! if (stateP->proc == NULL  
> || stateP->resetState) continue; /* *************** *** 644,661 ****  
> SICleanupQueue(bool callerHasWriteLock, int minFree) segP- 
> >nextThreshold = (numMsgs / CLEANUP_QUANTUM + 1) * CLEANUP_QUANTUM; / 
> * ! * Lastly, signal anyone who needs a catchup interrupt. Since  
> kill() ! * might not be fast, we don't want to hold locks while  
> executing it. */ if (needSig) { ! pid_t    his_pid = needSig->procPid;  
> needSig->signaled = true; LWLockRelease(SInvalReadLock);  
> LWLockRelease(SInvalWriteLock); ! elog(DEBUG4, "sending sinval  
> catchup signal to PID %d", (int) his_pid); ! kill(his_pid, SIGUSR1);  
> if (callerHasWriteLock) LWLockAcquire(SInvalWriteLock,  
> LW_EXCLUSIVE); } --- 645,664 ---- segP->nextThreshold = (numMsgs /  
> CLEANUP_QUANTUM + 1) * CLEANUP_QUANTUM; /* ! * Lastly, signal anyone  
> who needs a catchup interrupt. Since ! * SendProcSignal() might not  
> be fast, we don't want to hold locks while ! * executing it. */ if  
> (needSig) { ! PGPROC *his_proc = needSig->proc; needSig->signaled =  
> true; LWLockRelease(SInvalReadLock);  
> LWLockRelease(SInvalWriteLock); ! elog(DEBUG4, "sending sinval  
> catchup signal to PID %d", ! (int) his_proc->pid); !  
> SendProcSignal(his_proc, PROCSIG_CATCHUP_INTERRUPT); if  
> (callerHasWriteLock) LWLockAcquire(SInvalWriteLock, LW_EXCLUSIVE); }  
> *** a/src/backend/storage/lmgr/proc.c --- b/src/backend/storage/lmgr/ 
> proc.c *************** *** 289,294 **** InitProcess(void) ---  
> 289,295 ---- MyProc->databaseId = InvalidOid; MyProc->roleId =  
> InvalidOid; MyProc->inCommit = false; + MyProc->signalFlags = 0;  
> MyProc->vacuumFlags = 0; if (IsAutoVacuumWorkerProcess()) MyProc- 
> >vacuumFlags |= PROC_IS_AUTOVACUUM; *************** *** 428,433 ****  
> InitAuxiliaryProcess(void) --- 429,435 ---- MyProc->databaseId =  
> InvalidOid; MyProc->roleId = InvalidOid; MyProc->inCommit = false; +  
> MyProc->signalFlags = 0; /* we don't set the "is autovacuum" flag in  
> the launcher */ MyProc->vacuumFlags = 0; MyProc->lwWaiting = false;  
> *************** *** 1277,1282 **** ProcSendSignal(int pid) ---  
> 1279,1330 ---- PGSemaphoreUnlock(&proc->sem); } + /* + *  
> SendProcSignal - send the signal with the reason to the process + *  
> (such as backend, autovacuum worker and auxiliary process) + *  
> identified by proc. + */ + void + SendProcSignal(PGPROC *proc, uint8  
> reason) + { + int pid; + + if (proc == NULL) + return; + +  
> LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE); + pid = proc->pid; + if  
> (pid != 0) + proc->signalFlags |= reason; +  
> LWLockRelease(ProcArrayLock); + +     /* Send SIGUSR1 to the process */  
> +     kill(pid, SIGUSR1); + } + + /* + * CheckProcSignal - check to see  
> if the particular reason has been + * signaled, and clear the signal  
> flag. Should be called after + * receiving SIGUSR1. + */ + bool +  
> CheckProcSignal(uint8 reason) + { +     LWLockAcquire(ProcArrayLock,  
> LW_EXCLUSIVE); + + /* Careful here --- don't clear flag if we  
> haven't seen it set */ + if (MyProc->signalFlags & reason) + { +  
> MyProc->signalFlags &= ~reason; + LWLockRelease(ProcArrayLock); +  
> return true; + } + + LWLockRelease(ProcArrayLock); + +     return  
> false; + } + / 
> *** 
> *** 
> *** 
> ********************************************************************  
> * SIGALRM interrupt support *** a/src/backend/tcop/postgres.c --- b/ 
> src/backend/tcop/postgres.c *************** *** 2437,2442 ****  
> drop_unnamed_stmt(void) --- 2437,2464 ---- */ /* + *  
> proc_sigusr1_handler - handle SIGUSR1 signal. + * + * SIGUSR1 is  
> multiplexed to handle multiple different events. The signalFlags + *  
> bitmask in PGPROC indicates which events have been signaled. + */ +  
> void + proc_sigusr1_handler(SIGNAL_ARGS) + { + int     save_errno =  
> errno; + + if (CheckProcSignal(PROCSIG_CATCHUP_INTERRUPT)) + { + /*  
> + * Catchup interrupt has been sent. + */ +
> HandleCatchupInterrupt(); + } + + errno = save_errno; + } + + /* *  
> quickdie() occurs when signalled SIGQUIT by the postmaster. * * Some  
> backend has bought the farm, *************** *** 3180,3186 ****  
> PostgresMain(int argc, char *argv[], const char *username) * of  
> output during who-knows-what operation... */ pqsignal(SIGPIPE,  
> SIG_IGN); ! pqsignal(SIGUSR1, CatchupInterruptHandler);  
> pqsignal(SIGUSR2, NotifyInterruptHandler); pqsignal(SIGFPE,  
> FloatExceptionHandler); --- 3202,3208 ---- * of output during who- 
> knows-what operation... */ pqsignal(SIGPIPE, SIG_IGN); !  
> pqsignal(SIGUSR1, proc_sigusr1_handler); pqsignal(SIGUSR2,  
> NotifyInterruptHandler); pqsignal(SIGFPE, FloatExceptionHandler);  
> *** a/src/include/storage/proc.h --- b/src/include/storage/proc.h  
> *************** *** 38,43 **** struct XidCache --- 38,46 ----  
> TransactionId xids[PGPROC_MAX_CACHED_SUBXIDS]; }; + /* Signals for  
> PGPROC->signalFlags */ + #define PROCSIG_CATCHUP_INTERRUPT    0x01    /*  
> catchup interrupt */ + /* Flags for PGPROC->vacuumFlags */ #define      
> PROC_IS_AUTOVACUUM    0x01    /* is it an autovac worker? */ #define      
> PROC_IN_VACUUM     0x02    /* currently running lazy vacuum */  
> *************** *** 91,96 **** struct PGPROC --- 94,100 ---- bool  
> inCommit;     /* true if within commit critical section */ + uint8      
> signalFlags;    /* bitmask of signals raised, see above */ uint8      
> vacuumFlags;    /* vacuum-related flags, see above */ /* Info about  
> LWLock the process is currently waiting for, if any. */  
> *************** *** 171,176 **** extern void LockWaitCancel(void);  
> --- 175,183 ---- extern void ProcWaitForSignal(void); extern void  
> ProcSendSignal(int pid); + extern void SendProcSignal(PGPROC *proc,  
> uint8 reason); + extern bool CheckProcSignal(uint8 reason); + extern  
> bool enable_sig_alarm(int delayms, bool is_statement_timeout);    
> extern bool disable_sig_alarm(bool is_statement_timeout); extern  
> void handle_sig_alarm(SIGNAL_ARGS); *** a/src/include/storage/ 
> sinval.h --- b/src/include/storage/sinval.h *************** ***  
> 90,96 **** extern void ReceiveSharedInvalidMessages( void  
> (*resetFunction) (void)); /* signal handler for catchup events  
> (SIGUSR1) */ ! extern void CatchupInterruptHandler(SIGNAL_ARGS); /*  
> * enable/disable processing of catchup events directly from signal  
> handler. --- 90,96 ---- void (*resetFunction) (void)); /* signal  
> handler for catchup events (SIGUSR1) */ ! extern void  
> HandleCatchupInterrupt(void); /* * enable/disable processing of  
> catchup events directly from signal handler. *** a/src/include/tcop/ 
> tcopprot.h --- b/src/include/tcop/tcopprot.h *************** ***  
> 56,61 **** extern List *pg_plan_queries(List *querytrees, int  
> cursorOptions, --- 56,62 ---- extern bool assign_max_stack_depth(int  
> newval, bool doit, GucSource source); + extern void  
> proc_sigusr1_handler(SIGNAL_ARGS); extern void die(SIGNAL_ARGS);    
> extern void quickdie(SIGNAL_ARGS); extern void authdie(SIGNAL_ARGS);
>
> -- 
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers


Re: Multiplexing SUGUSR1

From
Heikki Linnakangas
Date:
Greg Stark wrote:
> Does this signal multiplexing solve the "out of signals" problem we have 
> generally?

It's a general solution, but it relies on flags in PGPROC, so it'll only 
work for backends and auxiliary processes that have a PGPROC entry.

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com


Re: Multiplexing SUGUSR1

From
Tom Lane
Date:
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
> To set or clear the flag from PGPROC, to send or handle a signal, we 
> have to acquire ProcArrayLock. Is that safe to do in a signal handler? 

No.  If it's trying to do that then it's broken.  In fact, if it's
trying to do much of anything beyond setting a "volatile" flag variable
in a signal handler, it's broken --- unless there are special provisions
to limit where the signal trap can occur, which would be pretty much
unacceptable for a multiplexed-signal implementation.
        regards, tom lane


Re: Multiplexing SUGUSR1

From
Simon Riggs
Date:
On Mon, 2008-12-08 at 10:04 +0200, Heikki Linnakangas wrote:
> And is the performance impact of that acceptable?

No, but I think we already agreed to change that to a separate lwlock.

-- Simon Riggs           www.2ndQuadrant.comPostgreSQL Training, Services and Support



Re: Multiplexing SUGUSR1

From
Heikki Linnakangas
Date:
Tom Lane wrote:
> Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
>> To set or clear the flag from PGPROC, to send or handle a signal, we 
>> have to acquire ProcArrayLock. Is that safe to do in a signal handler? 
> 
> No.  If it's trying to do that then it's broken.  In fact, if it's
> trying to do much of anything beyond setting a "volatile" flag variable
> in a signal handler, it's broken --- unless there are special provisions
> to limit where the signal trap can occur, which would be pretty much
> unacceptable for a multiplexed-signal implementation.

Ok, I was afraid so.

I think we'll need to replace the proposed bitmask with an array of 
sig_atomic_t flags then, and do without locking.

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com


Re: Multiplexing SUGUSR1

From
"Fujii Masao"
Date:
Hi,

On Mon, Dec 8, 2008 at 11:39 PM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
> Tom Lane wrote:
>>
>> Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
>>>
>>> To set or clear the flag from PGPROC, to send or handle a signal, we have
>>> to acquire ProcArrayLock. Is that safe to do in a signal handler?
>>
>> No.  If it's trying to do that then it's broken.  In fact, if it's
>> trying to do much of anything beyond setting a "volatile" flag variable
>> in a signal handler, it's broken --- unless there are special provisions
>> to limit where the signal trap can occur, which would be pretty much
>> unacceptable for a multiplexed-signal implementation.
>
> Ok, I was afraid so.
>
> I think we'll need to replace the proposed bitmask with an array of
> sig_atomic_t flags then, and do without locking.

Thanks! I updated the patch so (based on signal_handling_v2-heikki-1.patch).

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachment

Re: Multiplexing SUGUSR1

From
Heikki Linnakangas
Date:
Fujii Masao wrote:
> Hi,
> 
> On Mon, Dec 8, 2008 at 11:39 PM, Heikki Linnakangas
> <heikki.linnakangas@enterprisedb.com> wrote:
>> Tom Lane wrote:
>>> Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
>>>> To set or clear the flag from PGPROC, to send or handle a signal, we have
>>>> to acquire ProcArrayLock. Is that safe to do in a signal handler?
>>> No.  If it's trying to do that then it's broken.  In fact, if it's
>>> trying to do much of anything beyond setting a "volatile" flag variable
>>> in a signal handler, it's broken --- unless there are special provisions
>>> to limit where the signal trap can occur, which would be pretty much
>>> unacceptable for a multiplexed-signal implementation.
>> Ok, I was afraid so.
>>
>> I think we'll need to replace the proposed bitmask with an array of
>> sig_atomic_t flags then, and do without locking.
> 
> Thanks! I updated the patch so (based on signal_handling_v2-heikki-1.patch).

Thank you. Looks good to me, committed with minor changes.

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com


Re: Multiplexing SUGUSR1

From
Tom Lane
Date:
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
> Thank you. Looks good to me, committed with minor changes.

I don't think this patch is anywhere near ready to apply.  In the first
place, touching the PGPROC like that without any lock seems completely
unsafe --- among other things, you're relying on an undocumented
assumption that the space occupied by a PGPROC struct will never be
recycled for use as anything else.  It might be all right for the
limited purposes at the moment, but if you are advertising this as a
general purpose signal multiplexer then it will very soon not be all
right.  For the same reason, it seems like a bad design to set this up
so that the postmaster can't possibly use the mechanism safely.  (Before
a couple of months ago, this wouldn't even have worked to replace the
existing use of SIGUSR1.)  And in the third place, this doesn't work
unless one has one's hands on the target backend's PGPROC, and not
merely its PID.  I object to the changes in sinvaladt.c altogether,
and note that this decision makes it impossible to fold SIGUSR2 handling
into the multiplex code, which is another simple proof that it fails to
qualify as a general-purpose multiplexer.

I think we need something closer to the postmaster signal multiplexing
mechanism, wherein there is a dedicated shared memory area of static
layout that holds the signaling flags.  And it needs to be driven off
of knowing the target's PID, not anything else.
        regards, tom lane


Re: Multiplexing SUGUSR1

From
Heikki Linnakangas
Date:
Tom Lane wrote:
> Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
>> Thank you. Looks good to me, committed with minor changes.
> 
> I don't think this patch is anywhere near ready to apply.

Ok, I'll revert it if you feel that strongly.

>  In the first
> place, touching the PGPROC like that without any lock seems completely
> unsafe --- among other things, you're relying on an undocumented
> assumption that the space occupied by a PGPROC struct will never be
> recycled for use as anything else.

Right, it does depend on that.

>  It might be all right for the
> limited purposes at the moment, but if you are advertising this as a
> general purpose signal multiplexer then it will very soon not be all
> right.  For the same reason, it seems like a bad design to set this up
> so that the postmaster can't possibly use the mechanism safely.  (Before
> a couple of months ago, this wouldn't even have worked to replace the
> existing use of SIGUSR1.)  And in the third place, this doesn't work
> unless one has one's hands on the target backend's PGPROC, and not
> merely its PID.  I object to the changes in sinvaladt.c altogether,
> and note that this decision makes it impossible to fold SIGUSR2 handling
> into the multiplex code, which is another simple proof that it fails to
> qualify as a general-purpose multiplexer.

I'm surprised you feel that way. You suggested earlier 
(http://archives.postgresql.org/message-id/28487.1221147665@sss.pgh.pa.us) 
that a solution that only works for processes attached to shared memory 
would probably suffice for now.

> I think we need something closer to the postmaster signal multiplexing
> mechanism, wherein there is a dedicated shared memory area of static
> layout that holds the signaling flags. And it needs to be driven off
> of knowing the target's PID, not anything else.

That seems hard, considering that we also want it to work without 
locking. Hmm, I presume we can use spinlocks in a signal handler? 
Perhaps some sort of a hash table protected by a spinlock would work.

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com


Re: Multiplexing SUGUSR1

From
Dimitri Fontaine
Date:
Hi,

I hope I'm not disturbing hackers at work by talking about completely
unrelated things but...

Le mardi 09 décembre 2008, Tom Lane a écrit :
> I think we need something closer to the postmaster signal multiplexing
> mechanism, wherein there is a dedicated shared memory area of static
> layout that holds the signaling flags.  And it needs to be driven off
> of knowing the target's PID, not anything else.

...this makes me recall IMessage Queues from Postgres-R, reworked by Markus to
follow your advices about postmaster and shared memory.
http://archives.postgresql.org/pgsql-hackers/2008-07/msg01420.php

Could it be the implementation we need for multiplexing signals from one
backend some others?

Regards,
--
dim

Re: Multiplexing SUGUSR1

From
Alvaro Herrera
Date:
Dimitri Fontaine escribió:

> Le mardi 09 décembre 2008, Tom Lane a écrit :
> > I think we need something closer to the postmaster signal multiplexing
> > mechanism, wherein there is a dedicated shared memory area of static
> > layout that holds the signaling flags.  And it needs to be driven off
> > of knowing the target's PID, not anything else.
> 
> ...this makes me recall IMessage Queues from Postgres-R, reworked by Markus to 
> follow your advices about postmaster and shared memory.
>   http://archives.postgresql.org/pgsql-hackers/2008-07/msg01420.php
> 
> Could it be the implementation we need for multiplexing signals from one 
> backend some others?

No, the signalling needed here is far simpler than Markus' IMessage
stuff.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: Multiplexing SUGUSR1

From
Tom Lane
Date:
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
> I'm surprised you feel that way. You suggested earlier 
> (http://archives.postgresql.org/message-id/28487.1221147665@sss.pgh.pa.us) 
> that a solution that only works for processes attached to shared memory 
> would probably suffice for now.

Well, I wasn't complaining about the dependence on being attached to
shared memory.  What I'm complaining about is the dependence on the
rather complex PGPROC data structure.

> That seems hard, considering that we also want it to work without 
> locking. Hmm, I presume we can use spinlocks in a signal handler? 
> Perhaps some sort of a hash table protected by a spinlock would work.

No, locks are right out if the postmaster is supposed to be able to use
it.  What I was thinking of is a simple linear array of PIDs and
sig_atomic_t flags.  The slots could be assigned on the basis of
backendid, but callers trying to send a signal would have to scan the
array looking for the matching PID.  (This doesn't seem outlandishly
expensive considering that one is about to do a kernel call anyway.
You might be able to save a few cycles by having the PID array separate
from the flag array, which should improve the cache friendliness of the
scan.)  Also, for those callers who do have access to a PGPROC, there
could be a separate entry point that passes backendid instead of PID to
eliminate the search.
        regards, tom lane


Re: Multiplexing SUGUSR1

From
"Fujii Masao"
Date:
Hi,

On Wed, Dec 10, 2008 at 1:43 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
>> I'm surprised you feel that way. You suggested earlier
>> (http://archives.postgresql.org/message-id/28487.1221147665@sss.pgh.pa.us)
>> that a solution that only works for processes attached to shared memory
>> would probably suffice for now.
>
> Well, I wasn't complaining about the dependence on being attached to
> shared memory.  What I'm complaining about is the dependence on the
> rather complex PGPROC data structure.
>
>> That seems hard, considering that we also want it to work without
>> locking. Hmm, I presume we can use spinlocks in a signal handler?
>> Perhaps some sort of a hash table protected by a spinlock would work.
>
> No, locks are right out if the postmaster is supposed to be able to use
> it.  What I was thinking of is a simple linear array of PIDs and
> sig_atomic_t flags.  The slots could be assigned on the basis of
> backendid, but callers trying to send a signal would have to scan the
> array looking for the matching PID.  (This doesn't seem outlandishly
> expensive considering that one is about to do a kernel call anyway.
> You might be able to save a few cycles by having the PID array separate
> from the flag array, which should improve the cache friendliness of the
> scan.)  Also, for those callers who do have access to a PGPROC, there
> could be a separate entry point that passes backendid instead of PID to
> eliminate the search.

Thanks for the comment!
I updated the patch so. Is this patch ready to apply?

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachment

Re: Multiplexing SUGUSR1

From
Heikki Linnakangas
Date:
Fujii Masao wrote:
> Hi,
>
> On Wed, Dec 10, 2008 at 1:43 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
>>> I'm surprised you feel that way. You suggested earlier
>>> (http://archives.postgresql.org/message-id/28487.1221147665@sss.pgh.pa.us)
>>> that a solution that only works for processes attached to shared memory
>>> would probably suffice for now.
>> Well, I wasn't complaining about the dependence on being attached to
>> shared memory.  What I'm complaining about is the dependence on the
>> rather complex PGPROC data structure.
>>
>>> That seems hard, considering that we also want it to work without
>>> locking. Hmm, I presume we can use spinlocks in a signal handler?
>>> Perhaps some sort of a hash table protected by a spinlock would work.
>> No, locks are right out if the postmaster is supposed to be able to use
>> it.  What I was thinking of is a simple linear array of PIDs and
>> sig_atomic_t flags.  The slots could be assigned on the basis of
>> backendid, but callers trying to send a signal would have to scan the
>> array looking for the matching PID.  (This doesn't seem outlandishly
>> expensive considering that one is about to do a kernel call anyway.
>> You might be able to save a few cycles by having the PID array separate
>> from the flag array, which should improve the cache friendliness of the
>> scan.)  Also, for those callers who do have access to a PGPROC, there
>> could be a separate entry point that passes backendid instead of PID to
>> eliminate the search.
>
> Thanks for the comment!
> I updated the patch so. Is this patch ready to apply?

Looks like we stepped on each others toes, I was just about to send a
similar patch. Attached is my version, it's essentially the same as yours.

My version doesn't have support for auxiliary processes. Does the
synchronous replication patch need that?

This comment is wrong, though the code below it is right:

> *** base/src/backend/bootstrap/bootstrap.c    2008-12-10 16:29:10.000000000 +0900
> --- new/src/backend/bootstrap/bootstrap.c    2008-12-10 17:16:23.000000000 +0900
> ***************
> *** 389,394 ****
> --- 389,403 ----
>           InitAuxiliaryProcess();
>   #endif
>
> +         /*
> +          * Assign backend ID to auxiliary processes like backends, in order to
> +          * allow multiplexing signal to auxiliary processes. Since backends use
> +          * ID in the range from 1 to MaxBackends, we assign auxiliary processes
> +          * with MaxBackends + AuxProcType as an unique ID.
> +          */
> +         MyBackendId = MaxBackends + auxType;
> +         MyProc->backendId = MyBackendId;
> +
>           /* finish setting up bufmgr.c */
>           InitBufferPoolBackend();

Backends use IDs in the range 0 to MaxBackends-1, inclusive.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com
*** src/backend/postmaster/autovacuum.c
--- src/backend/postmaster/autovacuum.c
***************
*** 91,96 ****
--- 91,97 ----
  #include "storage/pmsignal.h"
  #include "storage/proc.h"
  #include "storage/procarray.h"
+ #include "storage/procsignal.h"
  #include "storage/sinvaladt.h"
  #include "tcop/tcopprot.h"
  #include "utils/dynahash.h"
***************
*** 1477,1483 **** AutoVacWorkerMain(int argc, char *argv[])
      pqsignal(SIGALRM, handle_sig_alarm);

      pqsignal(SIGPIPE, SIG_IGN);
!     pqsignal(SIGUSR1, CatchupInterruptHandler);
      /* We don't listen for async notifies */
      pqsignal(SIGUSR2, SIG_IGN);
      pqsignal(SIGFPE, FloatExceptionHandler);
--- 1478,1484 ----
      pqsignal(SIGALRM, handle_sig_alarm);

      pqsignal(SIGPIPE, SIG_IGN);
!     pqsignal(SIGUSR1, procsignal_sigusr1_handler);
      /* We don't listen for async notifies */
      pqsignal(SIGUSR2, SIG_IGN);
      pqsignal(SIGFPE, FloatExceptionHandler);
*** src/backend/storage/ipc/Makefile
--- src/backend/storage/ipc/Makefile
***************
*** 15,21 **** override CFLAGS+= -fno-inline
  endif
  endif

! OBJS = ipc.o ipci.o pmsignal.o procarray.o shmem.o shmqueue.o \
      sinval.o sinvaladt.o

  include $(top_srcdir)/src/backend/common.mk
--- 15,21 ----
  endif
  endif

! OBJS = ipc.o ipci.o pmsignal.o procarray.o procsignal.o shmem.o shmqueue.o \
      sinval.o sinvaladt.o

  include $(top_srcdir)/src/backend/common.mk
*** src/backend/storage/ipc/ipci.c
--- src/backend/storage/ipc/ipci.c
***************
*** 30,35 ****
--- 30,36 ----
  #include "storage/pg_shmem.h"
  #include "storage/pmsignal.h"
  #include "storage/procarray.h"
+ #include "storage/procsignal.h"
  #include "storage/sinvaladt.h"
  #include "storage/spin.h"

***************
*** 111,116 **** CreateSharedMemoryAndSemaphores(bool makePrivate, int port)
--- 112,118 ----
          size = add_size(size, SInvalShmemSize());
          size = add_size(size, BgWriterShmemSize());
          size = add_size(size, AutoVacuumShmemSize());
+         size = add_size(size, ProcSignalShmemSize());
          size = add_size(size, BTreeShmemSize());
          size = add_size(size, SyncScanShmemSize());
  #ifdef EXEC_BACKEND
***************
*** 207,212 **** CreateSharedMemoryAndSemaphores(bool makePrivate, int port)
--- 209,215 ----
      PMSignalInit();
      BgWriterShmemInit();
      AutoVacuumShmemInit();
+     ProcSignalShmemInit();

      /*
       * Set up other modules that need some shared memory space
*** /dev/null
--- src/backend/storage/ipc/procsignal.c
***************
*** 0 ****
--- 1,214 ----
+ /*-------------------------------------------------------------------------
+  *
+  * procsignal.c
+  *      routines for signaling processes
+  *
+  *
+  * Portions Copyright (c) 1996-2008, PostgreSQL Global Development Group
+  * Portions Copyright (c) 1994, Regents of the University of California
+  *
+  * IDENTIFICATION
+  *      $PostgreSQL$
+  *
+  *-------------------------------------------------------------------------
+  */
+ #include "postgres.h"
+
+ #include <signal.h>
+ #include <unistd.h>
+
+ #include "miscadmin.h"
+ #include "storage/backendid.h"
+ #include "storage/ipc.h"
+ #include "storage/procsignal.h"
+ #include "storage/shmem.h"
+ #include "storage/sinval.h"
+
+ /*
+  * SIGUSR1 signal is multiplexed for multiple events. The specific reason
+  * is communicated via flags in shared memory. We keep a boolean flag for
+  * each possible "reason", so that different reasons can be signaled by
+  * different backends at the same time.    (However, if the same reason is
+  * signaled more than once simultaneously, the process will observe it only
+  * once.)
+  *
+  * Each process that wants to receive signals registers its process ID
+  * in the ProcSignalSlots array. The array is indexed by backend ID to make
+  * slot allocation simple, and to avoid having to search the array when you
+  * know the backend ID of the process you're signaling.
+  *
+  * The flags are actually declared as "volatile sig_atomic_t" for maximum
+  * portability.  This should ensure that loads and stores of the flag
+  * values are atomic, allowing us to dispense with any explicit locking.
+  */
+ typedef struct {
+     pid_t        pss_pid;
+     sig_atomic_t pss_signalFlags[NUM_PROCSIGNALS];
+ } ProcSignalSlot;
+
+ static ProcSignalSlot *ProcSignalSlots;
+
+ static bool CheckProcSignal(ProcSignalReason reason);
+ static void CleanupProcSignalState(int status, Datum arg);
+
+ /*
+  * ProcSignalShmemInit - initialize during shared-memory creation
+  */
+ Size
+ ProcSignalShmemSize(void)
+ {
+     return MaxBackends * sizeof(ProcSignalSlot);
+ }
+
+
+ /*
+  * ProcSignalShmemInit - initialize during shared-memory creation
+  */
+ void
+ ProcSignalShmemInit(void)
+ {
+     bool        found;
+
+     ProcSignalSlots = (ProcSignalSlot *)
+         ShmemInitStruct("ProcSignalSlots", ProcSignalShmemSize(),
+                         &found);
+
+     if (!found)
+         MemSet(ProcSignalSlots, 0, ProcSignalShmemSize());
+ }
+
+ /*
+  * ProcSignalInit - register a process to the procsignal array
+  */
+ void
+ ProcSignalInit(void)
+ {
+     volatile ProcSignalSlot *slot;
+
+     Assert(MyBackendId >= 0 && MyBackendId < MaxBackends);
+
+     slot = &ProcSignalSlots[MyBackendId];
+     MemSet(slot->pss_signalFlags, 0, NUM_PROCSIGNALS * sizeof(sig_atomic_t));
+     slot->pss_pid = MyProcPid;
+
+     on_shmem_exit(CleanupProcSignalState, Int32GetDatum(MyBackendId));
+ }
+
+
+ /*
+  * CleanupProcSignalState
+  *        Remove current process from ProcSignalSlots
+  *
+  * This function is called via on_shmem_exit() during backend shutdown.
+  *
+  * arg is really of type "BackendId".
+  */
+ static void
+ CleanupProcSignalState(int status, Datum arg)
+ {
+     BackendId    backendId = (BackendId) DatumGetInt32(arg);
+     volatile ProcSignalSlot *slot;
+
+     slot = &ProcSignalSlots[backendId];
+
+     /* sanity check */
+     if (slot->pss_pid != MyProcPid)
+     {
+         /*
+          * don't ERROR here. We're exiting anyway, and don't want to
+          * get into infinite loop trying to exit
+          */
+         elog(WARNING, "current process not in ProcSignalSlots array");
+         return;
+     }
+     slot->pss_pid = 0;
+ }
+
+ /*
+  * SendProcSignal - signal a process with a reason code
+  *
+  * Providing backendId is optional, but it will speed up the operation.
+  *
+  * Not to be confused with ProcSendSignal
+  */
+ void
+ SendProcSignal(pid_t pid, ProcSignalReason reason, BackendId backendId)
+ {
+     volatile ProcSignalSlot *slot;
+
+     if (backendId != InvalidBackendId)
+     {
+         slot = &ProcSignalSlots[backendId];
+
+         /*
+          * Note: Since there's no locking, it's possible that the target
+          * process detaches from shared memory and exits right after this
+          * test, before we set the flag and send signal. And the signal slot
+          * might even be recycled by a new process, so it's remotely possible
+          * that we set a flag for a wrong process. That's OK, all the signals
+          * are such that no harm is done if they're mistakenly fired.
+          */
+         if (slot->pss_pid == pid)
+         {
+             /* Atomically set the proper flag */
+             slot->pss_signalFlags[reason] = true;
+             /* Send signal */
+             kill(pid, SIGUSR1);
+         }
+     }
+     else
+     {
+         /* same as above, but linear search the array using pid */
+         int i;
+         for (i = 0; i < MaxBackends; i++)
+         {
+             slot = &ProcSignalSlots[i];
+             if (slot->pss_pid == pid)
+             {
+                 /* the above note about race conditions applies here too */
+
+                 /* Atomically set the proper flag */
+                 slot->pss_signalFlags[reason] = true;
+                 /* Send signal */
+                 kill(pid, SIGUSR1);
+
+                 break;
+             }
+         }
+     }
+ }
+
+ /*
+  * CheckProcSignal - check to see if a particular reason has been
+  * signaled, and clear the signal flag.  Should be called after receiving
+  * SIGUSR1.
+  */
+ static bool
+ CheckProcSignal(ProcSignalReason reason)
+ {
+     volatile ProcSignalSlot *slot;
+
+     slot = &ProcSignalSlots[MyBackendId];
+
+     /* Careful here --- don't clear flag if we haven't seen it set */
+     if (slot->pss_signalFlags[reason])
+     {
+         slot->pss_signalFlags[reason] = false;
+         return true;
+     }
+     return false;
+ }
+
+ /*
+  * procsignal_sigusr1_handler - handle SIGUSR1 signal.
+  */
+ void
+ procsignal_sigusr1_handler(SIGNAL_ARGS)
+ {
+     int        save_errno = errno;
+
+     if (CheckProcSignal(PROCSIG_CATCHUP_INTERRUPT))
+         HandleCatchupInterrupt();
+
+     errno = save_errno;
+ }
*** src/backend/storage/ipc/sinval.c
--- src/backend/storage/ipc/sinval.c
***************
*** 26,33 ****
   * Because backends sitting idle will not be reading sinval events, we
   * need a way to give an idle backend a swift kick in the rear and make
   * it catch up before the sinval queue overflows and forces it to go
!  * through a cache reset exercise.    This is done by sending SIGUSR1
!  * to any backend that gets too far behind.
   *
   * State for catchup events consists of two flags: one saying whether
   * the signal handler is currently allowed to call ProcessCatchupEvent
--- 26,33 ----
   * Because backends sitting idle will not be reading sinval events, we
   * need a way to give an idle backend a swift kick in the rear and make
   * it catch up before the sinval queue overflows and forces it to go
!  * through a cache reset exercise.    This is done by sending
!  * PROCSIG_CATHCUP_INTERRUPT to any backend that gets too far behind.
   *
   * State for catchup events consists of two flags: one saying whether
   * the signal handler is currently allowed to call ProcessCatchupEvent
***************
*** 144,152 **** ReceiveSharedInvalidMessages(


  /*
!  * CatchupInterruptHandler
   *
!  * This is the signal handler for SIGUSR1.
   *
   * If we are idle (catchupInterruptEnabled is set), we can safely
   * invoke ProcessCatchupEvent directly.  Otherwise, just set a flag
--- 144,152 ----


  /*
!  * HandleCatchupInterrupt
   *
!  * This is called when PROCSIG_CATCHUP_INTERRUPT signal is received.
   *
   * If we are idle (catchupInterruptEnabled is set), we can safely
   * invoke ProcessCatchupEvent directly.  Otherwise, just set a flag
***************
*** 156,168 **** ReceiveSharedInvalidMessages(
   * since there's no longer any reason to do anything.)
   */
  void
! CatchupInterruptHandler(SIGNAL_ARGS)
  {
-     int            save_errno = errno;
-
      /*
!      * Note: this is a SIGNAL HANDLER.    You must be very wary what you do
!      * here.
       */

      /* Don't joggle the elbow of proc_exit */
--- 156,166 ----
   * since there's no longer any reason to do anything.)
   */
  void
! HandleCatchupInterrupt(void)
  {
      /*
!      * Note: this called by a SIGNAL HANDLER.  You must be very wary what
!      * you do here.
       */

      /* Don't joggle the elbow of proc_exit */
***************
*** 216,223 **** CatchupInterruptHandler(SIGNAL_ARGS)
           */
          catchupInterruptOccurred = 1;
      }
-
-     errno = save_errno;
  }

  /*
--- 214,219 ----
***************
*** 289,295 **** DisableCatchupInterrupt(void)
  /*
   * ProcessCatchupEvent
   *
!  * Respond to a catchup event (SIGUSR1) from another backend.
   *
   * This is called either directly from the SIGUSR1 signal handler,
   * or the next time control reaches the outer idle loop (assuming
--- 285,292 ----
  /*
   * ProcessCatchupEvent
   *
!  * Respond to a catchup event (PROCSIG_CATCHUP_INTERRUPT) from another
!  * backend.
   *
   * This is called either directly from the SIGUSR1 signal handler,
   * or the next time control reaches the outer idle loop (assuming
*** src/backend/storage/ipc/sinvaladt.c
--- src/backend/storage/ipc/sinvaladt.c
***************
*** 21,26 ****
--- 21,27 ----
  #include "storage/backendid.h"
  #include "storage/ipc.h"
  #include "storage/proc.h"
+ #include "storage/procsignal.h"
  #include "storage/shmem.h"
  #include "storage/sinvaladt.h"
  #include "storage/spin.h"
***************
*** 138,143 **** typedef struct ProcState
--- 139,145 ----
  {
      /* procPid is zero in an inactive ProcState array entry. */
      pid_t        procPid;        /* PID of backend, for signaling */
+     BackendId    backendId;        /* backend ID, for signaling */
      /* nextMsgNum is meaningless if procPid == 0 or resetState is true. */
      int            nextMsgNum;        /* next message number to read */
      bool        resetState;        /* backend needs to reset its state */
***************
*** 236,241 **** CreateSharedInvalidationState(void)
--- 238,244 ----
      for (i = 0; i < shmInvalBuffer->maxBackends; i++)
      {
          shmInvalBuffer->procState[i].procPid = 0;            /* inactive */
+         shmInvalBuffer->procState[i].backendId = InvalidBackendId;
          shmInvalBuffer->procState[i].nextMsgNum = 0;        /* meaningless */
          shmInvalBuffer->procState[i].resetState = false;
          shmInvalBuffer->procState[i].signaled = false;
***************
*** 304,309 **** SharedInvalBackendInit(void)
--- 307,313 ----

      /* mark myself active, with all extant messages already read */
      stateP->procPid = MyProcPid;
+     stateP->backendId = MyBackendId;
      stateP->nextMsgNum = segP->maxMsgNum;
      stateP->resetState = false;
      stateP->signaled = false;
***************
*** 342,347 **** CleanupInvalidationState(int status, Datum arg)
--- 346,352 ----

      /* Mark myself inactive */
      stateP->procPid = 0;
+     stateP->backendId = InvalidBackendId;
      stateP->nextMsgNum = 0;
      stateP->resetState = false;
      stateP->signaled = false;
***************
*** 644,661 **** SICleanupQueue(bool callerHasWriteLock, int minFree)
          segP->nextThreshold = (numMsgs / CLEANUP_QUANTUM + 1) * CLEANUP_QUANTUM;

      /*
!      * Lastly, signal anyone who needs a catchup interrupt.  Since kill()
!      * might not be fast, we don't want to hold locks while executing it.
       */
      if (needSig)
      {
!         pid_t    his_pid = needSig->procPid;

          needSig->signaled = true;
          LWLockRelease(SInvalReadLock);
          LWLockRelease(SInvalWriteLock);
          elog(DEBUG4, "sending sinval catchup signal to PID %d", (int) his_pid);
!         kill(his_pid, SIGUSR1);
          if (callerHasWriteLock)
              LWLockAcquire(SInvalWriteLock, LW_EXCLUSIVE);
      }
--- 649,668 ----
          segP->nextThreshold = (numMsgs / CLEANUP_QUANTUM + 1) * CLEANUP_QUANTUM;

      /*
!      * Lastly, signal anyone who needs a catchup interrupt.  Since
!      * SendProcSignal() might not be fast, we don't want to hold locks while
!      * executing it.
       */
      if (needSig)
      {
!         pid_t his_pid = needSig->procPid;
!         BackendId his_backendId = needSig->backendId;

          needSig->signaled = true;
          LWLockRelease(SInvalReadLock);
          LWLockRelease(SInvalWriteLock);
          elog(DEBUG4, "sending sinval catchup signal to PID %d", (int) his_pid);
!         SendProcSignal(his_pid, PROCSIG_CATCHUP_INTERRUPT, his_backendId);
          if (callerHasWriteLock)
              LWLockAcquire(SInvalWriteLock, LW_EXCLUSIVE);
      }
*** src/backend/tcop/postgres.c
--- src/backend/tcop/postgres.c
***************
*** 59,64 ****
--- 59,65 ----
  #include "storage/bufmgr.h"
  #include "storage/ipc.h"
  #include "storage/proc.h"
+ #include "storage/procsignal.h"
  #include "storage/sinval.h"
  #include "tcop/fastpath.h"
  #include "tcop/pquery.h"
***************
*** 3180,3186 **** PostgresMain(int argc, char *argv[], const char *username)
       * of output during who-knows-what operation...
       */
      pqsignal(SIGPIPE, SIG_IGN);
!     pqsignal(SIGUSR1, CatchupInterruptHandler);
      pqsignal(SIGUSR2, NotifyInterruptHandler);
      pqsignal(SIGFPE, FloatExceptionHandler);

--- 3181,3187 ----
       * of output during who-knows-what operation...
       */
      pqsignal(SIGPIPE, SIG_IGN);
!     pqsignal(SIGUSR1, procsignal_sigusr1_handler);
      pqsignal(SIGUSR2, NotifyInterruptHandler);
      pqsignal(SIGFPE, FloatExceptionHandler);

*** src/backend/utils/init/postinit.c
--- src/backend/utils/init/postinit.c
***************
*** 39,44 ****
--- 39,45 ----
  #include "storage/lmgr.h"
  #include "storage/proc.h"
  #include "storage/procarray.h"
+ #include "storage/procsignal.h"
  #include "storage/sinvaladt.h"
  #include "storage/smgr.h"
  #include "utils/acl.h"
***************
*** 445,450 **** InitPostgres(const char *in_dbname, Oid dboid, const char *username,
--- 446,453 ----
      if (MyBackendId > MaxBackends || MyBackendId <= 0)
          elog(FATAL, "bad backend id: %d", MyBackendId);

+     ProcSignalInit();
+
      /*
       * bufmgr needs another initialization call too
       */
*** /dev/null
--- src/include/storage/procsignal.h
***************
*** 0 ****
--- 1,42 ----
+ /*-------------------------------------------------------------------------
+  *
+  * procsignal.h
+  *      routines for signaling processes XXX
+  *
+  *
+  * Portions Copyright (c) 1996-2008, PostgreSQL Global Development Group
+  * Portions Copyright (c) 1994, Regents of the University of California
+  *
+  * $PostgreSQL$
+  *
+  *-------------------------------------------------------------------------
+  */
+ #ifndef PROCSIGNAL_H
+ #define PROCSIGNAL_H
+
+
+ /*
+  * Reasons for signaling a process (a backend or an auxiliary process, like
+  * autovacuum worker). We can cope with simultaneous signals for different
+  * reasons. If the same reason is signaled multiple times in quick succession,
+  * however, the process is likely to observe only one notification of it.
+  * This is okay for the present uses.
+  */
+ typedef enum
+ {
+        PROCSIG_CATCHUP_INTERRUPT,      /* catchup interrupt */
+        NUM_PROCSIGNALS                         /* Must be last value of enum! */
+ } ProcSignalReason;
+
+ /*
+  * prototypes for functions in procsignal.c
+  */
+ extern Size ProcSignalShmemSize(void);
+ extern void ProcSignalShmemInit(void);
+ extern void ProcSignalInit(void);
+ extern void SendProcSignal(pid_t pid, ProcSignalReason reason,
+                            BackendId backendId);
+
+ extern void procsignal_sigusr1_handler(SIGNAL_ARGS);
+
+ #endif   /* PROCSIGNAL_H */
*** src/include/storage/sinval.h
--- src/include/storage/sinval.h
***************
*** 89,96 **** extern void ReceiveSharedInvalidMessages(
                        void (*invalFunction) (SharedInvalidationMessage *msg),
                               void (*resetFunction) (void));

! /* signal handler for catchup events (SIGUSR1) */
! extern void CatchupInterruptHandler(SIGNAL_ARGS);

  /*
   * enable/disable processing of catchup events directly from signal handler.
--- 89,96 ----
                        void (*invalFunction) (SharedInvalidationMessage *msg),
                               void (*resetFunction) (void));

! /* signal handler for catchup events (PROCSIG_CATCHUP_INTERRUPT) */
! extern void HandleCatchupInterrupt(void);

  /*
   * enable/disable processing of catchup events directly from signal handler.

Re: Multiplexing SUGUSR1

From
Heikki Linnakangas
Date:
Fujii Masao wrote:
> Hi,
> 
> On Wed, Dec 10, 2008 at 1:43 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
>>> I'm surprised you feel that way. You suggested earlier
>>> (http://archives.postgresql.org/message-id/28487.1221147665@sss.pgh.pa.us)
>>> that a solution that only works for processes attached to shared memory
>>> would probably suffice for now.
>> Well, I wasn't complaining about the dependence on being attached to
>> shared memory.  What I'm complaining about is the dependence on the
>> rather complex PGPROC data structure.
>>
>>> That seems hard, considering that we also want it to work without
>>> locking. Hmm, I presume we can use spinlocks in a signal handler?
>>> Perhaps some sort of a hash table protected by a spinlock would work.
>> No, locks are right out if the postmaster is supposed to be able to use
>> it.  What I was thinking of is a simple linear array of PIDs and
>> sig_atomic_t flags.  The slots could be assigned on the basis of
>> backendid, but callers trying to send a signal would have to scan the
>> array looking for the matching PID.  (This doesn't seem outlandishly
>> expensive considering that one is about to do a kernel call anyway.
>> You might be able to save a few cycles by having the PID array separate
>> from the flag array, which should improve the cache friendliness of the
>> scan.)  Also, for those callers who do have access to a PGPROC, there
>> could be a separate entry point that passes backendid instead of PID to
>> eliminate the search.
> 
> Thanks for the comment!
> I updated the patch so. Is this patch ready to apply?

On closer look, I don't see anything setting ProcSignalData.pid field. 
Which make me believe that the patch can't possibly work.

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com


Re: Multiplexing SUGUSR1

From
"Fujii Masao"
Date:
Hi,

> My version doesn't have support for auxiliary processes. Does the
> synchronous replication patch need that?

Yes, the background writer also generates the WAL like a backend,
so it has to be able to communicate with walsender.

> On closer look, I don't see anything setting ProcSignalData.pid field. Which
> make me believe that the patch can't possibly work.

Ooops! My patch has some bugs :(
I will update the patch based on yours, and add the support for auxiliary
processes into it.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


Re: Multiplexing SUGUSR1

From
"Fujii Masao"
Date:
Hi,

On Thu, Dec 11, 2008 at 10:55 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
> Hi,
>
>> My version doesn't have support for auxiliary processes. Does the
>> synchronous replication patch need that?
>
> Yes, the background writer also generates the WAL like a backend,
> so it has to be able to communicate with walsender.
>
>> On closer look, I don't see anything setting ProcSignalData.pid field. Which
>> make me believe that the patch can't possibly work.
>
> Ooops! My patch has some bugs :(
> I will update the patch based on yours, and add the support for auxiliary
> processes into it.

Or, should I leave renewal of the patch to you? Of course, if you don't have
time, I will do.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


Re: Multiplexing SUGUSR1

From
Heikki Linnakangas
Date:
Fujii Masao wrote:
> On Thu, Dec 11, 2008 at 10:55 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
>> I will update the patch based on yours, and add the support for auxiliary
>> processes into it.
> 
> Or, should I leave renewal of the patch to you? Of course, if you don't have
> time, I will do.

I can do it, it's just not a big deal.

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com


Re: Multiplexing SUGUSR1

From
Markus Wanner
Date:
Hi,

Alvaro Herrera wrote:
> No, the signalling needed here is far simpler than Markus' IMessage
> stuff.

Yup, see also Tom's comment [1].

For Postgres-R I'm currently multiplexing by embedding a message type in
the imessage data itself. So this part is certainly overlapping, yes.

Some of the messages I'm using do have additional payload data, others
don't. Moving this message type out of the "body" part of the message
itself and instead use the upcoming signal multiplexing could save a few
imessage types in favor of using these multiplexed signals. Most message
types require some additional data to be transferred, though.

From my point of view it's hard to understand why one should want to
move out exactly 32 or 64 bits (sig_atomic_t) of a message. From the
point of view of Postgres, it's certainly better than being bound to the
existing Unix signals.

Regards

Markus Wanner

[1]:
http://archives.postgresql.org/message-id/28487.1221147665@sss.pgh.pa.us


Re: Multiplexing SUGUSR1

From
"Fujii Masao"
Date:
Hi,

Sorry for this late reply.

On Thu, Dec 11, 2008 at 3:12 PM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
> Fujii Masao wrote:
>>
>> On Thu, Dec 11, 2008 at 10:55 AM, Fujii Masao <masao.fujii@gmail.com>
>> wrote:
>>>
>>> I will update the patch based on yours, and add the support for auxiliary
>>> processes into it.
>>
>> Or, should I leave renewal of the patch to you? Of course, if you don't
>> have
>> time, I will do.
>
> I can do it, it's just not a big deal.

Heikki, thanks for your offer! I'd like to leave it to you if it's OK with you.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


Re: Multiplexing SUGUSR1

From
Bruce Momjian
Date:
Is this for 8.4?

---------------------------------------------------------------------------

Heikki Linnakangas wrote:
> I've been looking at the signal handling part of the synchronous 
> replication patch. It looks OK, but one thing makes me worry.
> 
> To set or clear the flag from PGPROC, to send or handle a signal, we 
> have to acquire ProcArrayLock. Is that safe to do in a signal handler? 
> And is the performance impact of that acceptable?
> 
> 
> Another observation is that the patch introduces a new function called 
> SendProcSignal. Nothing wrong with that, except that there's an existing 
> function called ProcSendSignal, just above SendProcSignal, so there's 
> some potential for confusion. The old ProcSendSignal function uses the 
> per-backend semaphore to wake up a backend. It's only used to wait for 
> the cleanup lock in bufmgr.c. I'm tempted to remove that altogether, and 
> use the new signal multiplexing for that too, but OTOH if it works, 
> maybe I shouldn't touch it.
> 
> Attached is a patch with some minor changes I've made. Mostly cosmetic, 
> but I did modify the sinval code so that ProcState has PGPROC pointer 
> instead of backend pid, so that we don't need to search the ProcArray to 
> find the PGPROC struct of the backend we're signaling.
> 
> -- 
>    Heikki Linnakangas
>    EnterpriseDB   http://www.enterprisedb.com


> 
> -- 
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: Multiplexing SUGUSR1

From
Heikki Linnakangas
Date:
It's required by the sync replication patch, but has no value otherwise.

Bruce Momjian wrote:
> Is this for 8.4?
> 
> ---------------------------------------------------------------------------
> 
> Heikki Linnakangas wrote:
>> I've been looking at the signal handling part of the synchronous 
>> replication patch. It looks OK, but one thing makes me worry.
>>
>> To set or clear the flag from PGPROC, to send or handle a signal, we 
>> have to acquire ProcArrayLock. Is that safe to do in a signal handler? 
>> And is the performance impact of that acceptable?
>>
>>
>> Another observation is that the patch introduces a new function called 
>> SendProcSignal. Nothing wrong with that, except that there's an existing 
>> function called ProcSendSignal, just above SendProcSignal, so there's 
>> some potential for confusion. The old ProcSendSignal function uses the 
>> per-backend semaphore to wake up a backend. It's only used to wait for 
>> the cleanup lock in bufmgr.c. I'm tempted to remove that altogether, and 
>> use the new signal multiplexing for that too, but OTOH if it works, 
>> maybe I shouldn't touch it.
>>
>> Attached is a patch with some minor changes I've made. Mostly cosmetic,
>> but I did modify the sinval code so that ProcState has PGPROC pointer 
>> instead of backend pid, so that we don't need to search the ProcArray to 
>> find the PGPROC struct of the backend we're signaling.
>>
>> -- 
>>    Heikki Linnakangas
>>    EnterpriseDB   http://www.enterprisedb.com
> 
> 
>> -- 
>> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-hackers
> 


--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com


Re: Multiplexing SUGUSR1

From
"Fujii Masao"
Date:
Hi,

On Wed, Jan 7, 2009 at 3:12 PM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
> It's required by the sync replication patch, but has no value otherwise.

Yes. I'm reworking Synch Rep also including the patch.

BTW, attached is the patch.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachment

Re: Multiplexing SUGUSR1

From
Bruce Momjian
Date:
Heikki Linnakangas wrote:
> It's required by the sync replication patch, but has no value otherwise.

Well, we have talked about allowing more signalling long-term, and this
would accomplish that independent of the sync replication, so we might
want to revisit this someday if it isn't included in sync replication.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: Multiplexing SUGUSR1

From
Greg Stark
Date:
On 7 Jan 2009, at 09:47, Bruce Momjian <bruce@momjian.us> wrote:

> Heikki Linnakangas wrote:
>> It's required by the sync replication patch, but has no value  
>> otherwise.
>
> Well, we have talked about allowing more signalling long-term, and  
> this
> would accomplish that independent of the sync replication, so we might
> want to revisit this someday if it isn't included in sync replication.

I also needed this for the progress indicator patch. I used SIGQUIT  
for the proof-of-concept patch but I wouldn't want to lose that signal  
for real.

-- 
Greg


Re: Multiplexing SUGUSR1

From
Bruce Momjian
Date:
Greg Stark wrote:
> 
> On 7 Jan 2009, at 09:47, Bruce Momjian <bruce@momjian.us> wrote:
> 
> > Heikki Linnakangas wrote:
> >> It's required by the sync replication patch, but has no value  
> >> otherwise.
> >
> > Well, we have talked about allowing more signalling long-term, and  
> > this
> > would accomplish that independent of the sync replication, so we might
> > want to revisit this someday if it isn't included in sync replication.
> 
> I also needed this for the progress indicator patch. I used SIGQUIT  
> for the proof-of-concept patch but I wouldn't want to lose that signal  
> for real.

Yep, we want multiplexed signals independent of sync replication.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: Multiplexing SUGUSR1

From
"Fujii Masao"
Date:
Hi,

On Thu, Jan 8, 2009 at 2:14 AM, Bruce Momjian <bruce@momjian.us> wrote:
> Greg Stark wrote:
>>
>> On 7 Jan 2009, at 09:47, Bruce Momjian <bruce@momjian.us> wrote:
>>
>> > Heikki Linnakangas wrote:
>> >> It's required by the sync replication patch, but has no value
>> >> otherwise.
>> >
>> > Well, we have talked about allowing more signalling long-term, and
>> > this
>> > would accomplish that independent of the sync replication, so we might
>> > want to revisit this someday if it isn't included in sync replication.
>>
>> I also needed this for the progress indicator patch. I used SIGQUIT
>> for the proof-of-concept patch but I wouldn't want to lose that signal
>> for real.
>
> Yep, we want multiplexed signals independent of sync replication.

The updated patch of multiplexing SIGUSR1 is included in v5 of
synch-rep patch. (01_signal_handling.patch)
http://wiki.postgresql.org/wiki/NTT%27s_Development_Projects#Version_History

This patch can be also reviewed independent of synch-rep.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


Re: Multiplexing SUGUSR1

From
"Fujii Masao"
Date:
Hi,

On Sun, Jan 11, 2009 at 7:29 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
> Hi,
>
> On Thu, Jan 8, 2009 at 2:14 AM, Bruce Momjian <bruce@momjian.us> wrote:
>> Greg Stark wrote:
>>>
>>> On 7 Jan 2009, at 09:47, Bruce Momjian <bruce@momjian.us> wrote:
>>>
>>> > Heikki Linnakangas wrote:
>>> >> It's required by the sync replication patch, but has no value
>>> >> otherwise.
>>> >
>>> > Well, we have talked about allowing more signalling long-term, and
>>> > this
>>> > would accomplish that independent of the sync replication, so we might
>>> > want to revisit this someday if it isn't included in sync replication.
>>>
>>> I also needed this for the progress indicator patch. I used SIGQUIT
>>> for the proof-of-concept patch but I wouldn't want to lose that signal
>>> for real.
>>
>> Yep, we want multiplexed signals independent of sync replication.
>
> The updated patch of multiplexing SIGUSR1 is included in v5 of
> synch-rep patch. (01_signal_handling.patch)
> http://wiki.postgresql.org/wiki/NTT%27s_Development_Projects#Version_History
>
> This patch can be also reviewed independent of synch-rep.

Is this patch going to be reviewed and committed for 8.4?
Though I think that this is useful also to the patch of those other
than replication.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center