Re: Multiplexing SUGUSR1 - Mailing list pgsql-hackers
From | Greg Stark |
---|---|
Subject | Re: Multiplexing SUGUSR1 |
Date | |
Msg-id | 0791A9FC-DBBF-4CDD-880A-D7B151295A54@enterprisedb.com Whole thread Raw |
In response to | Multiplexing SUGUSR1 (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>) |
Responses |
Re: Multiplexing SUGUSR1
|
List | pgsql-hackers |
Does this signal multiplexing solve the "out of signals" problem we have generally? I need another signal for the progress indicator too. Or is this only useful for other users which need the same locks or other resources? -- Greg On 8 Dec 2008, at 08:04, Heikki Linnakangas <heikki.linnakangas@enterprisedb.com > wrote: > I've been looking at the signal handling part of the synchronous > replication patch. It looks OK, but one thing makes me worry. > > To set or clear the flag from PGPROC, to send or handle a signal, we > have to acquire ProcArrayLock. Is that safe to do in a signal > handler? And is the performance impact of that acceptable? > > > Another observation is that the patch introduces a new function > called SendProcSignal. Nothing wrong with that, except that there's > an existing function called ProcSendSignal, just above > SendProcSignal, so there's some potential for confusion. The old > ProcSendSignal function uses the per-backend semaphore to wake up a > backend. It's only used to wait for the cleanup lock in bufmgr.c. > I'm tempted to remove that altogether, and use the new signal > multiplexing for that too, but OTOH if it works, maybe I shouldn't > touch it. > > Attached is a patch with some minor changes I've made. Mostly > cosmetic, but I did modify the sinval code so that ProcState has > PGPROC pointer instead of backend pid, so that we don't need to > search the ProcArray to find the PGPROC struct of the backend we're > signaling. > > -- > Heikki Linnakangas > EnterpriseDB http://www.enterprisedb.com > *** a/src/backend/access/transam/twophase.c --- b/src/backend/access/ > transam/twophase.c *************** *** 287,292 **** > MarkAsPreparing(TransactionId xid, const char *gid, --- 287,293 ---- > gxact->proc.databaseId = databaseid; gxact->proc.roleId = owner; > gxact->proc.inCommit = false; + gxact->proc.signalFlags = 0; > gxact->proc.vacuumFlags = 0; gxact->proc.lwWaiting = false; gxact- > >proc.lwExclusive = false; *** a/src/backend/commands/async.c --- b/ > src/backend/commands/async.c *************** *** 915,923 **** > EnableNotifyInterrupt(void) * a frontend command. Signal handler > execution of inbound notifies * is disabled until the next > EnableNotifyInterrupt call. * ! * The SIGUSR1 signal handler also > needs to call this, so as to ! * prevent conflicts if one signal > interrupts the other. So we ! * must return the previous state of > the flag. */ bool DisableNotifyInterrupt(void) --- 915,924 ---- * a > frontend command. Signal handler execution of inbound notifies * is > disabled until the next EnableNotifyInterrupt call. * ! * This also > needs to be called when SIGUSR1 with ! * PROCSIG_CATCHUP_INTERRUPT > is received, so as to prevent conflicts ! * if one signal > interrupts the other. So we must return the previous ! * state of > the flag. */ bool DisableNotifyInterrupt(void) *************** *** > 954,960 **** ProcessIncomingNotify(void) nulls[Natts_pg_listener]; > bool catchup_enabled; ! /* Must prevent SIGUSR1 interrupt while I am > running */ catchup_enabled = DisableCatchupInterrupt(); if > (Trace_notify) --- 955,961 ---- nulls[Natts_pg_listener]; bool > catchup_enabled; ! /* Must prevent catchup interrupt while I am > running */ catchup_enabled = DisableCatchupInterrupt(); if > (Trace_notify) *** a/src/backend/postmaster/autovacuum.c --- b/src/ > backend/postmaster/autovacuum.c *************** *** 1477,1483 **** > AutoVacWorkerMain(int argc, char *argv[]) pqsignal(SIGALRM, > handle_sig_alarm); pqsignal(SIGPIPE, SIG_IGN); ! pqsignal(SIGUSR1, > CatchupInterruptHandler); /* We don't listen for async notifies */ > pqsignal(SIGUSR2, SIG_IGN); pqsignal(SIGFPE, FloatExceptionHandler); > --- 1477,1483 ---- pqsignal(SIGALRM, handle_sig_alarm); > pqsignal(SIGPIPE, SIG_IGN); ! pqsignal(SIGUSR1, > proc_sigusr1_handler); /* We don't listen for async notifies */ > pqsignal(SIGUSR2, SIG_IGN); pqsignal(SIGFPE, FloatExceptionHandler); > *** a/src/backend/storage/ipc/sinval.c --- b/src/backend/storage/ipc/ > sinval.c *************** *** 27,33 **** * need a way to give an idle > backend a swift kick in the rear and make * it catch up before the > sinval queue overflows and forces it to go * through a cache reset > exercise. This is done by sending SIGUSR1 ! * to any backend that > gets too far behind. * * State for catchup events consists of two > flags: one saying whether * the signal handler is currently allowed > to call ProcessCatchupEvent --- 27,34 ---- * need a way to give an > idle backend a swift kick in the rear and make * it catch up before > the sinval queue overflows and forces it to go * through a cache > reset exercise. This is done by sending SIGUSR1 ! * with > PROCSIG_CATCHUP_INTERRUPT to any backend that gets too far ! * > behind. * * State for catchup events consists of two flags: one > saying whether * the signal handler is currently allowed to call > ProcessCatchupEvent *************** *** 144,152 **** > ReceiveSharedInvalidMessages( /* ! * CatchupInterruptHandler * ! * > This is the signal handler for SIGUSR1. * * If we are idle > (catchupInterruptEnabled is set), we can safely * invoke > ProcessCatchupEvent directly. Otherwise, just set a flag --- 145,154 > ---- /* ! * HandleCatchupInterrupt * ! * This is called when SIGUSR1 > with PROCSIG_CATCHUP_INTERRUPT is ! * received. * * If we are idle > (catchupInterruptEnabled is set), we can safely * invoke > ProcessCatchupEvent directly. Otherwise, just set a flag > *************** *** 156,168 **** ReceiveSharedInvalidMessages( * > since there's no longer any reason to do anything.) */ void ! > CatchupInterruptHandler(SIGNAL_ARGS) { - int save_errno = errno; - / > * ! * Note: this is a SIGNAL HANDLER. You must be very wary what > you do ! * here. */ /* Don't joggle the elbow of proc_exit */ --- > 158,168 ---- * since there's no longer any reason to do anything.) > */ void ! HandleCatchupInterrupt(void) { /* ! * Note: this is called > by a SIGNAL HANDLER. ! * You must be very wary what you do here. > */ /* Don't joggle the elbow of proc_exit */ *************** *** > 216,223 **** CatchupInterruptHandler(SIGNAL_ARGS) */ > catchupInterruptOccurred = 1; } - - errno = save_errno; } /* --- > 216,221 ---- *************** *** 289,295 **** > DisableCatchupInterrupt(void) /* * ProcessCatchupEvent * ! * Respond > to a catchup event (SIGUSR1) from another backend. * * This is > called either directly from the SIGUSR1 signal handler, * or the > next time control reaches the outer idle loop (assuming --- 287,294 > ---- /* * ProcessCatchupEvent * ! * Respond to a catchup event > (SIGUSR1 with PROCSIG_CATCHUP_INTERRUPT) ! * from another backend. * > * This is called either directly from the SIGUSR1 signal handler, * > or the next time control reaches the outer idle loop (assuming *** a/ > src/backend/storage/ipc/sinvaladt.c --- b/src/backend/storage/ipc/ > sinvaladt.c *************** *** 21,26 **** --- 21,27 ---- #include > "storage/backendid.h" #include "storage/ipc.h" #include "storage/ > proc.h" + #include "storage/procarray.h" #include "storage/shmem.h" > #include "storage/sinvaladt.h" #include "storage/spin.h" > *************** *** 136,144 **** /* Per-backend state in shared > invalidation structure */ typedef struct ProcState { ! /* procPid is > zero in an inactive ProcState array entry. */ ! pid_t procPid; /* > PID of backend, for signaling */ ! /* nextMsgNum is meaningless if > procPid == 0 or resetState is true. */ int nextMsgNum; /* next > message number to read */ bool resetState; /* backend needs to > reset its state */ bool signaled; /* backend has been sent catchup > signal */ --- 137,145 ---- /* Per-backend state in shared > invalidation structure */ typedef struct ProcState { ! /* proc is > NULL in an inactive ProcState array entry. */ ! PGPROC *proc; /* > PGPROC entry of backend, for signaling */ ! /* nextMsgNum is > meaningless if proc == NULL or resetState is true. */ int > nextMsgNum; /* next message number to read */ bool resetState; /* > backend needs to reset its state */ bool signaled; /* backend has > been sent catchup signal */ *************** *** 235,241 **** > CreateSharedInvalidationState(void) /* Mark all backends > inactive, and initialize nextLXID */ for (i = 0; i < shmInvalBuffer- > >maxBackends; i++) { ! shmInvalBuffer->procState[i].procPid = > 0; /* inactive */ shmInvalBuffer->procState[i].nextMsgNum = 0; /* > meaningless */ shmInvalBuffer->procState[i].resetState = false; > shmInvalBuffer->procState[i].signaled = false; --- 236,242 ---- /* > Mark all backends inactive, and initialize nextLXID */ for (i = 0; i > < shmInvalBuffer->maxBackends; i++) { ! shmInvalBuffer- > >procState[i].proc = NULL; /* inactive */ shmInvalBuffer- > >procState[i].nextMsgNum = 0; /* meaningless */ shmInvalBuffer- > >procState[i].resetState = false; shmInvalBuffer- > >procState[i].signaled = false; *************** *** 266,272 **** > SharedInvalBackendInit(void) /* Look for a free entry in the > procState array */ for (index = 0; index < segP->lastBackend; index+ > +) { ! if (segP->procState[index].procPid == 0) /* inactive slot? > */ { stateP = &segP->procState[index]; break; --- 267,273 ---- /* > Look for a free entry in the procState array */ for (index = 0; > index < segP->lastBackend; index++) { ! if (segP- > >procState[index].proc == NULL) /* inactive slot? */ { stateP = > &segP->procState[index]; break; *************** *** 278,284 **** > SharedInvalBackendInit(void) if (segP->lastBackend < segP- > >maxBackends) { stateP = &segP->procState[segP->lastBackend]; ! > Assert(stateP->procPid == 0); segP->lastBackend++; } else --- > 279,285 ---- if (segP->lastBackend < segP->maxBackends) { stateP = > &segP->procState[segP->lastBackend]; ! Assert(stateP->proc == NULL); > segP->lastBackend++; } else *************** *** 303,309 **** > SharedInvalBackendInit(void) nextLocalTransactionId = stateP- > >nextLXID; /* mark myself active, with all extant messages already > read */ ! stateP->procPid = MyProcPid; stateP->nextMsgNum = segP- > >maxMsgNum; stateP->resetState = false; stateP->signaled = false; > --- 304,310 ---- nextLocalTransactionId = stateP->nextLXID; /* mark > myself active, with all extant messages already read */ ! stateP- > >proc = MyProc; stateP->nextMsgNum = segP->maxMsgNum; stateP- > >resetState = false; stateP->signaled = false; *************** *** > 341,347 **** CleanupInvalidationState(int status, Datum arg) stateP- > >nextLXID = nextLocalTransactionId; /* Mark myself inactive */ ! > stateP->procPid = 0; stateP->nextMsgNum = 0; stateP->resetState = > false; stateP->signaled = false; --- 342,348 ---- stateP->nextLXID = > nextLocalTransactionId; /* Mark myself inactive */ ! stateP- > >proc = NULL; stateP->nextMsgNum = 0; stateP->resetState = false; > stateP->signaled = false; *************** *** 349,355 **** > CleanupInvalidationState(int status, Datum arg) /* Recompute index > of last active backend */ for (i = segP->lastBackend; i > 0; i--) > { ! if (segP->procState[i - 1].procPid != 0) break; } segP- > >lastBackend = i; --- 350,356 ---- /* Recompute index of last active > backend */ for (i = segP->lastBackend; i > 0; i--) { ! if (segP- > >procState[i - 1].proc != NULL) break; } segP->lastBackend = i; > *************** *** 374,380 **** BackendIdIsActive(int backendID) > { ProcState *stateP = &segP->procState[backendID - 1]; ! result = > (stateP->procPid != 0); } else result = false; --- 375,381 ---- > { ProcState *stateP = &segP->procState[backendID - 1]; ! result = > (stateP->proc != NULL); } else result = false; *************** *** > 590,596 **** SICleanupQueue(bool callerHasWriteLock, int minFree) > int n = stateP->nextMsgNum; /* Ignore if inactive or already in > reset state */ ! if (stateP->procPid == 0 || stateP->resetState) > continue; /* --- 591,597 ---- int n = stateP->nextMsgNum; /* Ignore > if inactive or already in reset state */ ! if (stateP->proc == NULL > || stateP->resetState) continue; /* *************** *** 644,661 **** > SICleanupQueue(bool callerHasWriteLock, int minFree) segP- > >nextThreshold = (numMsgs / CLEANUP_QUANTUM + 1) * CLEANUP_QUANTUM; / > * ! * Lastly, signal anyone who needs a catchup interrupt. Since > kill() ! * might not be fast, we don't want to hold locks while > executing it. */ if (needSig) { ! pid_t his_pid = needSig->procPid; > needSig->signaled = true; LWLockRelease(SInvalReadLock); > LWLockRelease(SInvalWriteLock); ! elog(DEBUG4, "sending sinval > catchup signal to PID %d", (int) his_pid); ! kill(his_pid, SIGUSR1); > if (callerHasWriteLock) LWLockAcquire(SInvalWriteLock, > LW_EXCLUSIVE); } --- 645,664 ---- segP->nextThreshold = (numMsgs / > CLEANUP_QUANTUM + 1) * CLEANUP_QUANTUM; /* ! * Lastly, signal anyone > who needs a catchup interrupt. Since ! * SendProcSignal() might not > be fast, we don't want to hold locks while ! * executing it. */ if > (needSig) { ! PGPROC *his_proc = needSig->proc; needSig->signaled = > true; LWLockRelease(SInvalReadLock); > LWLockRelease(SInvalWriteLock); ! elog(DEBUG4, "sending sinval > catchup signal to PID %d", ! (int) his_proc->pid); ! > SendProcSignal(his_proc, PROCSIG_CATCHUP_INTERRUPT); if > (callerHasWriteLock) LWLockAcquire(SInvalWriteLock, LW_EXCLUSIVE); } > *** a/src/backend/storage/lmgr/proc.c --- b/src/backend/storage/lmgr/ > proc.c *************** *** 289,294 **** InitProcess(void) --- > 289,295 ---- MyProc->databaseId = InvalidOid; MyProc->roleId = > InvalidOid; MyProc->inCommit = false; + MyProc->signalFlags = 0; > MyProc->vacuumFlags = 0; if (IsAutoVacuumWorkerProcess()) MyProc- > >vacuumFlags |= PROC_IS_AUTOVACUUM; *************** *** 428,433 **** > InitAuxiliaryProcess(void) --- 429,435 ---- MyProc->databaseId = > InvalidOid; MyProc->roleId = InvalidOid; MyProc->inCommit = false; + > MyProc->signalFlags = 0; /* we don't set the "is autovacuum" flag in > the launcher */ MyProc->vacuumFlags = 0; MyProc->lwWaiting = false; > *************** *** 1277,1282 **** ProcSendSignal(int pid) --- > 1279,1330 ---- PGSemaphoreUnlock(&proc->sem); } + /* + * > SendProcSignal - send the signal with the reason to the process + * > (such as backend, autovacuum worker and auxiliary process) + * > identified by proc. + */ + void + SendProcSignal(PGPROC *proc, uint8 > reason) + { + int pid; + + if (proc == NULL) + return; + + > LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE); + pid = proc->pid; + if > (pid != 0) + proc->signalFlags |= reason; + > LWLockRelease(ProcArrayLock); + + /* Send SIGUSR1 to the process */ > + kill(pid, SIGUSR1); + } + + /* + * CheckProcSignal - check to see > if the particular reason has been + * signaled, and clear the signal > flag. Should be called after + * receiving SIGUSR1. + */ + bool + > CheckProcSignal(uint8 reason) + { + LWLockAcquire(ProcArrayLock, > LW_EXCLUSIVE); + + /* Careful here --- don't clear flag if we > haven't seen it set */ + if (MyProc->signalFlags & reason) + { + > MyProc->signalFlags &= ~reason; + LWLockRelease(ProcArrayLock); + > return true; + } + + LWLockRelease(ProcArrayLock); + + return > false; + } + / > *** > *** > *** > ******************************************************************** > * SIGALRM interrupt support *** a/src/backend/tcop/postgres.c --- b/ > src/backend/tcop/postgres.c *************** *** 2437,2442 **** > drop_unnamed_stmt(void) --- 2437,2464 ---- */ /* + * > proc_sigusr1_handler - handle SIGUSR1 signal. + * + * SIGUSR1 is > multiplexed to handle multiple different events. The signalFlags + * > bitmask in PGPROC indicates which events have been signaled. + */ + > void + proc_sigusr1_handler(SIGNAL_ARGS) + { + int save_errno = > errno; + + if (CheckProcSignal(PROCSIG_CATCHUP_INTERRUPT)) + { + /* > + * Catchup interrupt has been sent. + */ + > HandleCatchupInterrupt(); + } + + errno = save_errno; + } + + /* * > quickdie() occurs when signalled SIGQUIT by the postmaster. * * Some > backend has bought the farm, *************** *** 3180,3186 **** > PostgresMain(int argc, char *argv[], const char *username) * of > output during who-knows-what operation... */ pqsignal(SIGPIPE, > SIG_IGN); ! pqsignal(SIGUSR1, CatchupInterruptHandler); > pqsignal(SIGUSR2, NotifyInterruptHandler); pqsignal(SIGFPE, > FloatExceptionHandler); --- 3202,3208 ---- * of output during who- > knows-what operation... */ pqsignal(SIGPIPE, SIG_IGN); ! > pqsignal(SIGUSR1, proc_sigusr1_handler); pqsignal(SIGUSR2, > NotifyInterruptHandler); pqsignal(SIGFPE, FloatExceptionHandler); > *** a/src/include/storage/proc.h --- b/src/include/storage/proc.h > *************** *** 38,43 **** struct XidCache --- 38,46 ---- > TransactionId xids[PGPROC_MAX_CACHED_SUBXIDS]; }; + /* Signals for > PGPROC->signalFlags */ + #define PROCSIG_CATCHUP_INTERRUPT 0x01 /* > catchup interrupt */ + /* Flags for PGPROC->vacuumFlags */ #define > PROC_IS_AUTOVACUUM 0x01 /* is it an autovac worker? */ #define > PROC_IN_VACUUM 0x02 /* currently running lazy vacuum */ > *************** *** 91,96 **** struct PGPROC --- 94,100 ---- bool > inCommit; /* true if within commit critical section */ + uint8 > signalFlags; /* bitmask of signals raised, see above */ uint8 > vacuumFlags; /* vacuum-related flags, see above */ /* Info about > LWLock the process is currently waiting for, if any. */ > *************** *** 171,176 **** extern void LockWaitCancel(void); > --- 175,183 ---- extern void ProcWaitForSignal(void); extern void > ProcSendSignal(int pid); + extern void SendProcSignal(PGPROC *proc, > uint8 reason); + extern bool CheckProcSignal(uint8 reason); + extern > bool enable_sig_alarm(int delayms, bool is_statement_timeout); > extern bool disable_sig_alarm(bool is_statement_timeout); extern > void handle_sig_alarm(SIGNAL_ARGS); *** a/src/include/storage/ > sinval.h --- b/src/include/storage/sinval.h *************** *** > 90,96 **** extern void ReceiveSharedInvalidMessages( void > (*resetFunction) (void)); /* signal handler for catchup events > (SIGUSR1) */ ! extern void CatchupInterruptHandler(SIGNAL_ARGS); /* > * enable/disable processing of catchup events directly from signal > handler. --- 90,96 ---- void (*resetFunction) (void)); /* signal > handler for catchup events (SIGUSR1) */ ! extern void > HandleCatchupInterrupt(void); /* * enable/disable processing of > catchup events directly from signal handler. *** a/src/include/tcop/ > tcopprot.h --- b/src/include/tcop/tcopprot.h *************** *** > 56,61 **** extern List *pg_plan_queries(List *querytrees, int > cursorOptions, --- 56,62 ---- extern bool assign_max_stack_depth(int > newval, bool doit, GucSource source); + extern void > proc_sigusr1_handler(SIGNAL_ARGS); extern void die(SIGNAL_ARGS); > extern void quickdie(SIGNAL_ARGS); extern void authdie(SIGNAL_ARGS); > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers
pgsql-hackers by date: