Re: Optimising latch signals - Mailing list pgsql-hackers
From | Thomas Munro |
---|---|
Subject | Re: Optimising latch signals |
Date | |
Msg-id | CA+hUKGL=hDG8Xq0Xgx-tLEiOpJ_QNB_gRtC=Ta8H=X3DKeUjpw@mail.gmail.com Whole thread Raw |
In response to | Optimising latch signals (Thomas Munro <thomas.munro@gmail.com>) |
Responses |
Re: Optimising latch signals
|
List | pgsql-hackers |
On Sun, Aug 9, 2020 at 11:48 PM Thomas Munro <thomas.munro@gmail.com> wrote: > Here are some more experimental patches to reduce system calls. > > 0001 skips sending signals when the recipient definitely isn't > waiting, using a new flag-and-memory-barrier dance. This seems to > skip around 12% of the kill() calls for "make check", and probably > helps with some replication configurations that do a lot of > signalling. Patch by Andres (with a small memory barrier adjustment > by me). > > 0002 gets rid of the latch self-pipe on Linux systems. > > 0003 does the same on *BSD/macOS systems. Here's a rebase over the recent signal handler/mask reorganisation. Some thoughts, on looking at this again after a while: 1. It's a bit clunky that pqinitmask() takes a new argument to say whether SIGURG should be blocked; that's because the knowledge of which latch implementation we're using is private to latch.c, and only the epoll version needs to block it. I wonder how to make that tidier. 2. It's a bit weird to have UnBlockSig (SIGURG remains blocked for epoll builds) and UnBlockAllSig (SIGURG is also unblocked). Maybe the naming is confusing. 3. Maybe it's strange to continue to use overloaded SIGUSR1 on non-epoll, non-kqueue systems; perhaps we should use SIGURG everywhere. 4. As a nano-optimisation, SetLatch() on a latch the current process owns might as well use raise(SIGURG) rather than kill(). This is necessary to close races when SetLatch(MyLatch) runs in a signal handler. In other words, although this patch uses signal blocking to close the race when other processes call SetLatch() and send us SIGURG, there's still a race if, say, SIGINT is sent to the checkpointer and it sets its own latch from its SIGINT handler function; in the user context it may be in WaitEventSetWait() having just seen latch->is_set == false, and now be about to enter epoll_pwait()/kevent() after the signal handler returns, so we need to give it a reason not to go to sleep. By way of motivation for removing the self-pipe, and where possible also the signal handler, here is a trace of the WAL writer handling three requests to write data, on a FreeBSD system, with the patch: kevent(9,0x0,0,{ SIGURG,... },1,{ 0.200000000 }) = 1 (0x1) pwrite(4,"\b\M-Q\^E\0\^A\0\0\0\0\0\M-/\^\"...,8192,0xaf0000) = 8192 (0x2000) kevent(9,0x0,0,{ SIGURG,... },1,{ 0.200000000 }) = 1 (0x1) pwrite(4,"\b\M-Q\^E\0\^A\0\0\0\0 \M-/\^\\0"...,8192,0xaf2000) = 8192 (0x2000) kevent(9,0x0,0,{ SIGURG,... },1,{ 0.200000000 }) = 1 (0x1) pwrite(4,"\b\M-Q\^E\0\^A\0\0\0\0`\M-/\^\\0"...,8192,0xaf6000) = 8192 (0x2000) Here is the same thing on unpatched master: kevent(11,0x0,0,0x801c195b0,1,{ 0.200000000 }) ERR#4 'Interrupted system call' SIGNAL 30 (SIGUSR1) code=SI_USER pid=66575 uid=1001 sigprocmask(SIG_SETMASK,{ SIGUSR1 },0x0) = 0 (0x0) write(10,"\0",1) = 1 (0x1) sigreturn(0x7fffffffc880) EJUSTRETURN pwrite(4,"\b\M-Q\^E\0\^A\0\0\0\0`\M-w)\0\0"...,8192,0xf76000) = 8192 (0x2000) kevent(11,0x0,0,{ 9,EVFILT_READ,0x0,0,0x1,0x801c19580 },1,{ 0.200000000 }) = 1 (0x1) read(9,"\0",16) = 1 (0x1) kevent(11,0x0,0,0x801c195b0,1,{ 0.200000000 }) ERR#4 'Interrupted system call' SIGNAL 30 (SIGUSR1) code=SI_USER pid=66575 uid=1001 sigprocmask(SIG_SETMASK,{ SIGUSR1 },0x0) = 0 (0x0) write(10,"\0",1) = 1 (0x1) sigreturn(0x7fffffffc880) EJUSTRETURN pwrite(4,"\b\M-Q\^E\0\^A\0\0\0\0 \M-y)\0\0"...,8192,0xf92000) = 8192 (0x2000) kevent(11,0x0,0,{ 9,EVFILT_READ,0x0,0,0x1,0x801c19580 },1,{ 0.200000000 }) = 1 (0x1) SIGNAL 30 (SIGUSR1) code=SI_USER pid=66575 uid=1001 sigprocmask(SIG_SETMASK,{ SIGUSR1 },0x0) = 0 (0x0) write(10,"\0",1) = 1 (0x1) sigreturn(0x7fffffffc880) EJUSTRETURN read(9,"\0\0",16) = 2 (0x2) kevent(11,0x0,0,0x801c195b0,1,{ 0.200000000 }) ERR#4 'Interrupted system call' SIGNAL 30 (SIGUSR1) code=SI_USER pid=66575 uid=1001 sigprocmask(SIG_SETMASK,{ SIGUSR1 },0x0) = 0 (0x0) write(10,"\0",1) = 1 (0x1) sigreturn(0x7fffffffc880) EJUSTRETURN pwrite(4,"\b\M-Q\^E\0\^A\0\0\0\0\0\M-z)\0"...,8192,0xfa0000) = 8192 (0x2000) kevent(11,0x0,0,{ 9,EVFILT_READ,0x0,0,0x1,0x801c19580 },1,{ 0.200000000 }) = 1 (0x1) read(9,"\0",16) = 1 (0x1) The improvement isn't quite as good on Linux, because as far as I can tell you still have to have an (empty) signal handler installed and it runs (can we find a way to avoid that?), but you still get to skip all the pipe manipulation.
Attachment
pgsql-hackers by date: