Re: Reducing WaitEventSet syscall churn - Mailing list pgsql-hackers
From | Thomas Munro |
---|---|
Subject | Re: Reducing WaitEventSet syscall churn |
Date | |
Msg-id | CA+hUKG+=sEsxM847hkDM37CtXZbAqi+Q6un2uUpPPkhuKmAHzQ@mail.gmail.com Whole thread Raw |
In response to | Re: Reducing WaitEventSet syscall churn (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: Reducing WaitEventSet syscall churn
|
List | pgsql-hackers |
On Sun, Jul 12, 2020 at 7:22 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > [...complaints about 0005 and 0006...] We already have > PGEVT_CONNRESET and PGEVT_CONNDESTROY as application-visible connection > state change events, and I don't see why those aren't sufficient. I think Horiguchi-san's general idea of using event callbacks for this sounds much more promising than my earlier idea of exposing a change counter (that really was terrible). If it can be done with existing events then that's even better. Perhaps he and/or I can look into that for the next CF. In the meantime, here's a rebase of the more straightforward patches in the stack. These are the ones that deal only with fixed sets of file descriptors, and they survive check-world on Linux, Linux+EXEC_BACKEND (with ASLR disabled) and FreeBSD, and at least check on macOS and Windows (my CI recipes need more work to get check-world working on those two). There's one user-visible change that I'd appreciate feedback on: I propose to drop the FATAL error when the postmaster goes away, to make things more consistent. See below for more on that. Responding to earlier review from Horiguchi-san: On Tue, Mar 10, 2020 at 12:20 PM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote: > > 0002: "Use a long lived WaitEventSet for WaitLatch()." > > > > In the last version, I had a new function WaitMyLatch(), but that > > didn't help with recoveryWakeupLatch. Switching between latches > > doesn't require a syscall, so I didn't want to have a separate WES and > > function just for that. So I went back to using plain old > > WaitLatch(), and made it "modify" the latch every time before waiting > > on CommonWaitSet. An alternative would be to get rid of the concept > > of latches other than MyLatch, and change the function to > > WaitMyLatch(). A similar thing happens for exit_on_postmaster_death, > > for which I didn't want to have a separate WES, so I just set that > > flag every time. Thoughts? > > It is surely an improvement from that we create a full-fledged WES > every time. The name CommonWaitSet gives an impression as if it is > used for a variety of waitevent sets, but it is used only by > WaitLatch. So I would name it LatchWaitSet. With that name I won't be > surprised by that the function is pointing WL_LATCH_SET by index 0 > without any explanation when calling ModifyWaitSet. Ok, I changed it to LatchWaitSet. I also replaced the index 0 with a symbolic name LatchWaitSetLatchPos, to make that clearer. > @@ -700,7 +739,11 @@ FreeWaitEventSet(WaitEventSet *set) > ReleaseExternalFD(); > #elif defined(WAIT_USE_KQUEUE) > close(set->kqueue_fd); > - ReleaseExternalFD(); > + if (set->kqueue_fd >= 0) > + { > + close(set->kqueue_fd); > + ReleaseExternalFD(); > + } > > Did you forget to remove the close() outside the if block? > Don't we need the same amendment for epoll_fd with kqueue_fd? Hmm, maybe I screwed that up when resolving a conflict with the ReleaseExternalFD() stuff. Fixed. > WaitLatch is defined as "If the latch is already set (and WL_LATCH_SET > is given), the function returns immediately.". But now the function > reacts to latch even if WL_LATCH_SET is not set. I think actually it > is alwys set so I think we need to modify Assert and function comment > following the change. It seems a bit silly to call WaitLatch() if you don't want to wait for a latch, but I think we can keep that comment and logic by assigning set->latch = NULL when you wait without WL_LATCH_SET. I tried that in the attached. > > 0004: "Introduce RemoveWaitEvent()." > > > > We'll need that to be able to handle connections that are closed and > > reopened under the covers by libpq (replication, postgres_fdw). We > > also wanted this on a couple of other threads for multiplexing FDWs, > > to be able to adjust the wait set dynamically for a proposed async > > Append feature. > > > > The implementation is a little naive, and leaves holes in the > > "pollfds" and "handles" arrays (for poll() and win32 implementations). > > That could be improved with a bit more footwork in a later patch. > > > > XXX The Windows version is based on reading documentation. I'd be > > very interested to know if check-world passes (especially > > contrib/postgres_fdw and src/test/recovery). Unfortunately my > > appveyor.yml fu is not yet strong enough. > > I didn't find the documentation about INVALID_HANDLE_VALUE in > lpHandles. Could you give me the URL for that? I was looking for how you do the equivalent of Unix file descriptor -1 in a call to poll(), and somewhere I read that INVALID_HANDLE_VALUE has the right effect. I can't find that reference now. Apparently it works because that's the pseudo-handle value -1 that is returned by GetCurrentProcess(), and waiting for your own process waits forever so it's a suitable value for holes in an array of event handles. We should probably call GetCurrentProcess() instead, but really that is just stand-in code: we should rewrite it so that we don't need holes! That might require a second array for use by the poll and win32 implementations. Let's return to that in a later CF? On Mon, Mar 30, 2020 at 6:15 PM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote: > > 0008: "Use WL_EXIT_ON_PM_DEATH in FeBeWaitSet." > > > > The FATAL message you get if you happen to be waiting for IO rather > > than waiting somewhere else seems arbitrarily different. By switching > > to a standard automatic exit, it opens the possibility of using > > FeBeWaitSet in a couple more places that would otherwise need to > > create their own WES (see also [1]). Thoughts? > > Don't we really need any message on backend disconnection due to > postmaster death? As for authentication code, it seems to me the > rationale for not writing the log is that the connection has not been > established at the time. (That depends on what we think the > "connection" is.) To me, it looks like this variation is just from a time when postmaster death handling was less standardised. The comment (removed by this patch) even introduces the topic of postmaster exit as if you've never heard of it, in this arbitrary location in the tree, one wait point among many (admittedly one that is often reached). I don't think there is any good reason for random timing to affect whether or not you get a hand crafted FATAL message. However, if others don't like this change, we could drop this patch and still use FeBeWaitSet for walsender.c. It'd just need to handle postmaster death explicitly. It felt weird to be adding new code that has to handle postmaster death explicitly, which is what led me to notice that the existing FeBeWaitSet coding (and resulting user experience) is different from other code. > > 0009: "Use FeBeWaitSet for walsender.c." > > > > Enabled by 0008. > > It works and doesn't change behavior. But I found it a bit difficult > to find what events the WaitEventSetWait waits. Maybe a comment at > the caller sites would be sufficient. I think any comment about the > bare number "0" as the event position of ModifyWaitEvent is also > needed. Agreed. I changed it to use symbolic names FeBeWaitSet{Socket,Latch}Pos in the new code and also in the pre-existing code like this. > By the way I found that pqcomm.c uses -1 instead of PGINVALID_SOCKET > for AddWaitEventToSet. Oh yeah, that's a pre-existing problem. Fixed, since that code was changed by the patch anyway.
Attachment
- v5-0001-Use-a-long-lived-WaitEventSet-for-WaitLatch.patch
- v5-0002-Use-WaitLatch-for-condition-variables.patch
- v5-0003-Introduce-a-WaitEventSet-for-the-stats-collector.patch
- v5-0004-Use-WL_EXIT_ON_PM_DEATH-in-FeBeWaitSet.patch
- v5-0005-Introduce-symbolic-names-for-FeBeWaitSet-position.patch
- v5-0006-Use-FeBeWaitSet-for-walsender.c.patch
pgsql-hackers by date: