Thread: modifying WaitEventSets (was: Performance degradation in commit ac1d794)
On Sun, Mar 20, 2016 at 8:31 AM, Thomas Munro <thomas.munro@enterprisedb.com> wrote: > One thing that I want to do but can't with this interface is remove an > fd from the set. I can AddWaitEventToSet returning a position, and I > can ModifyWait to provide new event mask by position including zero > mask, I can't actually remove the fd (for example to avoid future > error events that can't be masked, or to allow that fd to be closed > and perhaps allow that fd number to coincidentally be readded later, > and just generally to free the slot). There is an underlying way to > remove an fd from a set with poll (sort of), epoll, kqueue. (Not sure > about Windows. But surely...). I wonder if there should be > RemoveWaitEventFromSet(set, position) which recycles event slots, > sticking them on a freelist (and setting corresponding pollfd structs' > fd to -1). In practice, you're not likely to want to completely forget about a file descriptor and wait for some totally new file descriptor, or at least not very often. You're much more likely to have a set of fds that you care about, but the events that you care about for each one vary over time, and it may often by the empty set for some of those fds. For example, in the case of Append -> {Foreign Scan x many}, you will have an fd for each foreign scan. When you dispatch a query to a particular server, you're now interested in whether that fd is ready-ready (or, if SSL is involved, maybe sometimes you care about write-ready instead) until you get back all the query results from that server. At that point, you no longer care about events on that fd until you dispatch the next query - and then you again do. So what's the best API for that? One idea is to change ModifyWaitEvent to accept events = 0 instead of failing an assertion inside WaitEventAdjustEpoll. We don't want to wait for EPOLLERR | EPOLLHUP in that case since we'd have to wait event to return if one of those things occurred. It would be easy enough to rejigger that code so that we pass epoll_ev.events as 0 in that case, but I think it doesn't help: epoll_ctl(2) says epoll_wait(2) always waits for those events. Instead I think we'd have to use EPOLL_CTL_DEL in that case, which is a problem: when the user next calls ModifyWaitEvent, we would need to use EPOLL_CTL_ADD rather than EPOLL_CTL_MOD, but how would we know that? We'd have to add state for that somewhere. Alternatively, we could implement RemoveWaitEventFromSet(set, position), as you propose. That, too, needs some kind of additional state, because we've got to track which positions are unused. I'd be inclined to guess that a bitmap would be better than a linked list; for the storage space of one pointer, we could handle all WaitEventSets with nevents <= 64, which is hard to beat. Yet another idea is to have a new event WL_SOCKET_ERROR which waits for only EPOLLERR | EPOLLHUP. So, if you don't care about the socket being readable or writeable at the moment, but you still vaguely care about the file descriptor, you can do ModifyWaitEvent(set, pos, WL_SOCKET_ERROR, NULL). That's actually kind of nice, because now you can still detect error/eof conditions on sockets that you are not otherwise waiting for, and report the errors promptly. At the moment, I'm inclined to think this might be the best option... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: modifying WaitEventSets (was: Performance degradation in commit ac1d794)
From
Andres Freund
Date:
Hi, On 2016-05-03 23:09:28 -0400, Robert Haas wrote: > So what's the best API for that? One idea is to change > ModifyWaitEvent to accept events = 0 instead of failing an assertion > inside WaitEventAdjustEpoll. We don't want to wait for EPOLLERR | > EPOLLHUP in that case since we'd have to wait event to return if one > of those things occurred. It would be easy enough to rejigger that > code so that we pass epoll_ev.events as 0 in that case, but I think it > doesn't help: epoll_ctl(2) says epoll_wait(2) always waits for those > events. Instead I think we'd have to use EPOLL_CTL_DEL in that case, > which is a problem: when the user next calls ModifyWaitEvent, we would > need to use EPOLL_CTL_ADD rather than EPOLL_CTL_MOD, but how would we > know that? We'd have to add state for that somewhere. Right. > Yet another idea is to have a new event WL_SOCKET_ERROR which waits > for only EPOLLERR | EPOLLHUP. So, if you don't care about the socket > being readable or writeable at the moment, but you still vaguely care > about the file descriptor, you can do ModifyWaitEvent(set, pos, > WL_SOCKET_ERROR, NULL). That's actually kind of nice, because now you > can still detect error/eof conditions on sockets that you are not > otherwise waiting for, and report the errors promptly. At the moment, > I'm inclined to think this might be the best option... I generally agree that this is the best option, but there's a noticeable catch: I'm afraid select() doesn't really generally support this kind of operation; all the other providers do. But after trawling the buildfarm logs, that isn't necessarily a problem: It appears that we haven't had a single !windows animal reporting a successfull configure run without HAVE_POLL support this year going by the buildfarm status logs. Running for a longer period now, but it'll take a long while. It appears that OSX with 10.3 was one of the latest to introduce poll() support. Given that poll() has been introduced in SRV3 - which IIRC was below our usual baseline - and windows is not an issue for latch, I think it'd be ok to rely on it. Greetings, Andres Freund
Andres Freund <andres@anarazel.de> writes: > Given that poll() has been introduced in SRV3 - which IIRC was below our > usual baseline - and windows is not an issue for latch, I think it'd > be ok to rely on it. I think it's entirely reasonable to say that "if you want high performance you should have poll(3)". Failing to build without it would be a harder sell, probably. regards, tom lane
Re: modifying WaitEventSets (was: Performance degradation in commit ac1d794)
From
Merlin Moncure
Date:
On Wed, May 4, 2016 at 2:31 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Andres Freund <andres@anarazel.de> writes: >> Given that poll() has been introduced in SRV3 - which IIRC was below our >> usual baseline - and windows is not an issue for latch, I think it'd >> be ok to rely on it. > > I think it's entirely reasonable to say that "if you want high performance > you should have poll(3)". Failing to build without it would be a harder > sell, probably. There are some decent cross platform libraries that expose high performance polling. In particular, libev. http://software.schmorp.de/pkg/libev.html. I would definitely look there before contemplating direct epoll calls. (note, aside from seeing epoll, I haven't been following this thread in detail). merlin
I wrote: > Andres Freund <andres@anarazel.de> writes: >> Given that poll() has been introduced in SRV3 - which IIRC was below our >> usual baseline - and windows is not an issue for latch, I think it'd >> be ok to rely on it. > I think it's entirely reasonable to say that "if you want high performance > you should have poll(3)". Failing to build without it would be a harder > sell, probably. Hmm ... wait, I take that back. poll() is required by SUS v2, which has been our minimum baseline spec for a long time (even my pet dinosaur HPUX has it). As long as we have an answer for Windows, it's hard to argue we can't require poll() elsewhere. regards, tom lane
On Wed, May 4, 2016 at 3:35 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > I wrote: >> Andres Freund <andres@anarazel.de> writes: >>> Given that poll() has been introduced in SRV3 - which IIRC was below our >>> usual baseline - and windows is not an issue for latch, I think it'd >>> be ok to rely on it. > >> I think it's entirely reasonable to say that "if you want high performance >> you should have poll(3)". Failing to build without it would be a harder >> sell, probably. > > Hmm ... wait, I take that back. poll() is required by SUS v2, which has > been our minimum baseline spec for a long time (even my pet dinosaur HPUX > has it). As long as we have an answer for Windows, it's hard to argue > we can't require poll() elsewhere. I don't think we'd necessarily need to completely de-support people who still depend on select(). We'd just need to say, well, WL_SOCKET_ERROR *may* report exceptional events on the socket, or it may not, depending on how modern your platform is. In the use cases I foresee, that would occasionally result in less-timely detection of FDW connection loss, but nothing worse. I'm not prepared to get very excited about that. But if we are confident that everything supports poll() and it's always better than select(), another, possibly superior option is to remove support for select() and see if anything breaks. If not, then we only need to support three platform-specific implementations instead of four, which I would find it difficult to complain about. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > On Wed, May 4, 2016 at 3:35 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Hmm ... wait, I take that back. poll() is required by SUS v2, which has >> been our minimum baseline spec for a long time (even my pet dinosaur HPUX >> has it). As long as we have an answer for Windows, it's hard to argue >> we can't require poll() elsewhere. > I don't think we'd necessarily need to completely de-support people > who still depend on select(). We'd just need to say, well, > WL_SOCKET_ERROR *may* report exceptional events on the socket, or it > may not, depending on how modern your platform is. In the use cases I > foresee, that would occasionally result in less-timely detection of > FDW connection loss, but nothing worse. I'm not prepared to get very > excited about that. I'm not either, but ... > But if we are confident that everything supports poll() and it's > always better than select(), another, possibly superior option is to > remove support for select() and see if anything breaks. If not, then > we only need to support three platform-specific implementations > instead of four, which I would find it difficult to complain about. ... the evidence available suggests that the select() code path has probably received zero buildfarm testing. Do we really want to ship a fourth implementation that we can't even vouch for? regards, tom lane
On Wed, May 4, 2016 at 3:54 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> On Wed, May 4, 2016 at 3:35 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> Hmm ... wait, I take that back. poll() is required by SUS v2, which has >>> been our minimum baseline spec for a long time (even my pet dinosaur HPUX >>> has it). As long as we have an answer for Windows, it's hard to argue >>> we can't require poll() elsewhere. > >> I don't think we'd necessarily need to completely de-support people >> who still depend on select(). We'd just need to say, well, >> WL_SOCKET_ERROR *may* report exceptional events on the socket, or it >> may not, depending on how modern your platform is. In the use cases I >> foresee, that would occasionally result in less-timely detection of >> FDW connection loss, but nothing worse. I'm not prepared to get very >> excited about that. > > I'm not either, but ... > >> But if we are confident that everything supports poll() and it's >> always better than select(), another, possibly superior option is to >> remove support for select() and see if anything breaks. If not, then >> we only need to support three platform-specific implementations >> instead of four, which I would find it difficult to complain about. > > ... the evidence available suggests that the select() code path has > probably received zero buildfarm testing. Do we really want to ship > a fourth implementation that we can't even vouch for? I'm more than happy to rip it out, either now or after the tree opens for 9.7 development. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: modifying WaitEventSets (was: Performance degradation in commit ac1d794)
From
Andres Freund
Date:
Hi, On 2016-05-04 15:54:32 -0400, Tom Lane wrote: > Robert Haas <robertmhaas@gmail.com> writes: > > On Wed, May 4, 2016 at 3:35 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > >> Hmm ... wait, I take that back. poll() is required by SUS v2, which has > >> been our minimum baseline spec for a long time (even my pet dinosaur HPUX > >> has it). As long as we have an answer for Windows, it's hard to argue > >> we can't require poll() elsewhere. Yea, I think that's the case. I just ran the query for a longer period, and there's not been any report since 2014 of an !windows animal without poll support. > > I don't think we'd necessarily need to completely de-support people > > who still depend on select(). We'd just need to say, well, > > WL_SOCKET_ERROR *may* report exceptional events on the socket, or it > > may not, depending on how modern your platform is. In the use cases I > > foresee, that would occasionally result in less-timely detection of > > FDW connection loss, but nothing worse. I'm not prepared to get very > > excited about that. > > I'm not either, but ... I'm not worried about slightly degraded capabilities there either, more about the (lack of) ability to test that reasonably. > > But if we are confident that everything supports poll() and it's > > always better than select(), another, possibly superior option is to > > remove support for select() and see if anything breaks. If not, then > > we only need to support three platform-specific implementations > > instead of four, which I would find it difficult to complain about. > > ... the evidence available suggests that the select() code path has > probably received zero buildfarm testing. Do we really want to ship > a fourth implementation that we can't even vouch for? I mean I added code to make it easier to test the select path in 9.6, but yea, outside of my machines running the latest linux kernel there is, that code path appears to have had barely any testing over the last few years. Andres
Re: modifying WaitEventSets (was: Performance degradation in commit ac1d794)
From
Andres Freund
Date:
On 2016-05-04 16:05:04 -0400, Robert Haas wrote: > I'm more than happy to rip it out, either now or after the tree opens > for 9.7 development. Let's rip the select support out in 9.7 then; given the relevant code was already written and tested there's no hurry. But if you'd rather do so earlier (as in *now*) I'm not going to protest either.