Re: Why is src/test/modules/committs/t/002_standby.pl flaky? - Mailing list pgsql-hackers

From: Thomas Munro
Subject: Re: Why is src/test/modules/committs/t/002_standby.pl flaky?
Msg-id: CA+hUKGKUcgaB4hOrcMLWYQchBHZVqp=kwerc-8uH6ZW3ZfvX+Q@mail.gmail.com
In response to: Re: Why is src/test/modules/committs/t/002_standby.pl flaky? (Andres Freund <andres@anarazel.de>)
List: pgsql-hackers
On Wed, Feb 2, 2022 at 6:38 AM Andres Freund <andres@anarazel.de> wrote:
> On 2022-02-01 18:02:34 +1300, Thomas Munro wrote:
> > 1. "pgsocket" could become a pointer to a heap-allocated wrapper
> > object containing { socket, event, flags } on Windows, or something
> > like that, but that seems a bit invasive and tangled up with public
> > APIs like libpq, which put me off trying that.  I'm willing to explore
> > it if people object to my other idea.
>
> I'm not sure if the libpq aspect really is a problem. We're not going to have
> to do that conversion repeatedly, I think.

Alright, I'm prototyping that variant today.

> > Provide a way to get a callback when a socket is created or closed.
> >
> > XXX TODO handle callback failure
> > XXX TODO investigate overheads/other implications of having a callback
> > installed
>
> What do we need this for? I still don't understand what kind of reconnects we
> need to automatically need to detect.

libpq makes new sockets in various cases, such as when trying multiple
hosts/ports (the easiest test to set up) or in some SSL and GSSAPI
cases.  In the model shown in the most recent patch, where there is a
hash table holding ExtraSocketState objects that libpq doesn't even
know about, we have to do SocketTableDrop(old socket),
SocketTableAdd(new socket) at those times, which is why I introduced
that callback.  If we switch to the model where a socket is really a
pointer to a wrapper struct (which I'm about to prototype), the need
for all that bookkeeping goes away: no callbacks, no hash table.  But
then libpq has to participate knowingly in a socket-wrapping scheme to
help the backend, while also somehow providing an unwrapped SOCKET for
client API stability.  Trying some ideas, more on that soon.
> > +#if !defined(FRONTEND)
> > +struct ExtraSocketState
> > +{
> > +#ifdef WIN32
> > +    HANDLE      event_handle;   /* one event for the life of the socket */
> > +    int         flags;          /* most recent WSAEventSelect() flags */
> > +    bool        seen_fd_close;  /* has FD_CLOSE been received? */
> > +#else
> > +    int         dummy;          /* none of this is needed for Unix */
> > +#endif
> > +};
>
> Seems like we might want to track more readiness events than just close? If we
> e.g. started tracking whether we've seen writes blocking / write readiness,
> we could get rid of cruft like
>
>         /*
>          * Windows does not guarantee to log an FD_WRITE network event
>          * indicating that more data can be sent unless the previous send()
>          * failed with WSAEWOULDBLOCK.  While our caller might well have made
>          * such a call, we cannot assume that here.  Therefore, if waiting for
>          * write-ready, force the issue by doing a dummy send().  If the dummy
>          * send() succeeds, assume that the socket is in fact write-ready, and
>          * return immediately.  Also, if it fails with something other than
>          * WSAEWOULDBLOCK, return a write-ready indication to let our caller
>          * deal with the error condition.
>          */
>
> that seems likely to just make bugs less likely, rather than actually fix them...

Yeah.  Unlike FD_CLOSE, FD_WRITE is a non-terminal condition, so it would
also need to be cleared, which requires seeing all send()/sendto()/write()
calls with wrapper functions, but we already do stuff like that.  Looking
into it...