Re: Why is src/test/modules/committs/t/002_standby.pl flaky? - Mailing list pgsql-hackers
| From | Thomas Munro |
|---|---|
| Subject | Re: Why is src/test/modules/committs/t/002_standby.pl flaky? |
| Date | |
| Msg-id | CA+hUKGKUcgaB4hOrcMLWYQchBHZVqp=kwerc-8uH6ZW3ZfvX+Q@mail.gmail.com |
| In response to | Re: Why is src/test/modules/committs/t/002_standby.pl flaky? (Andres Freund <andres@anarazel.de>) |
| List | pgsql-hackers |
On Wed, Feb 2, 2022 at 6:38 AM Andres Freund <andres@anarazel.de> wrote:
> On 2022-02-01 18:02:34 +1300, Thomas Munro wrote:
> > 1. "pgsocket" could become a pointer to a heap-allocated wrapper
> > object containing { socket, event, flags } on Windows, or something
> > like that, but that seems a bit invasive and tangled up with public
> > APIs like libpq, which put me off trying that. I'm willing to explore
> > it if people object to my other idea.
>
> I'm not sure if the libpq aspect really is a problem. We're not going to have
> to do that conversion repeatedly, I think.
Alright, I'm prototyping that variant today.
> > Provide a way to get a callback when a socket is created or closed.
> >
> > XXX TODO handle callback failure
> > XXX TODO investigate overheads/other implications of having a callback
> > installed
>
> What do we need this for? I still don't understand what kind of reconnects we
> need to automatically detect.
libpq creates new sockets in various cases, such as when trying multiple
hosts/ports (the easiest case to test) or in some SSL and GSSAPI code
paths. In the model shown in the most recent patch, where a hash table
holds ExtraSocketState objects that libpq doesn't even know about, we
have to do SocketTableDrop(old socket) and SocketTableAdd(new socket)
at those times, which is why I introduced that callback.
If we switch to the model where a socket is really a pointer to a
wrapper struct (which I'm about to prototype), the need for all that
bookkeeping goes away: no callbacks, no hash table. But then libpq has
to participate knowingly in the socket-wrapping scheme to help the
backend, while also somehow providing an unwrapped SOCKET for client
API stability. Trying some ideas, more on that soon.
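For concreteness, the wrapper-struct model might look roughly like the sketch below. The type and function names (PgSocketWrapper, pg_socket_wrap, pg_socket_raw) are invented for illustration; the point is that the per-socket state travels with the handle, and an accessor hands back the raw descriptor for API-stability purposes:

```c
/* Hypothetical sketch of "pgsocket as pointer to a wrapper struct".
 * All names here are illustrative, not from the prototype itself. */
#include <stdbool.h>
#include <stdlib.h>

typedef struct PgSocketWrapper
{
	int			fd;				/* the underlying kernel socket */
#ifdef WIN32
	HANDLE		event_handle;	/* one event for the life of the socket */
	int			flags;			/* most recent WSAEventSelect() flags */
	bool		seen_fd_close;	/* has FD_CLOSE been received? */
#endif
} PgSocketWrapper;

typedef PgSocketWrapper *pgsocket;	/* was: int, or SOCKET on Windows */

/*
 * Allocate a wrapper when a socket is created.  No external hash table
 * or creation callback is needed, because the extra state lives inside
 * the handle itself.
 */
static pgsocket
pg_socket_wrap(int fd)
{
	pgsocket	s = calloc(1, sizeof(PgSocketWrapper));

	if (s)
		s->fd = fd;
	return s;
}

/*
 * Client code that expects a raw descriptor (e.g. via PQsocket()) would
 * need an unwrapping accessor, to keep the client API stable.
 */
static int
pg_socket_raw(pgsocket s)
{
	return s->fd;
}
```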
> > +#if !defined(FRONTEND)
> > +struct ExtraSocketState
> > +{
> > +#ifdef WIN32
> > + HANDLE event_handle; /* one event for the life of the socket */
> > + int flags; /* most recent WSAEventSelect() flags */
> > + bool seen_fd_close; /* has FD_CLOSE been received? */
> > +#else
> > + int dummy; /* none of this is needed for Unix */
> > +#endif
> > +};
>
> Seems like we might want to track more readiness events than just close? If we
> e.g. started tracking whether we've seen writes blocking / write readiness,
> we could get rid of cruft like
>
> /*
> * Windows does not guarantee to log an FD_WRITE network event
> * indicating that more data can be sent unless the previous send()
> * failed with WSAEWOULDBLOCK. While our caller might well have made
> * such a call, we cannot assume that here. Therefore, if waiting for
> * write-ready, force the issue by doing a dummy send(). If the dummy
> * send() succeeds, assume that the socket is in fact write-ready, and
> * return immediately. Also, if it fails with something other than
> * WSAEWOULDBLOCK, return a write-ready indication to let our caller
> * deal with the error condition.
> */
>
> that seems likely to just make bugs less likely, rather than actually fix them...
Yeah. Unlike FD_CLOSE, FD_WRITE is a non-terminal condition, so it
would also need to be cleared again, which requires intercepting all
send()/sendto()/write() calls with wrapper functions; but we already
do that sort of thing elsewhere. Looking into it...