Re: [patch] helps fe-connect.c handle -EINTR more gracefully - Mailing list pgsql-hackers
From | David Ford |
---|---|
Subject | Re: [patch] helps fe-connect.c handle -EINTR more gracefully |
Date | |
Msg-id | 3BDA1705.9020207@blue-labs.org Whole thread Raw |
In response to | [patch] helps fe-connect.c handle -EINTR more gracefully (David Ford <david@blue-labs.org>) |
Responses |
Re: [patch] helps fe-connect.c handle -EINTR more gracefully
|
List | pgsql-hackers |
> > > >I think you are missing the point. I am not saying that we shouldn't >deal with EINTR; rather I am raising what I think is a legitimate >question: *what* is the most appropriate response? My reading of >HP's gloss suggests that we could treat EINTR the same as EINPROGRESS, >ie, consider the connect() to have succeeded and move on to the >wait-for-connection-complete-or-failure-using-select() phase. >If that works I would prefer it to repeating the connect() call, >primarily because it avoids any possibility of introducing an >infinite loop. > You wouldn't get an infinite loop, you'd get Exx indicating the operation was in progress. Yes, you could spin on a select() waiting for the end result. What I normally do is this: connect() while(select()) { switch () { case EINTR: break; case EINPROGRESS: nanosleep(); break; case ETIMEDOUT: default: /* handle timeout and other error conditions nicely for the user */ break; } } With EINTR, it's fine to immediately start working again because your code was interrupted from outside this scope. We don't know where in connect() we were interrupted, blocking or non-blocking. With EINPROGRESS I sleep for a while to be nice to the system. Here we know that things are moving along like they should be and we are in a proper sleepable period. That isn't to imply that things will break if we sleep from EINTR. Only that connect() exited due to an interruption, not due to planning. > >No, we don't. If you think that, then you haven't studied the code >sufficiently to be submitting patches for it. What we actually do >is exactly what is recommended by the manpage you're quoting at me. >It's just split across multiple routines. > I traced several calls and they run through a few functions which end up in pqFlush. These code paths haven't checked the socket to see if it is ready for RW operation yet. pqFlush calls send() [ignoring SSL]. Only after a lot of code has been traversed is pqWait run in which the socket is checked for RW and EINTR. My point that I was bringing up with Peter was that it's much more nice on the system to wait for the socket to become usable before going through all that code. In the previous email I suggested that with a sufficiently fast timer event, you'd never get back through the PQconnect* before being interrupted again and that's why I advocate putting the EINTR as close to the connect() as possible. Tying this together is why it is possible to fail, a good amount of code is traversed before you get back to dealing with the socket. Anywhere inbetween this signal events can happen again. That's what provoked this original patch. Unless I shut off my timer or changed my timer to happen in the long distant future, I would never have a successful connection established. David
pgsql-hackers by date: