Re: [patch] helps fe-connect.c handle -EINTR more gracefully - Mailing list pgsql-hackers

From David Ford
Subject Re: [patch] helps fe-connect.c handle -EINTR more gracefully
Date
Msg-id 3BDA1705.9020207@blue-labs.org
Whole thread Raw
In response to [patch] helps fe-connect.c handle -EINTR more gracefully  (David Ford <david@blue-labs.org>)
Responses Re: [patch] helps fe-connect.c handle -EINTR more gracefully
List pgsql-hackers
>
>
>
>I think you are missing the point.  I am not saying that we shouldn't
>deal with EINTR; rather I am raising what I think is a legitimate
>question: *what* is the most appropriate response?  My reading of
>HP's gloss suggests that we could treat EINTR the same as EINPROGRESS,
>ie, consider the connect() to have succeeded and move on to the
>wait-for-connection-complete-or-failure-using-select() phase.
>If that works I would prefer it to repeating the connect() call,
>primarily because it avoids any possibility of introducing an
>infinite loop.
>

You wouldn't get an infinite loop, you'd get Exx indicating the 
operation was in progress.  Yes,  you could spin on a select() waiting 
for the end result.  What I normally do is this:

connect()

while(select()) {   switch () {       case EINTR:           break;       case EINPROGRESS:           nanosleep();
   break;       case ETIMEDOUT:       default:           /* handle timeout and other error conditions nicely for the 
 
user */           break;   }
}

With EINTR, it's fine to immediately start working again because your 
code was interrupted from outside this scope.  We don't know where in 
connect() we were interrupted, blocking or non-blocking.  With 
EINPROGRESS I sleep for a while to be nice to the system.  Here we know 
that things are moving along like they should be and we are in a proper 
sleepable period.

That isn't to imply that things will break if we sleep from EINTR.  Only 
that connect() exited due to an interruption, not due to planning.

>
>No, we don't.  If you think that, then you haven't studied the code
>sufficiently to be submitting patches for it.  What we actually do
>is exactly what is recommended by the manpage you're quoting at me.
>It's just split across multiple routines.
>
I traced several calls and they run through a few functions which end 
up in pqFlush.  These code paths haven't checked the socket to see if it 
is ready for RW operation yet.  pqFlush calls send() [ignoring SSL].

Only after a lot of code has been traversed is pqWait run in which the 
socket is checked for RW and EINTR.  My point that I was bringing up 
with Peter was that it's much more nice on the system to wait for the 
socket to become usable before going through all that code.  In the 
previous email I suggested that with a sufficiently fast timer event, 
you'd never get back through the PQconnect* before being interrupted 
again and that's why I advocate putting the EINTR as close to the 
connect() as possible.  Tying this together is why it is possible to 
fail, a good amount of code is traversed before you get back to dealing 
with the socket.  Anywhere inbetween this signal events can happen again.

That's what provoked this original patch.  Unless I shut off my timer or 
changed my timer to happen in the long distant future, I would never 
have a successful connection established.

David




pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: [patch] helps fe-connect.c handle -EINTR more gracefully
Next
From: "Robert Dyas"
Date:
Subject: consistent naming of components