Re: [patch] helps fe-connect.c handle -EINTR more gracefully - Mailing list pgsql-hackers

From David Ford
Subject Re: [patch] helps fe-connect.c handle -EINTR more gracefully
Date
Msg-id 3BD9F714.80606@blue-labs.org
In response to Re: [patch] helps fe-connect.c handle -EINTR more gracefully  (Peter Eisentraut <peter_e@gmx.net>)
List pgsql-hackers
"The **SA_RESTART** flag is always set by the underlying system in 
POSIX mode so that interrupted system calls will fail with return value 
of -1 and the *EINTR* error in /errno/ instead of getting restarted."

libpq's pqsignal.c doesn't turn off SA_RESTART for SIGALRM.  Further, 
pqsignal.c only handles SIGPIPE, not to mention that other parts of 
libpq do handle EINTR properly.
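
To make the SA_RESTART point concrete, here is a minimal sketch of how an 
application decides, per signal, whether blocking system calls get 
restarted or fail with EINTR.  The names install_handler and on_alarm are 
made up for illustration; neither is part of libpq.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Hypothetical no-op handler; a real one would do the timer's housekeeping. */
void on_alarm(int signo)
{
    (void) signo;
}

/*
 * Hypothetical helper: install `handler` for `signo`.
 * restart != 0  -> SA_RESTART set, interrupted calls are restarted.
 * restart == 0  -> SA_RESTART clear, interrupted calls fail with EINTR.
 * Returns 0 on success, -1 on error (as sigaction() does).
 */
int install_handler(int signo, void (*handler)(int), int restart)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = restart ? SA_RESTART : 0;
    return sigaction(signo, &sa, NULL);
}
```

Calling install_handler(SIGALRM, on_alarm, 0) gives the configuration 
discussed in this thread: the timer signal interrupts whatever blocking 
call happens to be in progress.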

The PQconnect* family does not handle EINTR.  It does not handle the 
possible and perfectly legitimate interruption of a system call.  
Globally trying to prevent system calls from being interrupted is a Bad 
Thing.  Having a timer event is common; having a timer event in a daemon 
is often required.  Timers allow for good housekeeping and playing nice 
with the rest of the system.

Your "reasonable behaviour" in the case of EINTR means repeatable and 
mysterious failure.  There isn't a clean way to re-enter PQconnect* 
while maintaining state after a signal interruption, and there is no 
guarantee the function won't be interrupted again.

Basically if you have a timer event in your software and you use pgsql, 
then the following will happen.

a) if the timer event always fires outside the PQconnect* call, your 
code will function
b) if the timer event always fires during the PQconnect* call, your code 
will never function
c) if your timer event sometimes fires during the PQconnect* call, your 
code will sometimes function
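
Case b) is easy to reproduce outside libpq.  The following is a minimal 
sketch, assuming a POSIX system: a periodic timer is armed without 
SA_RESTART, and a blocking read() on an empty pipe stands in for the 
blocking calls inside PQconnect*.  The names on_tick and 
timer_interrupts_read are invented for this example.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical timer handler; it only needs to exist so the signal is caught. */
static void on_tick(int signo)
{
    (void) signo;
}

/*
 * Hypothetical demo: returns the errno of a blocking read() that a
 * periodic SIGALRM interrupted, 0 if the read returned normally,
 * or -1 if the setup itself failed.
 */
int timer_interrupts_read(void)
{
    int p[2];
    struct sigaction sa, old;
    struct itimerval tick, off;
    char c;
    ssize_t n;
    int saved;

    if (pipe(p) < 0)
        return -1;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_tick;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;                      /* no SA_RESTART: calls fail with EINTR */
    if (sigaction(SIGALRM, &sa, &old) < 0)
        return -1;

    memset(&tick, 0, sizeof tick);
    tick.it_value.tv_usec = 50000;        /* first tick after 50 ms */
    tick.it_interval.tv_usec = 50000;     /* then every 50 ms */
    setitimer(ITIMER_REAL, &tick, NULL);

    n = read(p[0], &c, 1);                /* blocks: nothing is ever written */
    saved = (n < 0) ? errno : 0;

    memset(&off, 0, sizeof off);
    setitimer(ITIMER_REAL, &off, NULL);   /* disarm the timer */
    sigaction(SIGALRM, &old, NULL);       /* restore the previous handler */
    close(p[0]);
    close(p[1]);
    return saved;
}
```

On such a system the function is expected to return EINTR: the read never 
completes because the timer always fires first, which is exactly what 
happens to a connect attempt that takes longer than the timer period.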

There are no ifs, ands, or buts about it: if a timer fires inside 
PQconnect* as it is now, there is no way to continue.  With a suitably 
long timer period, you can try the PQconnect* call again, and if the 
connect succeeds before the timer fires again you're fine.  If not, you 
must keep retrying.

That said, there are two ways about it.  a) handle it cleanly inside 
PQconnect* like it should be done, or b) have the programmer parse the 
error string for "Interrupted system call" and re-enter PQconnect.  a) 
is clean, short, and simple.  b) wastes a lot of CPU to attempt to 
accomplish the task.  a) is guaranteed and b) is not guaranteed.
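
For a), one portable way to cope with an interrupted connect() is sketched 
below.  This is an illustration under stated assumptions, not the patch 
itself, and connect_eintr_safe is a made-up name.  It relies on the POSIX 
rule that a connect() interrupted by a caught signal fails with EINTR but 
the connection request is not aborted and completes asynchronously, so the 
caller waits for the socket to become writable and then reads SO_ERROR 
instead of calling connect() a second time.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Hypothetical helper, not part of libpq: connect() that survives EINTR.
 * Returns 0 once the connection is established, -1 with errno set otherwise.
 */
int connect_eintr_safe(int fd, const struct sockaddr *addr, socklen_t len)
{
    struct pollfd pfd;
    int rc;
    int err = 0;
    socklen_t errlen = sizeof err;

    if (connect(fd, addr, len) == 0)
        return 0;
    if (errno != EINTR)
        return -1;                 /* a real failure, report it unchanged */

    /* Interrupted: the attempt continues in the kernel.  Wait until the
     * socket is writable, retrying the wait itself if it is interrupted. */
    pfd.fd = fd;
    pfd.events = POLLOUT;
    pfd.revents = 0;
    do {
        rc = poll(&pfd, 1, -1);
    } while (rc < 0 && errno == EINTR);
    if (rc < 0)
        return -1;

    /* SO_ERROR holds the final outcome of the asynchronous connect. */
    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &errlen) < 0)
        return -1;
    if (err != 0) {
        errno = err;
        return -1;
    }
    return 0;
}
```

The caller never sees EINTR and no connection state is thrown away, which 
is the property that b) cannot guarantee.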

David

Peter Eisentraut wrote:

David Ford writes:

> Libpq doesn't deal with system calls being interrupted in the slightest.
> None of the read/write or socket calls handle any errors.  Even benign
> returns i.e. EINTR are treated as fatal errors and returned.  Not to
> malign, but there is no reason not to continue on and handle EINTR.

Libpq certainly does deal with system calls being interrupted:  It does
not allow them to be interrupted.  Take a look into the file pqsignal.c to
see why.

If your alarm timer interrupts system calls then that's because you have
installed your signal handler to allow that.  In my mind, a reasonable
behaviour in that case would be to let the PQconnect or equivalent fail
and provide the errno to the application.



