Re: [PATCHES] [patch] fe-connect.c doesn't handle EINTR correctly - Mailing list pgsql-hackers
From | Bruce Momjian |
---|---|
Subject | Re: [PATCHES] [patch] fe-connect.c doesn't handle EINTR correctly |
Date | |
Msg-id | 200204140454.g3E4skw02153@candle.pha.pa.us Whole thread Raw |
In response to | [patch] fe-connect.c doesn't handle EINTR correctly (David Ford <david+cert@blue-labs.org>) |
Responses |
Re: [PATCHES] [patch] fe-connect.c doesn't handle EINTR correctly
|
List | pgsql-hackers |
David, sorry you patch didn't make it into 7.2.X. That whole EINTR discussion was quite complicated so I am not surprised we missed it. The attached patch implements your ENITR test in cases that seems to need it. I have followed the method we used for ENITRY in fe-misc.c. --------------------------------------------------------------------------- David Ford wrote: > Last year we had a drawn out discussion about this and I created a patch > for it. I never noticed that the patch didn't go in until I installed > 7.2 the other day and realised that fe-connect.c never was fixed. > > Here is the patch again. It is against CVS 3/16/2002. This time I only > rewrote the connect procedure at line 912, I leave it up to the regular > hackers to copy it's functionality to the SSL procedure just below it. > > In summary, if a software writer implements timer events or other events > which generate a signal with a timing fast enough to occur while libpq > is inside connect(), then connect returns -EINTR. The code following > the connect call does not handle this and generates an error message. > The sum result is that the pg_connect() fails. If the timer or other > event is right on the window of the connect() completion time, the > pg_connect() may appear to work sporadically. If the event is too slow, > pg_connect() will appear to always work and if the event is too fast, > pg_connect() will always fail. > > David > > Index: src/interfaces/libpq/fe-connect.c > =================================================================== > RCS file: /projects/cvsroot/pgsql/src/interfaces/libpq/fe-connect.c,v > retrieving revision 1.181 > diff -u -r1.181 fe-connect.c > --- src/interfaces/libpq/fe-connect.c 2001/11/11 02:09:05 1.181 > +++ src/interfaces/libpq/fe-connect.c 2002/03/16 05:17:47 > @@ -909,29 +909,48 @@ > * Thus, we have to make arrangements for all eventualities. > * ---------- > */ > - if (connect(conn->sock, &conn->raddr.sa, conn->raddr_len) < 0) > - { > - if (SOCK_ERRNO == EINPROGRESS || SOCK_ERRNO == EWOULDBLOCK || SOCK_ERRNO == 0) > - { > - /* > - * This is fine - we're in non-blocking mode, and the > - * connection is in progress. > - */ > - conn->status = CONNECTION_STARTED; > - } > - else > - { > - /* Something's gone wrong */ > - connectFailureMessage(conn, SOCK_ERRNO); > - goto connect_errReturn; > + do { > + int e; > + e=connect(conn->sock, &conn->raddr.sa, conn->raddr_len) > + > + if(e < 0) { > + switch (e) { > + case EINTR: > + /* > + * Interrupted by a signal, keep trying. This handling is > + * required because the user may have turned on signals in > + * his program. Previously, libpq would erronously fail to > + * connect if the user's timer event fired and interrupted > + * this syscall. It is important that we don't try to sleep > + * here because this may cause havoc with the user program. > + */ > + continue; > + break; > + case 0: > + case EINPROGRESS: > + case EWOULDBLOCK: > + /* > + * This is fine - we're in non-blocking mode, and the > + * connection is in progress. > + */ > + conn->status = CONNECTION_STARTED; > + break; > + default: > + /* Something's gone wrong */ > + connectFailureMessage(conn, SOCK_ERRNO); > + goto connect_errReturn; > + break; > + } > + } else { > + /* We're connected now */ > + conn->status = CONNECTION_MADE; > } > - } > - else > - { > - /* We're connected already */ > - conn->status = CONNECTION_MADE; > - } > + > + if(conn->status == CONNECTION_STARTED || conn->status == CONNECTION_MADE) > + break; > > + } while(1); > + > #ifdef USE_SSL > /* Attempt to negotiate SSL usage */ > if (conn->allow_ssl_try) > > ---------------------------(end of broadcast)--------------------------- > TIP 5: Have you checked our extensive FAQ? > > http://www.postgresql.org/users-lounge/docs/faq.html -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania 19026 Index: src/interfaces/libpq/fe-connect.c =================================================================== RCS file: /cvsroot/pgsql/src/interfaces/libpq/fe-connect.c,v retrieving revision 1.182 diff -c -r1.182 fe-connect.c *** src/interfaces/libpq/fe-connect.c 2 Mar 2002 00:49:22 -0000 1.182 --- src/interfaces/libpq/fe-connect.c 14 Apr 2002 04:40:24 -0000 *************** *** 913,920 **** --- 913,925 ---- * Thus, we have to make arrangements for all eventualities. * ---------- */ + retry1: if (connect(conn->sock, &conn->raddr.sa, conn->raddr_len) < 0) { + if (SOCK_ERRNO == EINTR) + /* Interrupted system call - we'll just try again */ + goto retry1; + if (SOCK_ERRNO == EINPROGRESS || SOCK_ERRNO == EWOULDBLOCK || SOCK_ERRNO == 0) { /* *************** *** 949,957 **** --- 954,967 ---- SOCK_STRERROR(SOCK_ERRNO)); goto connect_errReturn; } + retry2: /* Now receive the postmasters response */ if (recv(conn->sock, &SSLok, 1, 0) != 1) { + if (SOCK_ERRNO == EINTR) + /* Interrupted system call - we'll just try again */ + goto retry2; + printfPQExpBuffer(&conn->errorMessage, libpq_gettext("could not receive server response to SSL negotiation packet: %s\n"), SOCK_STRERROR(SOCK_ERRNO)); *************** *** 2132,2139 **** --- 2142,2153 ---- "PQrequestCancel() -- socket() failed: "); goto cancel_errReturn; } + retry3: if (connect(tmpsock, &conn->raddr.sa, conn->raddr_len) < 0) { + if (SOCK_ERRNO == EINTR) + /* Interrupted system call - we'll just try again */ + goto retry3; strcpy(conn->errorMessage.data, "PQrequestCancel() -- connect() failed: "); goto cancel_errReturn; *************** *** 2150,2157 **** --- 2164,2175 ---- crp.cp.backendPID = htonl(conn->be_pid); crp.cp.cancelAuthCode = htonl(conn->be_key); + retry4: if (send(tmpsock, (char *) &crp, sizeof(crp), 0) != (int) sizeof(crp)) { + if (SOCK_ERRNO == EINTR) + /* Interrupted system call - we'll just try again */ + goto retry4; strcpy(conn->errorMessage.data, "PQrequestCancel() -- send() failed: "); goto cancel_errReturn; Index: src/interfaces/libpq/fe-misc.c =================================================================== RCS file: /cvsroot/pgsql/src/interfaces/libpq/fe-misc.c,v retrieving revision 1.68 diff -c -r1.68 fe-misc.c *** src/interfaces/libpq/fe-misc.c 6 Mar 2002 06:10:42 -0000 1.68 --- src/interfaces/libpq/fe-misc.c 14 Apr 2002 04:40:25 -0000 *************** *** 361,367 **** if (!conn || conn->sock < 0) return -1; ! retry: FD_ZERO(&input_mask); FD_SET(conn->sock, &input_mask); timeout.tv_sec = 0; --- 361,367 ---- if (!conn || conn->sock < 0) return -1; ! retry1: FD_ZERO(&input_mask); FD_SET(conn->sock, &input_mask); timeout.tv_sec = 0; *************** *** 371,377 **** { if (SOCK_ERRNO == EINTR) /* Interrupted system call - we'll just try again */ ! goto retry; printfPQExpBuffer(&conn->errorMessage, libpq_gettext("select() failed: %s\n"), --- 371,377 ---- { if (SOCK_ERRNO == EINTR) /* Interrupted system call - we'll just try again */ ! goto retry1; printfPQExpBuffer(&conn->errorMessage, libpq_gettext("select() failed: %s\n"), *************** *** 395,401 **** if (!conn || conn->sock < 0) return -1; ! retry: FD_ZERO(&input_mask); FD_SET(conn->sock, &input_mask); timeout.tv_sec = 0; --- 395,401 ---- if (!conn || conn->sock < 0) return -1; ! retry2: FD_ZERO(&input_mask); FD_SET(conn->sock, &input_mask); timeout.tv_sec = 0; *************** *** 405,411 **** { if (SOCK_ERRNO == EINTR) /* Interrupted system call - we'll just try again */ ! goto retry; printfPQExpBuffer(&conn->errorMessage, libpq_gettext("select() failed: %s\n"), --- 405,411 ---- { if (SOCK_ERRNO == EINTR) /* Interrupted system call - we'll just try again */ ! goto retry2; printfPQExpBuffer(&conn->errorMessage, libpq_gettext("select() failed: %s\n"), *************** *** 478,484 **** } /* OK, try to read some data */ ! tryAgain: #ifdef USE_SSL if (conn->ssl) nread = SSL_read(conn->ssl, conn->inBuffer + conn->inEnd, --- 478,484 ---- } /* OK, try to read some data */ ! retry3: #ifdef USE_SSL if (conn->ssl) nread = SSL_read(conn->ssl, conn->inBuffer + conn->inEnd, *************** *** 490,496 **** if (nread < 0) { if (SOCK_ERRNO == EINTR) ! goto tryAgain; /* Some systems return EAGAIN/EWOULDBLOCK for no data */ #ifdef EAGAIN if (SOCK_ERRNO == EAGAIN) --- 490,496 ---- if (nread < 0) { if (SOCK_ERRNO == EINTR) ! goto retry3; /* Some systems return EAGAIN/EWOULDBLOCK for no data */ #ifdef EAGAIN if (SOCK_ERRNO == EAGAIN) *************** *** 531,537 **** (conn->inBufSize - conn->inEnd) >= 8192) { someread = 1; ! goto tryAgain; } return 1; } --- 531,537 ---- (conn->inBufSize - conn->inEnd) >= 8192) { someread = 1; ! goto retry3; } return 1; } *************** *** 564,570 **** * Still not sure that it's EOF, because some data could have just * arrived. */ ! tryAgain2: #ifdef USE_SSL if (conn->ssl) nread = SSL_read(conn->ssl, conn->inBuffer + conn->inEnd, --- 564,570 ---- * Still not sure that it's EOF, because some data could have just * arrived. */ ! retry4: #ifdef USE_SSL if (conn->ssl) nread = SSL_read(conn->ssl, conn->inBuffer + conn->inEnd, *************** *** 576,582 **** if (nread < 0) { if (SOCK_ERRNO == EINTR) ! goto tryAgain2; /* Some systems return EAGAIN/EWOULDBLOCK for no data */ #ifdef EAGAIN if (SOCK_ERRNO == EAGAIN) --- 576,582 ---- if (nread < 0) { if (SOCK_ERRNO == EINTR) ! goto retry4; /* Some systems return EAGAIN/EWOULDBLOCK for no data */ #ifdef EAGAIN if (SOCK_ERRNO == EAGAIN) *************** *** 804,810 **** if (forRead || forWrite) { ! retry: FD_ZERO(&input_mask); FD_ZERO(&output_mask); FD_ZERO(&except_mask); --- 804,810 ---- if (forRead || forWrite) { ! retry5: FD_ZERO(&input_mask); FD_ZERO(&output_mask); FD_ZERO(&except_mask); *************** *** 817,823 **** (struct timeval *) NULL) < 0) { if (SOCK_ERRNO == EINTR) ! goto retry; printfPQExpBuffer(&conn->errorMessage, libpq_gettext("select() failed: %s\n"), SOCK_STRERROR(SOCK_ERRNO)); --- 817,823 ---- (struct timeval *) NULL) < 0) { if (SOCK_ERRNO == EINTR) ! goto retry5; printfPQExpBuffer(&conn->errorMessage, libpq_gettext("select() failed: %s\n"), SOCK_STRERROR(SOCK_ERRNO));
pgsql-hackers by date: