Re: [PATCHES] [patch] fe-connect.c doesn't handle EINTR correctly - Mailing list pgsql-hackers
From | Bruce Momjian |
---|---|
Subject | Re: [PATCHES] [patch] fe-connect.c doesn't handle EINTR correctly |
Date | |
Msg-id | 200204152334.g3FNYEO05629@candle.pha.pa.us Whole thread Raw |
In response to | Re: [PATCHES] [patch] fe-connect.c doesn't handle EINTR correctly (Bruce Momjian <pgman@candle.pha.pa.us>) |
List | pgsql-hackers |
Fix applied. Thanks. --------------------------------------------------------------------------- Bruce Momjian wrote: > > David, sorry you patch didn't make it into 7.2.X. That whole EINTR > discussion was quite complicated so I am not surprised we missed it. > > The attached patch implements your ENITR test in cases that seems to > need it. I have followed the method we used for ENITRY in fe-misc.c. > > > --------------------------------------------------------------------------- > > David Ford wrote: > > Last year we had a drawn out discussion about this and I created a patch > > for it. I never noticed that the patch didn't go in until I installed > > 7.2 the other day and realised that fe-connect.c never was fixed. > > > > Here is the patch again. It is against CVS 3/16/2002. This time I only > > rewrote the connect procedure at line 912, I leave it up to the regular > > hackers to copy it's functionality to the SSL procedure just below it. > > > > In summary, if a software writer implements timer events or other events > > which generate a signal with a timing fast enough to occur while libpq > > is inside connect(), then connect returns -EINTR. The code following > > the connect call does not handle this and generates an error message. > > The sum result is that the pg_connect() fails. If the timer or other > > event is right on the window of the connect() completion time, the > > pg_connect() may appear to work sporadically. If the event is too slow, > > pg_connect() will appear to always work and if the event is too fast, > > pg_connect() will always fail. > > > > David > > > > > Index: src/interfaces/libpq/fe-connect.c > > =================================================================== > > RCS file: /projects/cvsroot/pgsql/src/interfaces/libpq/fe-connect.c,v > > retrieving revision 1.181 > > diff -u -r1.181 fe-connect.c > > --- src/interfaces/libpq/fe-connect.c 2001/11/11 02:09:05 1.181 > > +++ src/interfaces/libpq/fe-connect.c 2002/03/16 05:17:47 > > @@ -909,29 +909,48 @@ > > * Thus, we have to make arrangements for all eventualities. > > * ---------- > > */ > > - if (connect(conn->sock, &conn->raddr.sa, conn->raddr_len) < 0) > > - { > > - if (SOCK_ERRNO == EINPROGRESS || SOCK_ERRNO == EWOULDBLOCK || SOCK_ERRNO == 0) > > - { > > - /* > > - * This is fine - we're in non-blocking mode, and the > > - * connection is in progress. > > - */ > > - conn->status = CONNECTION_STARTED; > > - } > > - else > > - { > > - /* Something's gone wrong */ > > - connectFailureMessage(conn, SOCK_ERRNO); > > - goto connect_errReturn; > > + do { > > + int e; > > + e=connect(conn->sock, &conn->raddr.sa, conn->raddr_len) > > + > > + if(e < 0) { > > + switch (e) { > > + case EINTR: > > + /* > > + * Interrupted by a signal, keep trying. This handling is > > + * required because the user may have turned on signals in > > + * his program. Previously, libpq would erronously fail to > > + * connect if the user's timer event fired and interrupted > > + * this syscall. It is important that we don't try to sleep > > + * here because this may cause havoc with the user program. > > + */ > > + continue; > > + break; > > + case 0: > > + case EINPROGRESS: > > + case EWOULDBLOCK: > > + /* > > + * This is fine - we're in non-blocking mode, and the > > + * connection is in progress. > > + */ > > + conn->status = CONNECTION_STARTED; > > + break; > > + default: > > + /* Something's gone wrong */ > > + connectFailureMessage(conn, SOCK_ERRNO); > > + goto connect_errReturn; > > + break; > > + } > > + } else { > > + /* We're connected now */ > > + conn->status = CONNECTION_MADE; > > } > > - } > > - else > > - { > > - /* We're connected already */ > > - conn->status = CONNECTION_MADE; > > - } > > + > > + if(conn->status == CONNECTION_STARTED || conn->status == CONNECTION_MADE) > > + break; > > > > + } while(1); > > + > > #ifdef USE_SSL > > /* Attempt to negotiate SSL usage */ > > if (conn->allow_ssl_try) > > > > > ---------------------------(end of broadcast)--------------------------- > > TIP 5: Have you checked our extensive FAQ? > > > > http://www.postgresql.org/users-lounge/docs/faq.html > > -- > Bruce Momjian | http://candle.pha.pa.us > pgman@candle.pha.pa.us | (610) 853-3000 > + If your life is a hard drive, | 830 Blythe Avenue > + Christ can be your backup. | Drexel Hill, Pennsylvania 19026 > Index: src/interfaces/libpq/fe-connect.c > =================================================================== > RCS file: /cvsroot/pgsql/src/interfaces/libpq/fe-connect.c,v > retrieving revision 1.182 > diff -c -r1.182 fe-connect.c > *** src/interfaces/libpq/fe-connect.c 2 Mar 2002 00:49:22 -0000 1.182 > --- src/interfaces/libpq/fe-connect.c 14 Apr 2002 04:40:24 -0000 > *************** > *** 913,920 **** > --- 913,925 ---- > * Thus, we have to make arrangements for all eventualities. > * ---------- > */ > + retry1: > if (connect(conn->sock, &conn->raddr.sa, conn->raddr_len) < 0) > { > + if (SOCK_ERRNO == EINTR) > + /* Interrupted system call - we'll just try again */ > + goto retry1; > + > if (SOCK_ERRNO == EINPROGRESS || SOCK_ERRNO == EWOULDBLOCK || SOCK_ERRNO == 0) > { > /* > *************** > *** 949,957 **** > --- 954,967 ---- > SOCK_STRERROR(SOCK_ERRNO)); > goto connect_errReturn; > } > + retry2: > /* Now receive the postmasters response */ > if (recv(conn->sock, &SSLok, 1, 0) != 1) > { > + if (SOCK_ERRNO == EINTR) > + /* Interrupted system call - we'll just try again */ > + goto retry2; > + > printfPQExpBuffer(&conn->errorMessage, > libpq_gettext("could not receive server response to SSL negotiation packet: %s\n"), > SOCK_STRERROR(SOCK_ERRNO)); > *************** > *** 2132,2139 **** > --- 2142,2153 ---- > "PQrequestCancel() -- socket() failed: "); > goto cancel_errReturn; > } > + retry3: > if (connect(tmpsock, &conn->raddr.sa, conn->raddr_len) < 0) > { > + if (SOCK_ERRNO == EINTR) > + /* Interrupted system call - we'll just try again */ > + goto retry3; > strcpy(conn->errorMessage.data, > "PQrequestCancel() -- connect() failed: "); > goto cancel_errReturn; > *************** > *** 2150,2157 **** > --- 2164,2175 ---- > crp.cp.backendPID = htonl(conn->be_pid); > crp.cp.cancelAuthCode = htonl(conn->be_key); > > + retry4: > if (send(tmpsock, (char *) &crp, sizeof(crp), 0) != (int) sizeof(crp)) > { > + if (SOCK_ERRNO == EINTR) > + /* Interrupted system call - we'll just try again */ > + goto retry4; > strcpy(conn->errorMessage.data, > "PQrequestCancel() -- send() failed: "); > goto cancel_errReturn; > Index: src/interfaces/libpq/fe-misc.c > =================================================================== > RCS file: /cvsroot/pgsql/src/interfaces/libpq/fe-misc.c,v > retrieving revision 1.68 > diff -c -r1.68 fe-misc.c > *** src/interfaces/libpq/fe-misc.c 6 Mar 2002 06:10:42 -0000 1.68 > --- src/interfaces/libpq/fe-misc.c 14 Apr 2002 04:40:25 -0000 > *************** > *** 361,367 **** > if (!conn || conn->sock < 0) > return -1; > > ! retry: > FD_ZERO(&input_mask); > FD_SET(conn->sock, &input_mask); > timeout.tv_sec = 0; > --- 361,367 ---- > if (!conn || conn->sock < 0) > return -1; > > ! retry1: > FD_ZERO(&input_mask); > FD_SET(conn->sock, &input_mask); > timeout.tv_sec = 0; > *************** > *** 371,377 **** > { > if (SOCK_ERRNO == EINTR) > /* Interrupted system call - we'll just try again */ > ! goto retry; > > printfPQExpBuffer(&conn->errorMessage, > libpq_gettext("select() failed: %s\n"), > --- 371,377 ---- > { > if (SOCK_ERRNO == EINTR) > /* Interrupted system call - we'll just try again */ > ! goto retry1; > > printfPQExpBuffer(&conn->errorMessage, > libpq_gettext("select() failed: %s\n"), > *************** > *** 395,401 **** > if (!conn || conn->sock < 0) > return -1; > > ! retry: > FD_ZERO(&input_mask); > FD_SET(conn->sock, &input_mask); > timeout.tv_sec = 0; > --- 395,401 ---- > if (!conn || conn->sock < 0) > return -1; > > ! retry2: > FD_ZERO(&input_mask); > FD_SET(conn->sock, &input_mask); > timeout.tv_sec = 0; > *************** > *** 405,411 **** > { > if (SOCK_ERRNO == EINTR) > /* Interrupted system call - we'll just try again */ > ! goto retry; > > printfPQExpBuffer(&conn->errorMessage, > libpq_gettext("select() failed: %s\n"), > --- 405,411 ---- > { > if (SOCK_ERRNO == EINTR) > /* Interrupted system call - we'll just try again */ > ! goto retry2; > > printfPQExpBuffer(&conn->errorMessage, > libpq_gettext("select() failed: %s\n"), > *************** > *** 478,484 **** > } > > /* OK, try to read some data */ > ! tryAgain: > #ifdef USE_SSL > if (conn->ssl) > nread = SSL_read(conn->ssl, conn->inBuffer + conn->inEnd, > --- 478,484 ---- > } > > /* OK, try to read some data */ > ! retry3: > #ifdef USE_SSL > if (conn->ssl) > nread = SSL_read(conn->ssl, conn->inBuffer + conn->inEnd, > *************** > *** 490,496 **** > if (nread < 0) > { > if (SOCK_ERRNO == EINTR) > ! goto tryAgain; > /* Some systems return EAGAIN/EWOULDBLOCK for no data */ > #ifdef EAGAIN > if (SOCK_ERRNO == EAGAIN) > --- 490,496 ---- > if (nread < 0) > { > if (SOCK_ERRNO == EINTR) > ! goto retry3; > /* Some systems return EAGAIN/EWOULDBLOCK for no data */ > #ifdef EAGAIN > if (SOCK_ERRNO == EAGAIN) > *************** > *** 531,537 **** > (conn->inBufSize - conn->inEnd) >= 8192) > { > someread = 1; > ! goto tryAgain; > } > return 1; > } > --- 531,537 ---- > (conn->inBufSize - conn->inEnd) >= 8192) > { > someread = 1; > ! goto retry3; > } > return 1; > } > *************** > *** 564,570 **** > * Still not sure that it's EOF, because some data could have just > * arrived. > */ > ! tryAgain2: > #ifdef USE_SSL > if (conn->ssl) > nread = SSL_read(conn->ssl, conn->inBuffer + conn->inEnd, > --- 564,570 ---- > * Still not sure that it's EOF, because some data could have just > * arrived. > */ > ! retry4: > #ifdef USE_SSL > if (conn->ssl) > nread = SSL_read(conn->ssl, conn->inBuffer + conn->inEnd, > *************** > *** 576,582 **** > if (nread < 0) > { > if (SOCK_ERRNO == EINTR) > ! goto tryAgain2; > /* Some systems return EAGAIN/EWOULDBLOCK for no data */ > #ifdef EAGAIN > if (SOCK_ERRNO == EAGAIN) > --- 576,582 ---- > if (nread < 0) > { > if (SOCK_ERRNO == EINTR) > ! goto retry4; > /* Some systems return EAGAIN/EWOULDBLOCK for no data */ > #ifdef EAGAIN > if (SOCK_ERRNO == EAGAIN) > *************** > *** 804,810 **** > > if (forRead || forWrite) > { > ! retry: > FD_ZERO(&input_mask); > FD_ZERO(&output_mask); > FD_ZERO(&except_mask); > --- 804,810 ---- > > if (forRead || forWrite) > { > ! retry5: > FD_ZERO(&input_mask); > FD_ZERO(&output_mask); > FD_ZERO(&except_mask); > *************** > *** 817,823 **** > (struct timeval *) NULL) < 0) > { > if (SOCK_ERRNO == EINTR) > ! goto retry; > printfPQExpBuffer(&conn->errorMessage, > libpq_gettext("select() failed: %s\n"), > SOCK_STRERROR(SOCK_ERRNO)); > --- 817,823 ---- > (struct timeval *) NULL) < 0) > { > if (SOCK_ERRNO == EINTR) > ! goto retry5; > printfPQExpBuffer(&conn->errorMessage, > libpq_gettext("select() failed: %s\n"), > SOCK_STRERROR(SOCK_ERRNO)); > > ---------------------------(end of broadcast)--------------------------- > TIP 6: Have you searched our list archives? > > http://archives.postgresql.org -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania 19026
pgsql-hackers by date: