Escaping from blocked send() reprised. - Mailing list pgsql-hackers

From Kyotaro HORIGUCHI
Subject Escaping from blocked send() reprised.
Date
Msg-id 20140630.171347.220032486.horiguchi.kyotaro@lab.ntt.co.jp
Whole thread Raw
Responses Re: Escaping from blocked send() reprised.  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
Hello, I have received inquiries related to blocked communication
several times for these weeks with different symptoms. Then I
found this message from archive,

http://postgresql.1045698.n5.nabble.com/Escaping-a-blocked-sendto-syscall-without-causing-a-restart-td5740855.html

> Subject: Escaping a blocked sendto() syscall without causing a restart

Mr. Tom Lane gave a comment replying it,

> Offhand it looks to me like most signals would kick the backend off the
> send() call ... but it would loop right back and try again.  See
> internal_flush() in pqcomm.c.  (If you're using SSL, this diagnosis
> may or may not apply.)
> 
> We can't do anything except repeat the send attempt if the client
> connection is to be kept in a sane state.
(snipped)
>  And I'm not at all sure if we could get it to work in SSL mode...

That's true for timeouts that should continue the connection,
say, statement_timeout, but focusing on intentional backend
termination, I think it does no harm to break it up abruptly,
even if it was on SSL. On the other hand it seems still
preferable to keep a connection when not blocked. The following
expression would detects such a blocking state at just before
next send(2) after the previous try exited by signals.

(ProcDiePending && select(1, NULL, fd, NULL, '1 sec') == 0)

Finally, pg_terminate_backend() works even when send is blocked
for both SSL and non-SSL connections after 1 second delay with
this patch (break_socket_blocking_on_termination_v1.patch).

Nevetheless, of course statement_timeout cannot become effective
by this method since it breaks the consistency in the client
protocol. It needs change in client protocol to have "out of
band" mechanism or something, maybe.

Any suggestions?



Attached patches are:
 - break_socket_blocking_on_termination_v1.patch : The patch to   break blocked state of send(2) for
pg_terminate_backend().
 - socket_block_test.patch : debug printing and changing send   buffer of libpq for reproducing the blocked situation.



Some point of discussion follows,

==== Discussion about the appropriateness of looking into    ProcDiePending there and calling CHECK_FOR_INTERRUPTS()
seeingit.
 

I have somewhat uneasiness of these things, but what we can at
most seems to be replacing ProcDiePending here with some another
variable, say ImmediatelyExitFromBlockedState, and somehow go
upstairs through normal return path. Additional Try-Catch seems
can do that but it looks no benefit for the added complexity..



==== Discussion on breaking up connetion especially for SSL

Breaking an SSL connection up in my_sock_write() cause the
following message on client side if it still lives and resumes to
receive from the connection, this seems to show that the client
handles the event properly.

| SSL SYSCALL error: EOF detected
| The connection to the server was lost. Attempting reset: Succeeded.



==== Discussion on reliability of select(2)

This method is not a perfect solution, since the select(2)
sometimes gives a wrong oracle about wheather the follwing
send(2) will be blocked.

Even so, as far as I see, select(2) just after exiting from
blocked send(2) by signal seems always says 'write will be
blocked', so what this patch does seems to save most cases except
when the any amount of socket buffer is vacated just before the
following select. The second chance to exit from blocked send(2)
won't come after this(, before one more pg_terminate_backend() ?).

Removing the select(2) from the condition (that is,
CHECK_FOR_INTERRUPTS() is called always ProcDiePending is true)
prevents such a possibility, in exchange for killing 'really
live' connection but IMHO it's no problem on intentional server
termination.

More reliable measure for this would be non-blocking IO but it
changes more of the code.



====  Reproducing the situation.

Another possible question would be about the possibility of such
blocking, or how to reproduce the situation. I found that send(2)
on CentOS6.5 somehow sends successfully, for most cases, the
remaining data at the retry after exiting by signal during being
blocked with buffer full, in spite of no change in environment.

So reproducing the stucked situation is rather difficult on the
server as is. But such situation would be reproduced quite easily
with some cheat, that is, enlarging PQ send buffer, say by ten
times.

Applying the attached testing patch (socket_block_test.patch),
the following steps will make the stucked situation.
1. Do a select which returns big result and enter Ctrl-Z just   after invoking.
  cl> $ psql -h localhost postgres  cl> postgres=# select 1 from generate_series(0, 9999999);  cl> ^Z  cl> [4]+
Stopped                psql -h localhost postgres
 
2. Watch the server to stuck.
 The server starts to print lines like following to console after a while, then stops. The number enclosed by the
squarebrackets is PID of the server.
 
  sv> #### [8809] (bare) rest = 0 / 81920 bytes, ProcDiePending = 0
3. Do pg_terminate_backend().
  cl> $ psql postgres -c "select pg_terminate_backend(8809)"
  The server will stuck like follows, PID=8811 is the another  session made by the command just above.
  sv> #### [8809] (bare) rest = 0 / 81920 bytes, ProcDiePending = 0  sv> #### [8811] (bare) rest = 0 / 327 bytes,
ProcDiePending= 0  sv> #### [8809] (bare) rest = 500 / 81920 bytes, ProcDiePending = 1  sv> #### [8811] (bare) rest = 0
/78 bytes, ProcDiePending = 0
 
 The server 8809 is blocked during sending the remaining 500 bytes and won't come back forever except for SIGKILL, or
possibledata reading on the client (fg does it).
 
  cl> $ fg
  sv> #### [8809] (bare) rest = 0 / 500 bytes, ProcDiePending = 1  sv> FATAL:  terminating connection due to
administratorcommand  sv> STATEMENT:  select 1 from generate_series(0, 9999999);  sv> #### [8809] (bare) rest = 0 / 116
bytes,ProcDiePending = 0  sv> #### [8883] (bare) rest = 0 / 327 bytes, ProcDiePending = 0
 

 If you don't see the situation to occur, changing the value of select clause (by its length, not by value:) would be
effective,or entering Ctrl-Z after some debug output also would be effective.
 
 For SSL connections, the debug output looks like the following,
  sv> #### [20064] (bare) rest = 0 / 81920 bytes, ProcDiePending = 0  sv> #### [20064] (SSL) rest = 0 / 16413 bytes,
ProcDiePending= 0  sv> #### [20064] (SSL) rest = 0 / 16413 bytes, ProcDiePending = 0  sv> #### [20064] (SSL) rest = 0 /
16413bytes, ProcDiePending = 0  sv> #### [20064] (SSL) rest = 980 / 16413 bytes, ProcDiePending = 1  sv> #### [20064]
(SSL)rest = 0 / 980 bytes, ProcDiePending = 1  sv> #### [20064] (SSL) rest = 1029 / 16413 bytes, ProcDiePending = 1
 
 "bare" here in turn means the status of SSL_write and "SSL" means the status of the underlying 'bare' socket of SSL
connection.(Sorry for the confising labelings..)
 
 The (bare) line above is not corresponding to the following bunch of (SSL) lines, but its precedents. At the fifth
line,send(2) exits by signal issued by pg_teminate_backend() then retry (somehow) successfully at sixth line but SSL
layergave another 16413 bytes and only 1029 bytes of that is sent by the first try and the server stucked at the second
tryfor the seventh line. The control doesn't come back to secure_write() during this sequence.
 


regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/src/backend/libpq/be-secure.c b/src/backend/libpq/be-secure.c
index 59204cf..f01c140 100644
--- a/src/backend/libpq/be-secure.c
+++ b/src/backend/libpq/be-secure.c
@@ -76,6 +76,7 @@#include "libpq/libpq.h"#include "tcop/tcopprot.h"
+#include "miscadmin.h"#include "utils/memutils.h"
@@ -323,6 +324,41 @@ rloop:}/*
+ * Although we basically should try to send all data staying in our send
+ * buffer, we also should consider the possibility that hanging of clients or
+ * network cutoff has compelled send(2) to be blokced. We need to be allowed
+ * to exit from send() if such blocking states last for a while during process
+ * termination. Returns true if send blocking is detected.
+ *
+ * The worse side of this would be that extra-slow receiver (specifically
+ * under one packet per second) might fail to recive the result to the end,
+ * but pg_terminate_backend() is such a thing.
+ */
+static bool
+is_write_blocked(int fd)
+{
+#ifndef WIN32
+    fd_set wfds;
+    struct timeval tv;
+    int ret;
+    int save_errno = errno;
+
+    FD_ZERO(&wfds);
+    FD_SET(fd, &wfds);
+    tv.tv_sec = 1;
+    tv.tv_usec = 0;
+    ret = select(fd + 1, NULL, &wfds, NULL, &tv);
+
+    /* The error of select here is safely ignorable. */
+    errno = save_errno;
+
+    return ret <= 0 ? true : false;
+#else
+    return false;
+#endif
+}
+
+/* *    Write data to a secure connection. */ssize_t
@@ -457,6 +493,14 @@ wloop:#endif        n = send(port->sock, ptr, len, 0);
+    /*
+     * Check for interrupt if the socket was blocked during process is
+     * being terminated.
+     */
+    if (ProcDiePending && is_write_blocked(port->sock))
+        CHECK_FOR_INTERRUPTS();
+    
+        return n;}
@@ -515,6 +559,14 @@ my_sock_write(BIO *h, const char *buf, int size)    res = send(h->num, buf, size, 0);
BIO_clear_retry_flags(h);
+
+    /*
+     * Check for interrupt if the socket was blocked during process is being
+     * terminated.
+     */
+    if (ProcDiePending && is_write_blocked(h->num))
+        CHECK_FOR_INTERRUPTS();
+    if (res <= 0)    {        if (errno == EINTR)
diff --git a/src/backend/libpq/be-secure.c b/src/backend/libpq/be-secure.c
index 59204cf..98ee41e 100644
--- a/src/backend/libpq/be-secure.c
+++ b/src/backend/libpq/be-secure.c
@@ -75,6 +75,7 @@#endif   /* USE_SSL */#include "libpq/libpq.h"
+#include "miscadmin.h"#include "tcop/tcopprot.h"#include "utils/memutils.h"
@@ -457,6 +458,9 @@ wloop:#endif        n = send(port->sock, ptr, len, 0);
+    fprintf(stderr, "#### [%d] (bare) rest = %d / %d bytes, ProcDiePending = %d\n",
+            getpid(), len - n, len, ProcDiePending);
+    return n;}
@@ -514,6 +518,8 @@ my_sock_write(BIO *h, const char *buf, int size)    int            res = 0;    res = send(h->num,
buf,size, 0);
 
+    fprintf(stderr, "#### [%d] (SSL) rest = %d / %d bytes, ProcDiePending = %d\n",
+            getpid(), size - res, size, ProcDiePending);    BIO_clear_retry_flags(h);    if (res <= 0)    {
diff --git a/src/backend/libpq/pqcomm.c b/src/backend/libpq/pqcomm.c
index 605d891..259b6ac 100644
--- a/src/backend/libpq/pqcomm.c
+++ b/src/backend/libpq/pqcomm.c
@@ -114,7 +114,7 @@ static List *sock_paths = NIL; * enlarged by pq_putmessage_noblock() if the message doesn't fit
otherwise.*/
 
-#define PQ_SEND_BUFFER_SIZE 8192
+#define PQ_SEND_BUFFER_SIZE (8192 * 10)#define PQ_RECV_BUFFER_SIZE 8192static char *PqSendBuffer;

pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: IMPORT FOREIGN SCHEMA statement
Next
From: Craig Ringer
Date:
Subject: Re: How about a proper TEMPORARY TABLESPACE?