GSSENC'ed connection stalls while reconnection attempts. - Mailing list pgsql-hackers

From Kyotaro Horiguchi
Subject GSSENC'ed connection stalls while reconnection attempts.
Date
Msg-id 20200710.173803.435804731896516388.horikyota.ntt@gmail.com
Whole thread Raw
Responses Re: GSSENC'ed connection stalls while reconnection attempts.  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Hello.

If psql connected using GSSAPI auth and server restarted, reconnection
sequence stalls and won't return.

I found that psql(libpq) sends startup packet via gss
encryption. conn->gssenc should be reset when encryption state is
freed.

The reason that psql doesn't notice the error is pqPacketSend returns
STATUS_OK when write error occurred.  That behavior contradicts to the
comment of the function. The function is used only while making
connection so it's ok to error out immediately by write failure.  I
think other usage of pqFlush while making a connection should report
write failure the same way.

Finally, It's user-friendly if psql shows error message for error on
reset attempts. (This perhaps should be arguable.)

The attached does the above. Any thoughts and/or opinions are welcome.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
From 8cc5663f31744b384cf013f86062ea28b431baa6 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horikyoga.ntt@gmail.com>
Date: Fri, 10 Jul 2020 17:30:11 +0900
Subject: [PATCH] Fix reconnection failure of GSSENC connections

A connection using GSS encryption fails to reconnect and freezes. Fix
that by resetting GSS encryption state on dropping a connection. On
the way fixing that, fix the behavior for write failure while
connection attempts.  Also let psql report the cause of reset attempt
failure.
---
 src/bin/psql/common.c             |  2 +-
 src/interfaces/libpq/fe-auth.c    |  7 ++++++-
 src/interfaces/libpq/fe-connect.c | 13 ++++++++++---
 3 files changed, 17 insertions(+), 5 deletions(-)

diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c
index 6323a35c91..d5ab1d8ada 100644
--- a/src/bin/psql/common.c
+++ b/src/bin/psql/common.c
@@ -310,7 +310,7 @@ CheckConnection(void)
         OK = ConnectionUp();
         if (!OK)
         {
-            fprintf(stderr, _("Failed.\n"));
+            fprintf(stderr, _("Failed: %s\n"), PQerrorMessage(pset.db));
 
             /*
              * Transition to having no connection.  Keep this bit in sync with
diff --git a/src/interfaces/libpq/fe-auth.c b/src/interfaces/libpq/fe-auth.c
index 9f5403d74c..3664ee10a0 100644
--- a/src/interfaces/libpq/fe-auth.c
+++ b/src/interfaces/libpq/fe-auth.c
@@ -588,7 +588,12 @@ pg_SASL_init(PGconn *conn, int payloadlen)
     }
     if (pqPutMsgEnd(conn))
         goto error;
-    if (pqFlush(conn))
+
+    /*
+     * We must not fail write here.  Write failure needs to be checked
+     * explicitly. See pqSendSome.
+     */
+    if (pqFlush(conn) || conn->write_failed)
         goto error;
 
     termPQExpBuffer(&mechanism_buf);
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 27c9bb46ee..f897003f42 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -496,6 +496,8 @@ pqDropConnection(PGconn *conn, bool flushInput)
             free(conn->gss_ResultBuffer);
             conn->gss_ResultBuffer = NULL;
         }
+
+        conn->gssenc = false;
     }
 #endif
 #ifdef ENABLE_SSPI
@@ -3455,8 +3457,10 @@ keep_going:                        /* We will come back to here until there is
                  * Just make sure that any data sent by pg_fe_sendauth is
                  * flushed out.  Although this theoretically could block, it
                  * really shouldn't since we don't send large auth responses.
+                 * Write failure needs to be checked explicitly. See
+                 * pqSendSome.
                  */
-                if (pqFlush(conn))
+                if (pqFlush(conn) || conn->write_failed)
                     goto error_return;
 
                 if (areq == AUTH_REQ_OK)
@@ -4546,8 +4550,11 @@ pqPacketSend(PGconn *conn, char pack_type,
     if (pqPutMsgEnd(conn))
         return STATUS_ERROR;
 
-    /* Flush to ensure backend gets it. */
-    if (pqFlush(conn))
+    /*
+     * Flush to ensure backend gets it. Write failure needs to be checked
+     * explicitly. See pqSendSome.
+     */
+    if (pqFlush(conn) || conn->write_failed)
         return STATUS_ERROR;
 
     return STATUS_OK;
-- 
2.18.4


pgsql-hackers by date:

Previous
From: torikoshia
Date:
Subject: Re: Creating a function for exposing memory usage of backend process
Next
From: Masahiko Sawada
Date:
Subject: Re: WIP: BRIN multi-range indexes