Re: Unhappy with error handling in psql's handleCopyOut() - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Unhappy with error handling in psql's handleCopyOut()
Date
Msg-id 5035.1392169134@sss.pgh.pa.us
Whole thread Raw
In response to Re: Unhappy with error handling in psql's handleCopyOut()  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Unhappy with error handling in psql's handleCopyOut()  (Stephen Frost <sfrost@snowman.net>)
List pgsql-hackers
I wrote:
>> Stephen Frost <sfrost@snowman.net> writes:
>>> I've not gotten back to it yet, but I ran into a related-seeming issue
>>> where psql would happily chew up 2G of memory trying to send "COPY
>>> failed" notices when it gets disconnected from a server that it's trying
>>> to send data to mid-COPY.  conn->sock was -1, connection was
>>> 'CONNECTION_BAD', but the loop towards the end of handleCopyIn doesn't
>>> care and nothing in libpq is changing PQresultStatus():

> After some study of the code I have a theory about this.

I was able to reproduce this misbehavior by setting a gdb breakpoint
at pqReadData and then killing the connected server process while psql's
COPY IN was stopped there.  Resetting outCount to zero in the
socket-already-gone case in pqSendSome is enough to fix the problem.
However, I think it's also prudent to hack PQgetResult so that it
won't return a "copy in progress" status if the connection is known
dead.

The error recovery behavior in pqSendSome has been like this since 8.1
or thereabouts, so I'm inclined to push something like the attached into
all branches.

            regards, tom lane

diff --git a/src/interfaces/libpq/fe-exec.c b/src/interfaces/libpq/fe-exec.c
index 5f371b4..764e90c 100644
*** a/src/interfaces/libpq/fe-exec.c
--- b/src/interfaces/libpq/fe-exec.c
*************** static int PQsendQueryGuts(PGconn *conn,
*** 63,68 ****
--- 63,69 ----
                  const int *paramFormats,
                  int resultFormat);
  static void parseInput(PGconn *conn);
+ static PGresult *getCopyResult(PGconn *conn, ExecStatusType copytype);
  static bool PQexecStart(PGconn *conn);
  static PGresult *PQexecFinish(PGconn *conn);
  static int PQsendDescribe(PGconn *conn, char desc_type,
*************** PQgetResult(PGconn *conn)
*** 1734,1755 ****
              conn->asyncStatus = PGASYNC_BUSY;
              break;
          case PGASYNC_COPY_IN:
!             if (conn->result && conn->result->resultStatus == PGRES_COPY_IN)
!                 res = pqPrepareAsyncResult(conn);
!             else
!                 res = PQmakeEmptyPGresult(conn, PGRES_COPY_IN);
              break;
          case PGASYNC_COPY_OUT:
!             if (conn->result && conn->result->resultStatus == PGRES_COPY_OUT)
!                 res = pqPrepareAsyncResult(conn);
!             else
!                 res = PQmakeEmptyPGresult(conn, PGRES_COPY_OUT);
              break;
          case PGASYNC_COPY_BOTH:
!             if (conn->result && conn->result->resultStatus == PGRES_COPY_BOTH)
!                 res = pqPrepareAsyncResult(conn);
!             else
!                 res = PQmakeEmptyPGresult(conn, PGRES_COPY_BOTH);
              break;
          default:
              printfPQExpBuffer(&conn->errorMessage,
--- 1735,1747 ----
              conn->asyncStatus = PGASYNC_BUSY;
              break;
          case PGASYNC_COPY_IN:
!             res = getCopyResult(conn, PGRES_COPY_IN);
              break;
          case PGASYNC_COPY_OUT:
!             res = getCopyResult(conn, PGRES_COPY_OUT);
              break;
          case PGASYNC_COPY_BOTH:
!             res = getCopyResult(conn, PGRES_COPY_BOTH);
              break;
          default:
              printfPQExpBuffer(&conn->errorMessage,
*************** PQgetResult(PGconn *conn)
*** 1786,1791 ****
--- 1778,1813 ----
      return res;
  }

+ /*
+  * getCopyResult
+  *      Helper for PQgetResult: generate result for COPY-in-progress cases
+  */
+ static PGresult *
+ getCopyResult(PGconn *conn, ExecStatusType copytype)
+ {
+     /*
+      * If the server connection has been lost, don't pretend everything is
+      * hunky-dory; instead return a PGRES_FATAL_ERROR result, and reset the
+      * asyncStatus to idle (corresponding to what we'd do if we'd detected I/O
+      * error in the earlier steps in PQgetResult).  The text returned in the
+      * result is whatever is in conn->errorMessage; we expect that was filled
+      * with something useful when the lost connection was detected.
+      */
+     if (conn->status != CONNECTION_OK)
+     {
+         pqSaveErrorResult(conn);
+         conn->asyncStatus = PGASYNC_IDLE;
+         return pqPrepareAsyncResult(conn);
+     }
+
+     /* If we have an async result for the COPY, return that */
+     if (conn->result && conn->result->resultStatus == copytype)
+         return pqPrepareAsyncResult(conn);
+
+     /* Otherwise, invent a suitable PGresult */
+     return PQmakeEmptyPGresult(conn, copytype);
+ }
+

  /*
   * PQexec
diff --git a/src/interfaces/libpq/fe-misc.c b/src/interfaces/libpq/fe-misc.c
index 1eb3ac6..a7afd42 100644
*** a/src/interfaces/libpq/fe-misc.c
--- b/src/interfaces/libpq/fe-misc.c
*************** pqSendSome(PGconn *conn, int len)
*** 804,809 ****
--- 804,811 ----
      {
          printfPQExpBuffer(&conn->errorMessage,
                            libpq_gettext("connection not open\n"));
+         /* Discard queued data; no chance it'll ever be sent */
+         conn->outCount = 0;
          return -1;
      }


pgsql-hackers by date:

Previous
From: Andrew Dunstan
Date:
Subject: Re: narwhal and PGDLLIMPORT
Next
From: Tom Lane
Date:
Subject: Re: narwhal and PGDLLIMPORT