Re: Unhappy with error handling in psql's handleCopyOut() - Mailing list pgsql-hackers

From Stephen Frost
Subject Re: Unhappy with error handling in psql's handleCopyOut()
Date
Msg-id 20140211205336.GU2921@tamriel.snowman.net
Whole thread Raw
In response to Unhappy with error handling in psql's handleCopyOut()  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Unhappy with error handling in psql's handleCopyOut()  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> While looking at the pending patch to make psql report a line count
> after COPY, I came across this business in handleCopyOut():
>
>      * Check command status and return to normal libpq state.  After a
>      * client-side error, the server will remain ready to deliver data.  The
>      * cleanest thing is to fully drain and discard that data.  If the
>      * client-side error happened early in a large file, this takes a long
>      * time.  Instead, take advantage of the fact that PQexec() will silently
>      * end any ongoing PGRES_COPY_OUT state.  This does cause us to lose the
>      * results of any commands following the COPY in a single command string.
>      * It also only works for protocol version 3.  XXX should we clean up
>      * using the slow way when the connection is using protocol version 2?
>
> which git blames on commit 08146775 (committed by Alvaro on behalf of
> Noah).
>
> This does not make me happy.  In the first place, we have not dropped
> support for protocol version 2.  In the second place, I fail to see
> what the advantage is of kluging things like this.  The main costs of
> draining the remaining COPY data are the server-side work of generating
> the data and the network transmission costs, neither of which will go
> away with this technique.  So I'm thinking we should revert this kluge
> and just drain the data straightforwardly, which would also eliminate
> the mentioned edge-case misbehavior when there were more commands in
> the query string.
>
> Is there a reason I'm overlooking not to do this?

I've not gotten back to it yet, but I ran into a related-seeming issue
where psql would happily chew up 2G of memory trying to send "COPY
failed" notices when it gets disconnected from a server that it's trying
to send data to mid-COPY.  conn->sock was -1, connection was
'CONNECTION_BAD', but the loop towards the end of handleCopyIn doesn't
care and nothing in libpq is changing PQresultStatus():
   /*    * Check command status and return to normal libpq state    *    * We must not ever return with the status
stillPGRES_COPY_IN.  Our    * caller is unable to distinguish that situation from reaching the next    * COPY in a
commandstring that happened to contain two consecutive COPY    * FROM STDIN commands.  XXX if something makes
PQputCopyEnd()fail    * indefinitely while retaining status PGRES_COPY_IN, we get an infinite    * loop.  This is more
realisticthan handleCopyOut()'s counterpart risk.    */   while (res = PQgetResult(conn), PQresultStatus(res) ==
PGRES_COPY_IN)  {         OK = false;       PQclear(res); 
       PQputCopyEnd(pset.db, _("trying to exit copy mode"));   }

And so it would just keep looping, first building up the 2G of 'copy
failed' notices from the PQputCopyEnd, and then just spinning forever
because it couldn't drain the queue.
Thanks,
    Stephen

pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Unhappy with error handling in psql's handleCopyOut()
Next
From: Peter Eisentraut
Date:
Subject: Re: Patch: compiling the docs under Gentoo