Re: libpq copy error handling busted - Mailing list pgsql-hackers

From Andres Freund
Subject Re: libpq copy error handling busted
Date
Msg-id 20200606043004.xyo52casyaman7sq@alap3.anarazel.de
Whole thread Raw
In response to Re: libpq copy error handling busted  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: libpq copy error handling busted
List pgsql-hackers
Hi,

On 2020-06-03 18:41:28 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > When libpq is used to COPY data to the server, it doesn't properly
> > handle errors.
> > This is partially an old problem, and partially got recently
> > worse. Before the below commit we detected terminated connections, but
> > we didn't handle copy failing.
> 
> Yeah.  After poking at that for a little bit, there are at least three
> problems:
> 
> * pqSendSome() is responsible not only for pushing out data, but for
> calling pqReadData in any situation where it can't get rid of the data
> promptly.  1f39a1c06 overlooked that requirement, and the upshot is
> that we don't necessarily notice that the connection is broken (it's
> pqReadData's job to detect that).  Putting a pqReadData call into
> the early-exit path helps, but doesn't fix things completely.

Is that fully necessary? Couldn't we handle at least the case I had by
looking at write_failed in additional places?

It might still be the right thing to continue to call pqReadData() from
pqSendSome(), don't get me wrong.


> * The more longstanding problem is that the PQputCopyData code path
> doesn't have any mechanism for consuming an 'E' (error) message
> once pqReadData has collected it.  AFAICS that's ancient.

Yea, I looked back quite a bit, and it looked that way for a long
time. I thought for a moment that it might be related to the copy-both
introduction, but it wasn't.


> I think that the idea was to let the client dump all its copy data and
> then report the error message when PQendCopy is called, but as you
> say, that's none too friendly when there's gigabytes of data involved.
> I'm not sure we can do anything about this without effectively
> changing the client API for copy errors, though.

Hm. Why would it *really* be an API change? Until recently connection
failures etc were returned from PQputCopyData(), and it is documented
that way:

/*
 * PQputCopyData - send some data to the backend during COPY IN or COPY BOTH
 *
 * Returns 1 if successful, 0 if data could not be sent (only possible
 * in nonblock mode), or -1 if an error occurs.
 */
int
PQputCopyData(PGconn *conn, const char *buffer, int nbytes)

So consuming 'E' when in copy mode doesn't seem like a crazy change to
me. Probably a bit too big to backpatch though. But given that this
hasn't been complained about much in however many years...

Greetings,

Andres Freund



pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: hashagg slowdown due to spill changes
Next
From: Thomas Munro
Date:
Subject: Vacuuming the operating system documentation