Re: PQputCopyData dont signal error - Mailing list pgsql-hackers

From Jeff Davis
Subject Re: PQputCopyData dont signal error
Msg-id 1391541354.15160.15.camel@jdavis
In response to Re: PQputCopyData dont signal error  (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
Responses Re: PQputCopyData dont signal error  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Thu, 2011-04-14 at 10:50 +0300, Heikki Linnakangas wrote:
> On 14.04.2011 10:15, Pavel Stehule wrote:
> > Hello
> >
> > I have a problem with PQputCopyData function. It doesn't signal some error.
> >
> >         while ((row = mysql_fetch_row(res)) != NULL)
> >         {
> >             snprintf(buffer, sizeof(buffer), "%s%s\n", row[0], row[1]);
> >             copy_result = PQputCopyData(pconn, buffer, strlen(buffer));
> >             printf(">>%s<<\n", PQerrorMessage(pconn));
> >             printf("%d\n", copy_result);
> >             if (copy_result != 1)
> >             {
> >                 fprintf(stderr, "Copy to target table failed: %s",
> >                         PQerrorMessage(pconn));
> >                 EXIT;
> >             }
> >         }
> >
> > it returns 1 for broken values too :(
> >
> > Is necessary some special check?
> 
> The way COPY works is that PQputCopyData just sends the data to the 
> server, and the server will buffer it in its internal buffer and 
> processes it when it feels like it. The PQputCopyData() calls don't even 
> need to match line boundaries.
> 
> I think you'll need to send all the data and finish the COPY until you 
> get an error. If you have a lot of data to send, you might want to slice 
> it into multiple COPY statements of say 50MB each, so that you can catch 
> errors in between.

[ replying to old thread ]

According to the protocol docs[1]:

"In the event of a backend-detected error during copy-in mode (including
receipt of a CopyFail message), the backend will issue an ErrorResponse
message. If the COPY command was issued via an extended-query message,
the backend will now discard frontend messages until a Sync message is
received, then it will issue ReadyForQuery and return to normal
processing. If the COPY command was issued in a simple Query message,
the rest of that message is discarded and ReadyForQuery is issued. In
either case, any subsequent CopyData, CopyDone, or CopyFail messages
issued by the frontend will simply be dropped."

If the remaining CopyData messages are dropped anyway, I don't see why
PQputCopyData can't return some kind of error indicating that further
CopyData messages are useless, so that the client can stop sending them.

Asking the client to break the copy into multiple COPY commands is bad,
because then the client needs to figure out the line breaks, which is a
burden in many cases.
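
To make that burden concrete, here is a minimal sketch of the kind of
helper a client would need in order to cut its data into separate COPY
commands. The function name chunk_len_at_row_boundary and its parameters
are hypothetical, not part of libpq. It also assumes that every newline
byte ends a row, which does not hold for CSV input with quoted newlines
or for text-format input containing backslash-escaped newlines; handling
those cases correctly is exactly the extra work in question.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not a libpq function.
 *
 * Returns the length of the longest prefix of buf[0..len) that is at most
 * max_chunk bytes long and ends immediately after a '\n'. Returns 0 if no
 * complete row fits within max_chunk bytes.
 *
 * Assumption: every '\n' byte terminates a row.
 */
size_t
chunk_len_at_row_boundary(const char *buf, size_t len, size_t max_chunk)
{
    size_t limit;
    size_t i;

    assert(buf != NULL || len == 0);

    limit = (len < max_chunk) ? len : max_chunk;

    /* Scan backwards from the size limit to the last row terminator. */
    for (i = limit; i > 0; i--)
    {
        if (buf[i - 1] == '\n')
            return i;
    }
    return 0;
}
```

A caller would send the returned prefix in one COPY, finish that COPY and
check its result, then repeat with the remainder of the buffer.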

Certainly we don't want to *guarantee* that the backend will issue an
error at any particular point, because of the buffering on the server
side. But from a practical standpoint, the server will let the client
know fairly quickly and it will avoid a lot of client-side work and
network traffic.
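
As a rough illustration of the savings, the following toy model (not
PostgreSQL or libpq code; ToyBackend, toy_copy_data and toy_send_rows are
all made-up names) counts how many bytes a client sends after the backend
has already failed. It assumes the client learns of the failure on the
very next send, which is the best case; with real buffering the signal
would arrive somewhat later.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a backend in copy-in mode; all names are hypothetical. */
typedef struct
{
    bool   failed;         /* true once the toy backend has reported an error */
    size_t bad_row;        /* index of the row that triggers the error */
    size_t rows_seen;      /* rows accepted so far */
    size_t bytes_dropped;  /* CopyData bytes received after the error */
} ToyBackend;

/*
 * Deliver one CopyData payload of len bytes.
 * Returns false once further data is useless.
 */
bool
toy_copy_data(ToyBackend *be, size_t len)
{
    assert(be != NULL);

    if (be->failed)
    {
        /* After the error, CopyData messages are simply dropped. */
        be->bytes_dropped += len;
        return false;
    }
    if (be->rows_seen == be->bad_row)
    {
        /* This row is the one that fails. */
        be->failed = true;
        return false;
    }
    be->rows_seen++;
    return true;
}

/*
 * Send n_rows rows of row_len bytes each and return the bytes sent.
 * With stop_early set, the client stops at the first failure report.
 */
size_t
toy_send_rows(ToyBackend *be, size_t n_rows, size_t row_len, bool stop_early)
{
    size_t sent = 0;
    size_t i;

    for (i = 0; i < n_rows; i++)
    {
        bool ok = toy_copy_data(be, row_len);

        sent += row_len;
        if (!ok && stop_early)
            break;
    }
    return sent;
}
```

With ten 100-byte rows and an error on the third row, the client that
ignores the failure sends 1000 bytes, 700 of which are dropped; the
client that stops early sends 300.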

Would a change to PQputCopyData be welcome?

Regards,
	Jeff Davis

[1] http://www.postgresql.org/docs/9.3/static/protocol-flow.html





