Re: copy_from does not stop reading after an error - Mailing list psycopg

From Nicolas Grilly
Subject Re: copy_from does not stop reading after an error
Date
Msg-id AANLkTimSyph9d15GnfmyAOjC=4sMXpTWZbW1taS63ayr@mail.gmail.com
Whole thread Raw
In response to Re: copy_from does not stop reading after an error  (Nicolas Grilly <nicolas@gardentechno.com>)
Responses Re: copy_from does not stop reading after an error  (Marko Kreen <markokr@gmail.com>)
List psycopg
I have tested again the \copy command of psql and, contrary to what I wrote before, it looks like psql does not stop reading after an error either, and has to read the complete file before reporting the error.

Conclusion: psycopg2, pg8000 and psql have the same behavior regarding the command "copy from stdin". The input data file is read entirely, even if there is incorrect data at the start of the file, and errors are reported only after having read the complete file.

Therefore it is probably not a bug in psycopg2, and just a "limitation" of PostgreSQL protocol. Here is the protocol official documentation:

I understand we have to "end" the copy before having a chance to retrieve PostgreSQL backend response and know if our data are correct, or not. Do you confirm this analysis?

It means copy_from is not designed to send a 10 gigabytes stream to PostgreSQL, with just one "copy from stdin" command. Maybe I have to split my input stream into smaller chunks and execute a "copy from stdin" command for each of them. Do you confirm this is the only (and adequate) solution?

Thanks for you help and advice.

PS: I've copied that email to lighthouse for future reference.


On Tue, Feb 1, 2011 at 12:34, Nicolas Grilly <nicolas@gardentechno.com> wrote:
Thank you Federico for your answer.

I have ran the same script with pg8000, and it does not stop reading after an error either... Maybe it is not a bug, and just a limitation of the PostgreSQL protocol? Maybe the copy from protocol is not designed to return errors in the middle of the data stream, and I just have to split my data stream into many chunks and call copy_from for each chunk?

On Tue, Feb 1, 2011 at 11:48, Federico Di Gregorio <federico.digregorio@dndg.it> wrote:
On 01/02/11 11:24, Nicolas Grilly wrote:
> It looks like copy_from does not stop reading after an error. When the
> input file is short, it is okay. But when the input file is very long,
> it is really boring to wait for the entire file to be read just to
> discover there is an error on the 10th row.

Thank you very much. I just used your email to create a bug report on
Lighthouse:

http://psycopg.lighthouseapp.com/projects/62710-psycopg/tickets/37-copy_from-does-not-stop-reading-after-an-error

federico


psycopg by date:

Previous
From: Federico Di Gregorio
Date:
Subject: Re: At FOSDEM
Next
From: Marko Kreen
Date:
Subject: Re: copy_from does not stop reading after an error