I have tested psql's \copy command again and, contrary to what I wrote before, psql does not stop reading after an error either: it reads the complete file before reporting the error.
Conclusion: psycopg2, pg8000 and psql have the same behavior regarding the command "copy from stdin". The input data file is read entirely, even if there is incorrect data at the start of the file, and errors are reported only after having read the complete file.
Therefore it is probably not a bug in psycopg2, just a "limitation" of the PostgreSQL protocol. Here is the official protocol documentation:
I understand we have to "end" the copy before having a chance to retrieve PostgreSQL backend response and know if our data are correct, or not. Do you confirm this analysis?
It means copy_from is not designed to send a 10-gigabyte stream to PostgreSQL with just one "copy from stdin" command. Maybe I have to split my input stream into smaller chunks and execute a "copy from stdin" command for each of them. Do you confirm this is the only (and adequate) solution?
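To make the chunking idea concrete, here is a rough sketch, not tested against a real server. The helper names iter_chunks and copy_in_chunks and the lines_per_chunk parameter are made up for illustration; the only psycopg2 call assumed is cursor.copy_from(file, table). Each chunk is buffered in memory and sent with its own "copy from stdin", so an error should surface after at most one chunk has been read instead of after the whole file.

```python
import io
from itertools import islice


def iter_chunks(stream, lines_per_chunk):
    """Yield in-memory buffers holding at most lines_per_chunk lines each."""
    while True:
        # islice pulls the next lines from the stream without reading ahead.
        lines = list(islice(stream, lines_per_chunk))
        if not lines:
            return
        yield io.StringIO("".join(lines))


def copy_in_chunks(cursor, stream, table, lines_per_chunk=100000):
    """Issue one copy_from call per chunk; return the number of chunks sent.

    cursor is expected to behave like a psycopg2 cursor. Any exception
    raised by copy_from propagates immediately, so the remaining input
    is never read.
    """
    sent = 0
    for buf in iter_chunks(stream, lines_per_chunk):
        cursor.copy_from(buf, table)
        sent += 1
    return sent
```

Usage would be something like copy_in_chunks(cur, open("data.tsv"), "my_table"), where the file and table names are placeholders. Whether to commit after each chunk or only once at the end is left to the caller; an error in one chunk still aborts the current transaction.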
Thanks for your help and advice.
PS: I've copied that email to lighthouse for future reference.
On Tue, Feb 1, 2011 at 12:34, Nicolas Grilly
<nicolas@gardentechno.com> wrote:
Thank you Federico for your answer.
I have run the same script with pg8000, and it does not stop reading after an error either... Maybe it is not a bug, and just a limitation of the PostgreSQL protocol? Maybe the "copy from" protocol is not designed to return errors in the middle of the data stream, and I just have to split my data stream into many chunks and call copy_from for each chunk?