Thread: BUG #7611: \copy (and COPY?) incorrectly parses nul character for windows-1252

From: sams.james+postgres@gmail.com
Date:
The following bug has been logged on the website:

Bug reference:      7611
Logged by:          James
Email address:      sams.james+postgres@gmail.com
PostgreSQL version: 9.1.6
Operating system:   Ubuntu Linux 12.04
Description:

I have a file with several nul characters in it. The file itself appears to
be encoded as windows-1252, though I am not 100% certain of that. I do know
that other software (e.g. Python) can decode the data as windows-1252
without issue. Postgres's \copy, however, chokes on the nul byte:

ERROR:  unterminated CSV quoted field
CONTEXT:  COPY promo_nonactive_load_fake, line 239900

Note that the error is wrong: the field is quoted, but Postgres seems to
jump forward in the file when it encounters the nul bytes.

Further, the line number is wrong. That is the length of the file (in
lines), not the line on which the error occurs, which is several hundred
lines before this.
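
Since the reported line number points at the end of the file rather than at
the offending row, one way to find the real location is to scan the raw
bytes directly. The following is only a minimal sketch: the function names
and the sample data are hypothetical, and it counts physical lines, which
can differ from CSV record numbers when quoted fields contain newlines.

```python
def find_nul_lines(data: bytes) -> list:
    """Return 1-based numbers of the physical lines containing a nul byte."""
    return [
        number
        for number, line in enumerate(data.split(b"\n"), start=1)
        if b"\x00" in line
    ]


def strip_nuls(data: bytes) -> bytes:
    """Return a copy of the data with every nul (0x00) byte removed."""
    return data.replace(b"\x00", b"")


# Hypothetical sample: the second row hides a nul inside a quoted field.
sample = b'1,"ok"\n2,"bad\x00field"\n3,"ok"\n'

print(find_nul_lines(sample))              # lines that need attention
cleaned = strip_nuls(sample)               # only if dropping nuls is acceptable
print(cleaned.decode("windows-1252"))
```

For a real file, the bytes would be read with open(path, "rb") so that no
decoding step hides or alters the nul bytes before they are counted.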

Deleting the nul byte characters allowed copy to proceed normally. I
experienced similar issues with psycopg2 and copy_expert using COPY FROM
STDIN and this file.

sams.james+postgres@gmail.com writes:
> I have a file with several nul characters in it. The file itself appears to
> be encoded as windows-1252, though I am not 100% certain of that. I do know
> that other software (e.g. Python) can decode the data as windows-1252
> without issue. Postgres's \copy, however, chokes on the nul byte:

> ERROR:  unterminated CSV quoted field
> CONTEXT:  COPY promo_nonactive_load_fake, line 239900

Postgres doesn't support nul characters in data, so the best you could
hope for here is an error message anyway.  It looks to me like the
immediate cause of this is that \copy reads the file with fgets()
which will effectively ignore the rest of the line after a nul byte.
But there are probably more issues downstream.

            regards, tom lane
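
The fgets() effect described in the reply above can be imitated outside of
psql. The sketch below is a Python simulation, not the psql code: it
assumes the caller treats each line buffer as a nul-terminated C string, so
everything after the first nul, including the newline, is dropped. The
function name and sample data are hypothetical.

```python
import io


def read_lines_like_c_strings(stream):
    """Yield each line as a caller using strlen()-style logic would see it.

    fgets() itself stores the whole line, nul included, but code that then
    treats the buffer as a nul-terminated string stops at the first 0x00.
    """
    for raw in stream:                    # binary stream: lines end at b"\n"
        yield raw.split(b"\x00", 1)[0]    # the tail after the nul is lost


# Hypothetical sample: the second row hides a nul inside a quoted field.
sample = b'1,"ok"\n2,"bad\x00field"\n3,"ok"\n'

seen = b"".join(read_lines_like_c_strings(io.BytesIO(sample)))
print(seen)
print(seen.count(b'"') % 2)   # 1 means the quote characters no longer pair up
```

In this simulation the closing quote and the newline of the second row
disappear, so the quote characters are left unbalanced. That would be
consistent with a CSV parser reading on to the end of the file and only
then reporting an unterminated quoted field, though the thread does not
confirm that this is the full explanation.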