BUG #7611: \copy (and COPY?) incorrectly parses nul character for windows-1252 - Mailing list pgsql-bugs

From sams.james+postgres@gmail.com
Subject BUG #7611: \copy (and COPY?) incorrectly parses nul character for windows-1252
Date
Msg-id E1TOjc5-0001lZ-5m@wrigleys.postgresql.org
Responses Re: BUG #7611: \copy (and COPY?) incorrectly parses nul character for windows-1252  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-bugs
The following bug has been logged on the website:

Bug reference:      7611
Logged by:          James
Email address:      sams.james+postgres@gmail.com
PostgreSQL version: 9.1.6
Operating system:   Ubuntu Linux 12.04
Description:


I have a file with several nul characters in it. The file itself appears to
be encoded as windows-1252, though I am not 100% certain of that. I do know
that other software (e.g. Python) can decode the data as windows-1252
without issue. Postgres's \copy, however, chokes on the nul byte:

ERROR:  unterminated CSV quoted field
CONTEXT:  COPY promo_nonactive_load_fake, line 239900

Note that the error is wrong: the field is quoted, but postgres seems to jump
forward in the file when it encounters the nul bytes.

Further, the reported line number is wrong. 239900 is the length of the file
(in lines), not the line on which the error occurs, which is several hundred
lines earlier.
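
For anyone trying to reproduce this, a minimal sketch of how the lines that
actually contain nul bytes could be located. The helper name find_nul_lines
is hypothetical, and it counts physical newline-delimited lines, which can
differ from COPY's own count if quoted fields contain embedded newlines.

```python
from typing import List


def find_nul_lines(data: bytes) -> List[int]:
    """Return 1-based numbers of newline-delimited lines that contain a NUL byte.

    Hypothetical helper for illustration; works on raw bytes so the file's
    encoding (windows-1252 or otherwise) does not matter.
    """
    return [
        lineno
        for lineno, line in enumerate(data.split(b"\n"), start=1)
        if b"\x00" in line
    ]


if __name__ == "__main__":
    # Tiny in-memory stand-in for the real file, with a NUL inside a quoted field.
    sample = b'id,name\n1,"ab\x00c"\n2,"ok"\n'
    print(find_nul_lines(sample))
```

Reading in binary mode avoids any decoding step, so the result does not depend
on whether the file really is windows-1252.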

Deleting the nul bytes allowed \copy to proceed normally. I experienced
similar issues when loading this file with psycopg2's copy_expert using
COPY FROM STDIN.
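
The workaround of deleting the nul bytes can be sketched roughly as follows.
This is only an illustration, not the exact procedure used for the report;
the function name strip_nul_bytes and the chunk size are assumptions.

```python
import io


def strip_nul_bytes(src, dst, chunk_size=1 << 16):
    """Copy binary stream src to dst, dropping every NUL byte.

    Hypothetical helper for illustration. Returns the number of NUL bytes
    removed. Operates on bytes, so no encoding needs to be assumed.
    """
    removed = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        removed += chunk.count(b"\x00")
        dst.write(chunk.replace(b"\x00", b""))
    return removed


if __name__ == "__main__":
    # In-memory demonstration; real use would open files in "rb"/"wb" mode.
    src = io.BytesIO(b'1,"ab\x00c"\n2,"ok"\n')
    dst = io.BytesIO()
    count = strip_nul_bytes(src, dst)
    print(count, dst.getvalue())
```

The cleaned output could then be given to \copy, or passed as the file object
to psycopg2's copy_expert, in place of the original file.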
