Hi,
I have a PostgreSQL 7.4 server and client on CentOS 6.4. The database server is using UTF-8 encoding.
I have been exploring the use of the \copy command for importing CSV data generated by SQL Server 2008. The SQL Server 2008 export tool does not escape quotes that appear inside field content, so it is useful to be able to specify an obscure character in the QUOTE option of the \copy command to work around this issue.
When I run the following commands in psql, I am surprised that QUOTE is limited to characters in the range 0x01 - 0x7F, and that UTF8 is mentioned in the error message if a character outside that range is chosen:
\encoding WIN1252
\copy yuml from '/tmp/yuml.csv' WITH CSV HEADER ENCODING 'WIN1252' QUOTE as E'\xff';
ERROR: invalid byte sequence for encoding "UTF8": 0xff
I thought that if the client (psql) is set to WIN1252, and the CSV file is specified as WIN1252, then I could specify any valid WIN1252 character as the quote character. Instead, I am limited to the characters that can be encoded as a single byte in UTF-8. In fact 0x00 is not accepted either, so the usable range is 0x01 - 0x7F.
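To illustrate what I mean by the single-byte limit, here is a small Python sketch. It is not part of my actual setup: the helper names are made up for illustration, and it assumes that Python's "cp1252" codec corresponds to what PostgreSQL calls WIN1252. It only shows how individual WIN1252 bytes behave when treated as, or converted to, UTF-8; it says nothing about how the server actually parses the QUOTE option.

```python
# Hypothetical helpers, for illustration only.

def utf8_len_of_win1252_byte(value: int) -> int:
    """Return how many bytes the WIN1252 byte `value` needs once re-encoded as UTF-8."""
    char = bytes([value]).decode("cp1252")  # interpret the byte as WIN1252
    return len(char.encode("utf-8"))        # re-encode the character as UTF-8


def is_valid_utf8(raw: bytes) -> bool:
    """Return True if `raw` is, on its own, a valid UTF-8 byte sequence."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


if __name__ == "__main__":
    # 0x7F is plain ASCII; 0x80 and 0xFF are valid WIN1252 bytes above the ASCII range.
    for value in (0x7F, 0x80, 0xFF):
        print(hex(value),
              "utf8 length:", utf8_len_of_win1252_byte(value),
              "valid as raw utf8:", is_valid_utf8(bytes([value])))
```

So 0xff is a perfectly good WIN1252 character, but the lone byte 0xff is not valid UTF-8, which at least matches the wording of the error message above.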
Is this a bug or expected behaviour?
Is it the case that the server does the actual CSV parsing and that, because my server is in UTF8, I am therefore limited to single-byte UTF8 characters?
regards,
Martin