On 08/07/10 17:42, Alban Hertroys wrote:
> On 8 Jul 2010, at 4:21, Craig Ringer wrote:
>
>> Yes, that's ancient. It is handled quite happily by \copy in csv mode,
>> except that when csv mode is active, \xnn escapes do not seem to be
>> processed. So I can have *either* \xnn escape processing *or* csv-style
>> input processing.
>>
>> Anyone know of a way to get escape processing in csv mode?
>
>
> And what do those hex-escaped bytes mean? Are they in text strings?
> AFAIK CSV doesn't contain any information about what encoding was used
> to create it, so it could be about anything; UTF-8, Win1252,
> ISO-8859-something, or whatever Sybase was using.
>
> I'm just saying, be careful what you're parsing there ;)

Thanks for that. In this case, the escapes are just "bytes" - what's
important is that, after unescaping, the CSV data is interpreted as
latin-1. OK, Windows-1252, but close enough.

In the end Python's csv module did the trick. I just pulled in the CSV
data and spat out PostgreSQL-friendly COPY text format, so that I didn't
need to use the COPY ... CSV modifier and Pg would interpret the \xnn
escapes during input.

In case anyone else needs to deal with this format, here's the program I
used.
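
The program itself is not reproduced here. As a rough illustration of the
approach described above, here is a minimal Python 3 sketch, not the
original script. The function name csv_to_copy_text and the sample row are
made up for illustration, and it assumes that every backslash in the source
data begins an escape that COPY's text format understands (such as \xnn),
so backslashes are passed through untouched.

```python
import csv
import io


def csv_to_copy_text(src, dst):
    """Read CSV rows from src and write them to dst in COPY text format.

    Hypothetical helper: fields are tab-separated, one row per line.
    """
    for row in csv.reader(src):
        fields = []
        for value in row:
            # Escape only the characters that would break COPY's row and
            # column framing. Backslashes are deliberately left alone
            # (assumption: they always start an escape such as \xnn), so
            # PostgreSQL decodes them itself during input.
            value = (value.replace("\t", "\\t")
                          .replace("\n", "\\n")
                          .replace("\r", "\\r"))
            fields.append(value)
        dst.write("\t".join(fields) + "\n")


# Small in-memory demonstration with a made-up row.
sample = io.StringIO('1,"caf\\xe9, bar"\n')
out = io.StringIO()
csv_to_copy_text(sample, out)
print(out.getvalue(), end="")
```

The output can then be loaded with \copy in its default (non-CSV) text
format. NULL handling and character-set conversion are left out of this
sketch and would need to be added for real data.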
--
Craig Ringer
Tech-related writing: http://soapyfrogs.blogspot.com/