Re: Importing data - possible UTF8 import bug? - Mailing list pgsql-admin

From Mikel Lindsaar
Subject Re: Importing data - possible UTF8 import bug?
Msg-id 57a815bf0807110137j198cb9e9gcb7d07405c42f42e@mail.gmail.com
In response to Importing data - possible UTF8 import bug?  ("Mikel Lindsaar" <raasdnil@gmail.com>)
List pgsql-admin
OK, I'm mailing the list the results of my problem so future people can find it.

The error was

ERROR: invalid byte sequence for encoding "UTF8": 0xa2

and the same error showed up with many different 0x... byte values.

The problem was indeed a bug, but one that sat between the keyboard
and screen (that is, me), not in the COPY command.  I didn't read
the COPY docs closely enough: the table of escape sequences there
clearly states that a backslash followed by one to three octal digits
is interpreted as the byte with that numeric code.

The data I was importing contained addresses with a unit number
and street number written like this: 2\554.  COPY read that as
the number 2 followed by the character with code \554, which was an
invalid sequence, so COPY rightly failed and complained about
an invalid byte sequence.
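
For anyone who wants to see the effect outside the database, here is a
rough Python sketch of what the octal escape does to the bytes.  It is
only an illustration, not the server's actual code.  The escape \242 is
a hypothetical example chosen because it maps to the 0xa2 byte from the
error message above, and the function names are made up for this sketch.
Masking the value to one byte is an assumption of the sketch as well.

```python
def octal_escape_to_byte(digits: str) -> bytes:
    """Return the single byte denoted by the octal digits of an escape like \\242."""
    value = int(digits, 8)          # parse the digits as base 8
    # Assumption of this sketch: keep only the low 8 bits so the result is one byte.
    return bytes([value & 0xFF])


def is_valid_utf8(data: bytes) -> bool:
    """Report whether the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# "2\242" in the input file becomes the byte for "2" followed by 0xa2.
field = b"2" + octal_escape_to_byte("242")
print(field)                 # b'2\xa2'
print(is_valid_utf8(field))  # False
```

A lone 0xa2 byte is a UTF-8 continuation byte with no lead byte in front
of it, which is why the decode fails.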

Going through the data set and replacing the backslashes with forward
slashes (which works in my case) or, if you need to be non-destructive,
replacing each single backslash with a double backslash fixes the
problem.
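
Both fixes can be sketched in a few lines of Python, assuming the data
is cleaned up before it is handed to COPY.  The function names and the
sample address are hypothetical and only there for illustration.

```python
def to_forward_slashes(line: str) -> str:
    """Destructive fix: turn every backslash into a forward slash."""
    return line.replace("\\", "/")


def double_backslashes(line: str) -> str:
    """Non-destructive fix: double each backslash so COPY reads it as a literal one."""
    return line.replace("\\", "\\\\")


# Hypothetical sample value; the single backslash is written "\\" in Python source.
raw = "2\\554 Example St"
print(to_forward_slashes(raw))   # 2/554 Example St
print(double_backslashes(raw))   # 2\\554 Example St
```

Note that doubling every backslash also changes any escapes that were
put in the file on purpose, such as \N for NULL, so it only suits data
where every backslash is meant literally.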

Sorry all for the noise.

Mikel


--
http://lindsaar.net/
Rails, RSpec and Life blog....
