Thread: Trying to understand encoding.

From
Tomás Di Doménico
Date:
Greetings.

I'm currently using 8.3, but I've been coping with this since previous
versions.

I'm trying to integrate some LATIN1 and some UTF8 DBs into a single UTF8
one. To avoid the "Invalid UNICODE character..." error, I used iconv to
convert the LATIN1 dumps to UTF8.
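
For reference, the conversion step described above can be sketched in Python. This is only an illustration of what a LATIN1 to UTF-8 re-encoding does to the bytes, roughly what `iconv -f LATIN1 -t UTF-8` performs; the function name `latin1_to_utf8` is hypothetical and not part of any tool mentioned in this thread.

```python
def latin1_to_utf8(data: bytes) -> bytes:
    """Re-encode LATIN1 bytes as UTF-8 (hypothetical helper for illustration)."""
    # Every byte value maps to a character in LATIN1, so the decode step
    # cannot fail; it simply assumes the input really is LATIN1.
    return data.decode("latin-1").encode("utf-8")


# "Neuquén" as stored in a LATIN1 dump: the accented letter is the single byte 0xE9.
latin1_bytes = b"Neuqu\xe9n"

# After conversion the same letter is the two-byte UTF-8 sequence 0xC3 0xA9.
utf8_bytes = latin1_to_utf8(latin1_bytes)
```

Because the decode step never fails, applying the same conversion to data that is already UTF-8 silently produces double-encoded text, so it is worth converting each dump exactly once.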

Now I have the data in the UTF8 DB, and using graphical clients
everything seems to be great. The thing is, when I query the data via
psql, with \encoding UTF8 I get weird data ("NeuquÃ©n" for "Neuquén").
However, with \encoding LATIN1, everything looks fine.
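
One way this symptom can arise is sketched below in Python, under the assumption that whatever displays the output interprets the incoming bytes as LATIN1. The variable names are illustrative only. Correct UTF-8 bytes read as LATIN1 produce exactly the garbled form quoted above.

```python
# The correct text, and its UTF-8 byte representation.
text = "Neuqu\u00e9n"              # "Neuquén"
utf8_bytes = text.encode("utf-8")  # b"Neuqu\xc3\xa9n"

# Assumption: the display side decodes each byte as one LATIN1 character.
# 0xC3 becomes "Ã" and 0xA9 becomes "©", so the two-byte "é" shows as "Ã©".
misread = utf8_bytes.decode("latin-1")

# With LATIN1 bytes instead, the same display shows the text correctly:
# the accented letter is sent as the single byte 0xE9.
latin1_bytes = text.encode("latin-1")
shown = latin1_bytes.decode("latin-1")
```

Under that assumption the stored data is fine in both cases; only the interpretation of the bytes on the display side differs.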

So, I have a UTF8 DB, (what I think is) UTF8 data, and I can only see it
right by setting \encoding to LATIN1 in psql, or using a graphical client.

If anyone could help me try and understand this mess, I'd really
appreciate it.

Ah, these are my locale settings, in case it helps.

LANG=en_US.UTF-8
LC_CTYPE=C
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE=C
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
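
As a reading aid for the settings above, here is a small Python sketch of the usual POSIX precedence for the character-type locale: LC_ALL first, then LC_CTYPE, then LANG. The helper name `effective_lc_ctype` is hypothetical, and how a given terminal or program actually uses the result is not covered here.

```python
def effective_lc_ctype(env):
    """Return (variable name, value) for the setting that decides LC_CTYPE.

    Hypothetical helper: follows the POSIX order LC_ALL, LC_CTYPE, LANG and
    skips variables that are unset or empty.
    """
    for name in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(name)
        if value:
            return name, value
    # Nothing set: the default is implementation-defined, so report nothing.
    return None, None


# The relevant part of the settings listed above.
settings = {"LANG": "en_US.UTF-8", "LC_CTYPE": "C", "LC_COLLATE": "C"}
winner = effective_lc_ctype(settings)  # ("LC_CTYPE", "C")
```

With these values the explicit LC_CTYPE=C takes precedence over LANG=en_US.UTF-8 for character handling, even though most other categories follow LANG.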

Re: Trying to understand encoding.

From
"Douglas McNaught"
Date:
On 2/15/08, Tomás Di Doménico <tdidomenico@avature.net> wrote:

>  Now I have the data into the UTF8 DB, and using graphical clients
>  everything seems to be great. The thing is, when I query the data via
>  psql, with \encoding UTF8 I get weird data ("NeuquÃ©n" for "Neuquén").
>  However, with \encoding LATIN1, everything looks fine.

Maybe your terminal program doesn't support UTF8, or it's
misconfigured?  If you create a UTF8-encoded file and 'cat' it, is the
output correct?
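
A minimal sketch of that test, written in Python: it creates a small UTF-8 encoded file that can then be displayed with 'cat'. The function name `write_utf8_sample` and the file name in the usage note are hypothetical.

```python
from pathlib import Path


def write_utf8_sample(path):
    """Write one line containing an accented character as UTF-8 bytes.

    Returns the bytes actually written to disk so they can be inspected.
    """
    sample = "Neuqu\u00e9n\n"  # "Neuquén" followed by a newline
    target = Path(path)
    # Encode explicitly so the result does not depend on the current locale.
    target.write_bytes(sample.encode("utf-8"))
    return target.read_bytes()
```

For example, call `write_utf8_sample("utf8_sample.txt")` and then run `cat utf8_sample.txt`. If the terminal handles UTF-8, the accented letter appears as a single character; if it shows two odd characters instead, the terminal is likely decoding the bytes with a different encoding.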

-Doug

Re: Trying to understand encoding.

From
Tomás Di Doménico
Date:
Geez. My default terminal didn't support UNICODE. Shame on me :P

Thanks!

Douglas McNaught wrote:
> On 2/15/08, Tomás Di Doménico <tdidomenico@avature.net> wrote:
>
>>  Now I have the data into the UTF8 DB, and using graphical clients
>>  everything seems to be great. The thing is, when I query the data via
>>  psql, with \encoding UTF8 I get weird data ("NeuquÃ©n" for "Neuquén").
>>  However, with \encoding LATIN1, everything looks fine.
>
> Maybe your terminal program doesn't support UTF8, or it's
> misconfigured?  If you create a UTF8-encoded file and 'cat' it, is the
> output correct?
>
> -Doug