Encoding conversions in psql - Mailing list pgsql-hackers

From Mathijs Brands
Subject Encoding conversions in psql
Date
Msg-id 20040108142119.GA13264@ilse.net
List pgsql-hackers
Howdy,

Can anyone explain to me when psql tries to convert between encodings?
It seems to disregard encodings set with SET CLIENT_ENCODING.

The following reproduces the behaviour I'm seeing:

1. create a UNICODE database

2. run the following:

    set client_encoding to latin1;
    create table bla(a text);
    insert into bla values('meëep');

3. try the following from psql:

    Welcome to psql 7.3.4, the PostgreSQL interactive terminal.

    Type:  \copyright for distribution terms
           \h for help with SQL commands
           \? for help on internal slash commands
           \g or terminate with semicolon to execute query
           \q to quit

    mathijs=# select * from bla;
       a
    -------
     meëep
    (1 row)

    mathijs=# set client_encoding = latin1;
    SET
    mathijs=# select * from bla;
      a
    ------
     meep
    (1 row)

    mathijs=# \encoding latin1
    mathijs=# select * from bla;
       a
    -------
     meëep
    (1 row)
 
After setting CLIENT_ENCODING, the middle character gets dropped. To me
it seems like psql still treats the data it gets from the server as
UTF8: it tries to interpret it as UTF8, sees the LATIN1 byte for ë
(which on its own is indeed not a valid UTF8 sequence) and drops it.
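
To illustrate the suspected mechanism, here is a minimal Python sketch. It
is not psql's actual code path; the helper name decode_dropping_invalid is
made up for the example, and it simply assumes a client that decodes
incoming bytes as UTF8 and discards anything invalid.

```python
# Hypothetical illustration only; this does not reproduce psql internals.
text = "me\u00ebep"  # the string "meëep"

# Bytes a server would send for this string under each client encoding.
latin1_bytes = text.encode("latin-1")  # ë is the single byte 0xEB
utf8_bytes = text.encode("utf-8")      # ë is the two bytes 0xC3 0xAB


def decode_dropping_invalid(raw: bytes) -> str:
    """Decode as UTF-8, silently discarding bytes that are not valid UTF-8."""
    return raw.decode("utf-8", errors="ignore")


print(latin1_bytes)                           # b'me\xebep'
print(decode_dropping_invalid(latin1_bytes))  # meep
print(decode_dropping_invalid(utf8_bytes))    # meëep
```

The LATIN1 bytes lose the middle character when read as UTF8, which
matches the "meep" output above, while the UTF8 bytes survive intact.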

My question is: why does psql seem to think it's receiving UTF8 data
-after- I've changed the client_encoding? I've checked with a network
sniffer that the results returned by the server are, as expected, the
same with or without \encoding. Is this behaviour a bug? If not, it is
not very obvious to me; I would expect psql to keep track of the
encoding set between the server and the client.

Cheers,

Mathijs

