Re: accented characters migraine - Mailing list pgsql-novice
From | John Gunther |
---|---|
Subject | Re: accented characters migraine |
Date | |
Msg-id | 470FB36D.5020205@bucksvsbytes.com |
In response to | Re: accented characters migraine ("Wright, George" <George.Wright@infimatic.com>) |
List | pgsql-novice |
Seems to have done the trick this time. When I tried that earlier the only difference was that accented characters displayed as gray rectangles. It was boneheaded. Thanks.

Wright, George wrote:
> Putty is showing ISO-8859-1 which is Latin. I believe both client and server must be UTF-8.
>
> -----Original Message-----
> From: pgsql-novice-owner@postgresql.org [mailto:pgsql-novice-owner@postgresql.org] On Behalf Of John Gunther
> Sent: Friday, October 12, 2007 11:59 AM
> To: pgsql-novice@postgresql.org
> Subject: [NOVICE] accented characters migraine
>
> It seems to me this ought to be simple and clearly documented but I've
> spent hours researching and experimenting to no avail.
>
> PROBLEM: Entering accented characters in psql often results in the
> error: invalid byte sequence for encoding "UTF8"
>
> ENVIRONMENT:
> Client OS: Windows XP
> Keyboard: United States-International
> Terminal program: putty.exe, Translation: ISO-8859-1:1998 (Latin-1, West
> Europe)
> Server OS: Ubuntu
> Server client app: psql 8.2.4
> Server db app: PostgreSQL 8.2.4
> pg settings:
> client_encoding: UTF8
> lc_collate: en_US.UTF-8
> lc_ctype: en_US.UTF-8
> server_encoding: UTF8
>
> initdb defaulted to UTF-8, which I need because I want ORDER BY to sort
> alphabetically, not by hex code.
>
> When I try to insert a string with an accented character, I generally
> get the above error. Simple example:
> template1=# \d sorttest
> id | integer
> test | text
>
> template1=# insert into sorttest (test) values ('ã');
> ERROR: invalid byte sequence for encoding "UTF8": 0xe32729
> HINT: This error can also happen if the byte sequence does not match
> the encoding expected by the server, which is controlled by
> "client_encoding".
>
> The accented character (a-tilde) is entered from the Windows keyboard
> with the ~a sequence and displays properly in psql. The problem is that
> the server rejects it.
>
> Observations:
> 1) The Unicode hex value of a-tilde is C3 A3 but the error message says
> the invalid sequence is E3 27 29. I don't know what the first byte means
> but the second and third are the quote and right parenthesis characters
> following the a-tilde in my insert statement.
> 2) At various times, data entry as above has started working in a
> session but I can't figure out what I did to make it happen.
> 3) I tried entering the character in hex, as I understand it: insert
> into sorttest (test) values (E'\xc3\xa3');
> This avoids the error but the string value then displays as the 2
> seemingly irrelevant characters ã (A-tilde, British pound)
>
> It looks like I'm caught in some interaction between putty, psql and pg.
> The real problem is much more grave than just manual data entry-- I'm
> trying to migrate a large existing database from another pg server with:
> pg_dumpall -h nnn.nnn.nnn.nnn | psql
> This throws errors each time the COPY commands encounter an accented
> character in the dump.
>
> Any ideas? Is this just a bonehead mistake on my part?
>
> John
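
For anyone reading this thread later, the byte values in the observations above can be reproduced outside PostgreSQL. The following is only a rough Python sketch, not what psql or the server actually runs: Python's own UTF-8 decoder stands in for the server's validation, and the names (`A_TILDE`, `sent`, `decodes_as_utf8`, `mojibake`) are made up for illustration. It assumes the terminal sends Latin-1 bytes, as the PuTTY translation setting quoted above suggests.

```python
# Illustrative sketch only: reproduces the byte values quoted in the thread.
A_TILDE = "\u00e3"  # LATIN SMALL LETTER A WITH TILDE (code point U+00E3)

# A terminal set to ISO-8859-1 sends this single byte for the character.
latin1_bytes = A_TILDE.encode("latin-1")

# A session with client_encoding UTF8 expects these two bytes instead.
utf8_bytes = A_TILDE.encode("utf-8")

# Tail of the INSERT statement as a Latin-1 terminal would send it:
# the character, then the closing quote and right parenthesis.
sent = (A_TILDE + "')").encode("latin-1")


def decodes_as_utf8(data: bytes) -> bool:
    """Return True if the bytes form valid UTF-8 (stand-in for the server check)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Observation 3: valid UTF-8 bytes displayed by a Latin-1 terminal
# show up as two separate characters.
mojibake = utf8_bytes.decode("latin-1")

print(latin1_bytes.hex(), utf8_bytes.hex(), sent.hex())    # e3 c3a3 e32729
print(decodes_as_utf8(sent), decodes_as_utf8(utf8_bytes))  # False True
print([hex(ord(c)) for c in mojibake])                     # ['0xc3', '0xa3']
```

Read this way, E3 is the Latin-1 byte for a-tilde. In UTF-8 a byte of the form 1110xxxx announces a three-byte sequence, which would explain why the error message includes the following quote and parenthesis (27 29). C3 A3 is the UTF-8 encoding of the character rather than its code point, and a Latin-1 terminal renders those two bytes as A-tilde and the pound sign.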