Thread: BUG #13932: German ß not a valid character in psql
The following bug has been logged on the website:

Bug reference:      13932
Logged by:          Burkhardt Renz
Email address:      Burkhardt.Renz@mni.thm.de
PostgreSQL version: 9.5.0
Operating system:   Mac OS X
Description:

Entering the German character 'ß' results in
ERROR: invalid byte sequence for encoding "UTF8": 0xc3 0x77
On 2/7/2016 3:14 AM, Burkhardt.Renz@mni.thm.de wrote:
Entering the German character 'ß' results in ERROR: invalid byte sequence for encoding "UTF8": 0xc3 0x77
C3 77 is not a valid UTF-8 byte sequence.
I believe 'ß' is 0xC3 0x9F, LATIN SMALL LETTER SHARP S
-- john r pierce, recycling bits in santa cruz
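For readers who want to reproduce these byte values outside psql, here is a minimal sketch. Python is an assumption made for illustration (nobody in the thread used it), and `is_valid_utf8` is a hypothetical helper name. It prints the UTF-8 bytes of 'ß' and shows that a strict decoder rejects C3 77.

```python
# U+00DF is LATIN SMALL LETTER SHARP S ('ß').
sharp_s = "\u00df"
encoded = sharp_s.encode("utf-8")
print(encoded.hex())  # c39f


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as strict UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(is_valid_utf8(b"\xc3\x9f"))  # True
print(is_valid_utf8(b"\xc3\x77"))  # False: 0x77 cannot follow the lead byte 0xc3
```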
That’s right. But if I enter ß on the keyboard or from the clipboard, psql takes this as 0x03 0x77 instead of 0xc3 0x97.
-- Burkhardt
On 2/7/2016 4:06 AM, Burkhardt Renz wrote:
> That’s right.
> But if I enter ß on the keyboard or from the clipboard,
> psql takes this as 0x03 0x77 instead of 0xc3 0x97.

That must be a Mac OS X issue or something, because that's sure not a valid UTF-8 sequence.

pierce=# select 'ß', encode('ß', 'hex');
 ?column? | encode
----------+--------
 ß        | c39f
(1 row)

pierce=# show client_encoding;
 client_encoding
-----------------
 UTF8
(1 row)

And in a Linux shell:

$ locale charmap
UTF-8

-- john r pierce, recycling bits in santa cruz
From: Francisco Olarte
On Sun, Feb 7, 2016 at 1:06 PM, Burkhardt Renz <Burkhardt.Renz@mni.thm.de> wrote:
> But if I enter ß on the keyboard or from the clipboard,
> psql takes this as 0x03 0x77 instead of 0xc3 0x97.

Could you try sending it to od (I think Mac OS X should have it, or something similar) to rule out an encoding problem on the Mac OS X side?

I mean something like this, done on Linux, hitting <AltGr>+S, Enter, <Ctrl>+D on a Spanish keyboard:

folarte@paqueton:~/tmp$ od -tx1
ß
0000000 c3 9f 0a
0000003

(The results are the same with the clipboard and the selection.)

Francisco Olarte.
On 2/7/2016 4:06 AM, Burkhardt Renz wrote:
But if I enter ß on the keyboard or from the clipboard, psql takes this as 0x03 0x77 instead of 0xc3 0x97.
In the error you showed before, you said C3 77, not 03 77. And it's supposed to be C3 9F, not C3 97. C3 97 is the multiplication sign, '×'.
-- john r pierce, recycling bits in santa cruz
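The difference between C3 9F, C3 97 and C3 77 is easier to see from the bit layout of a two-byte UTF-8 sequence (110xxxxx 10xxxxxx). The following is a sketch only, written in Python as an illustration. The helper names `encode_two_byte` and `is_continuation` are hypothetical and not part of any library.

```python
def encode_two_byte(code_point: int) -> bytes:
    """Hand-encode a code point in U+0080..U+07FF as two UTF-8 bytes."""
    if not 0x80 <= code_point <= 0x7FF:
        raise ValueError("code point outside the two-byte UTF-8 range")
    lead = 0b11000000 | (code_point >> 6)            # 110xxxxx: top 5 bits
    trail = 0b10000000 | (code_point & 0b00111111)   # 10xxxxxx: low 6 bits
    return bytes([lead, trail])


def is_continuation(byte: int) -> bool:
    """A continuation byte has the bit pattern 10xxxxxx (0x80..0xBF)."""
    return (byte & 0b11000000) == 0b10000000


print(encode_two_byte(0x00DF).hex())  # c39f  (U+00DF, 'ß')
print(encode_two_byte(0x00D7).hex())  # c397  (U+00D7, '×')
print(is_continuation(0x9F))          # True
print(is_continuation(0x77))          # False: 0x77 is plain ASCII 'w'
```

Both 'ß' and '×' share the lead byte C3 and differ only in the trailing byte, which is why a one-digit slip turns one into the other. 0x77 is not in the continuation range at all, so C3 77 can never be valid.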
Encoding on Mac is okay:

od -tx1
ß
0000000 c3 9f 0a
0000003

-- Burkhardt Renz
Burkhardt Renz <Burkhardt.Renz@mni.thm.de> writes:
> Encoding on Mac is okay:
> od -tx1
> ß

Works for me using a Terminal window in Yosemite:

pro:~ tgl$ export LANG=de_DE.UTF-8
pro:~ tgl$ locale charmap
UTF-8
pro:~ tgl$ psql regression
psql (9.6devel)
Type "help" for help.

regression=# create database de encoding 'utf8' lc_collate 'de_DE.UTF-8' lc_ctype 'de_DE.UTF-8' template template0;
CREATE DATABASE
regression=# \c de
You are now connected to database "de" as user "tgl".
de=# show client_encoding ;
 client_encoding
-----------------
 UTF8
(1 row)

de=# show server_encoding ;
 server_encoding
-----------------
 UTF8
(1 row)

de=# select 'ß';  -- made this by typing option-s
 ?column?
----------
 ß
(1 row)

de=# select 'ß'::bytea;
 bytea
--------
 \xc39f
(1 row)

I surmise that you have wrong values for one or another of the settings mentioned above.

regards, tom lane
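One of the settings mentioned above is the locale environment that `locale charmap` reports on. As a rough sketch, and assuming the usual POSIX precedence (LC_ALL overrides LC_CTYPE, which overrides LANG), the following Python snippet works out which variable wins and which codeset it names. The helpers `effective_ctype` and `codeset_of` are hypothetical names introduced here. The snippet only inspects the environment and says nothing about `client_encoding` or `server_encoding`, which still have to be checked in psql as shown above.

```python
import os
from typing import Mapping, Optional


def effective_ctype(env: Mapping[str, str]) -> Optional[str]:
    """Return the locale name that governs character handling.

    Assumed POSIX precedence: LC_ALL, then LC_CTYPE, then LANG.
    Empty values count as unset.
    """
    for name in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(name)
        if value:
            return value
    return None


def codeset_of(locale_name: Optional[str]) -> Optional[str]:
    """Extract the codeset from a name like 'de_DE.UTF-8' (None if absent)."""
    if not locale_name or "." not in locale_name:
        return None
    codeset = locale_name.split(".", 1)[1]
    return codeset.split("@", 1)[0]  # drop a modifier such as '@euro'


# Report what the current shell environment implies.
print(codeset_of(effective_ctype(os.environ)))
```

A result other than UTF-8 (or no result at all, for example when the winning variable is `C` or carries no codeset suffix) would be one place to start looking.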