Hi all,
I have the following problem: according to one of my developers, ≠, ≤, and ≥ are stored in the database as question marks.
I have PostgreSQL installed on both Mac OS X and CentOS 7.
All locale settings on both machines point to UTF-8:
LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=
The statement I am testing with is:
insert into jt1 values ('≤') ;
I can run this either by copy/paste, which is case 1 (and does reproduce the developer's issue), or from an SQL
script, which is case 2.
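For case 2, the script can be generated without pasting the character at all, which takes the terminal out of the picture. A minimal sketch, assuming a POSIX shell with od available; the file name is only an example, and the psql call is left commented out:

```shell
# Write the statement with ≤ given as octal escapes (\342\211\244 = e2 89 a4),
# so nothing depends on how the terminal handles a pasted character.
sql_file="${TMPDIR:-/tmp}/insert_le.sql"   # example file name, an assumption
printf "insert into jt1 values ('\342\211\244') ;\n" > "$sql_file"

# Dump the file as hex to confirm the literal holds e2 89 a4 between the quotes (27).
od -An -tx1 "$sql_file"

# Then run it the same way as any other script, for example:
# psql -d testdb -f "$sql_file"
```

If the hex dump shows 27 e2 89 a4 27, the script carries the right bytes no matter what the terminal does with a paste.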
On OS X:
- case 1 fails
testdb=# insert into jt1 values ('??') ;
ERROR: invalid byte sequence for encoding "UTF8": 0xe2 0xa4 0x27
Note that at paste time ≤ changed into ??
- case 2 is fine
- echo -n '≤' |hexdump -C
00000000 e2 89 a4 |...|
00000003
On CentOS:
- Both cases are fine
- echo -n '≤' |hexdump -C
00000000 e2 89 a4 |...|
00000003
http://www.fileformat.info/info/unicode/char/2264/index.htm
UTF-8 (hex) 0xE2 0x89 0xA4 (e289a4)
So to me the representation is fine in all cases. Also in all cases my encoding is UTF8.
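Putting the bytes from the error message next to the expected ones may help narrow this down. A rough sketch, assuming a POSIX shell with od and iconv; the helper name is_valid_utf8 is made up for this check:

```shell
# Expected UTF-8 encoding of U+2264 (≤), written as octal escapes: e2 89 a4
expected=$(printf '\342\211\244' | od -An -tx1 | tr -d ' \n')

# Bytes quoted in the psql error message: 0xe2 0xa4 0x27
reported=$(printf '\342\244\047' | od -An -tx1 | tr -d ' \n')

echo "expected: $expected"
echo "reported: $reported"

# Hypothetical helper: iconv exits non-zero when stdin is not valid UTF-8.
is_valid_utf8() {
    iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1
}

printf '\342\211\244' | is_valid_utf8 && echo "e2 89 a4: valid UTF-8"
printf '\342\244\047' | is_valid_utf8 || echo "e2 a4 27: not valid UTF-8"
```

The reported sequence looks like e2 89 a4 with the middle byte 0x89 missing, followed by 0x27, which is the ASCII code of the closing single quote. That would suggest the byte is lost on the client side, before the statement reaches the server.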
I am trying to understand where the change occurs on OS X. What is causing the failure?
In the bigger picture, a developer complained about this failure. I am fairly sure this is not a Postgres issue, but I
need to prove it.
Many thanks for your help.
-- Armand