Re: Chinese in Postgres - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: Chinese in Postgres
Date
Msg-id CAM3SWZTv8Zn7EUuQfV9p19a_5pDD+u4+0FiSzLmcTmRy0LNgPw@mail.gmail.com
In response to Chinese in Postgres  ("ciifrancesco@tiscali.it" <ciifrancesco@tiscali.it>)
List pgsql-hackers
On Fri, Aug 16, 2013 at 4:25 AM, ciifrancesco@tiscali.it
<ciifrancesco@tiscali.it> wrote:
> If I insert the data using a C++ program I have empty squares, in this
> format: ��� (3 empty squares for each chinese ideogram as that is the length
> in UTF-8)
> If the string contains chinese mixed with ASCII, the ASCII is OK but the
> Chinese is broken:
> 漢語1-3漢語  --> ������1-3������

You mentioned nothing about what platform this is or how you've built
the program, and nothing about operating system locale.

If this is a Windows program (you mention PuTTY), I'd read up on
differences between what are known as "Unicode" and "Multibyte"
encodings on MSDN:

http://msdn.microsoft.com/en-us/library/2dax2h36.aspx

Of course, this is a total stab in the dark, but then people with the
problem that you describe don't tend to be on *nix systems as a rule.
As someone said upthread, if Postgres does that then it's because the
bytes you sent aren't what you think they are when rendered as UTF-8.
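
One way to test that theory is to look at the raw bytes in the C++
program just before they go to the server. The following is only a
rough sketch, not anything taken from your program: the helper names
is_valid_utf8 and hex_dump are made up for illustration, and it assumes
the text is held in a std::string.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns true if `s` is well-formed UTF-8.
// Rejects stray continuation bytes, truncated sequences, overlong
// encodings, UTF-16 surrogates, and code points above U+10FFFF.
bool is_valid_utf8(const std::string& s) {
    const unsigned char* p = reinterpret_cast<const unsigned char*>(s.data());
    const std::size_t n = s.size();
    std::size_t i = 0;
    while (i < n) {
        const unsigned char c = p[i];
        std::size_t len = 0;
        unsigned long cp = 0;
        if (c < 0x80) { ++i; continue; }                       // plain ASCII
        else if ((c & 0xE0) == 0xC0) { len = 2; cp = c & 0x1F; }
        else if ((c & 0xF0) == 0xE0) { len = 3; cp = c & 0x0F; }
        else if ((c & 0xF8) == 0xF0) { len = 4; cp = c & 0x07; }
        else return false;                                     // invalid lead byte
        if (i + len > n) return false;                         // truncated sequence
        for (std::size_t k = 1; k < len; ++k) {
            if ((p[i + k] & 0xC0) != 0x80) return false;       // not a continuation byte
            cp = (cp << 6) | (p[i + k] & 0x3F);
        }
        if (len == 2 && cp < 0x80) return false;               // overlong forms
        if (len == 3 && cp < 0x800) return false;
        if (len == 4 && cp < 0x10000) return false;
        if (cp > 0x10FFFF) return false;                       // beyond Unicode range
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;        // surrogate halves
        i += len;
    }
    return true;
}

// Hypothetical helper: renders each byte as two lowercase hex digits,
// separated by spaces, so the actual encoding can be inspected by eye.
std::string hex_dump(const std::string& s) {
    static const char digits[] = "0123456789abcdef";
    std::string out;
    for (unsigned char c : s) {
        if (!out.empty()) out += ' ';
        out += digits[c >> 4];
        out += digits[c & 0x0F];
    }
    return out;
}
```

If the string really is UTF-8, each of the two ideograms in your example
should dump as three bytes (e6 bc a2 for 漢, e8 aa 9e for 語) and
is_valid_utf8 should return true. Anything else means the conversion
went wrong before the data ever reached Postgres.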

--
Peter Geoghegan

