Re: BUG #4257: about unicode extend - Mailing list pgsql-bugs

From Tom Lane
Subject Re: BUG #4257: about unicode extend
Msg-id 19196.1214063721@sss.pgh.pa.us
In response to BUG #4257: about unicode extend  ("arli weng" <program@163.com>)
List pgsql-bugs
"arli weng" <program@163.com> writes:
> the command (chinese by utf-8):
> INSERT INTO "title" VALUES(46307243,46307898,'酋鼠𪕨');
> in postgres report error:
> invalid byte sequence for encoding "UNICODE": 0xf0

I don't believe this is actually an 8.3 server.  In 8.1 or later that
encoding would be referred to as "UTF8"; also, 8.1 and later would show
all bytes of the complained-of character, not just the first one.

8.0 and before only support 16-bit Unicode code points (i.e., UTF-8
sequences of at most 3 bytes).  We have support for 4-byte sequences in
8.1 and later.  Also, there were some fixes in this area in Jan 2007, so
whichever branch you use, make sure you get a minor release that's
newer than that.

            regards, tom lane
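
For readers who want to see why the reported byte is 0xf0, the following is
a minimal Python sketch (not part of the original message) that encodes the
string from the failing INSERT as UTF-8 and prints the bytes of each
character.  The helper name utf8_breakdown is made up for illustration.  It
assumes the third character lies outside the 16-bit range, which is
consistent with the error text quoted above.

```python
# The string literal from the failing INSERT statement.
text = "酋鼠𪕨"

def utf8_breakdown(s):
    """Return (code point, UTF-8 bytes) for each character of s.

    Hypothetical helper, used only to illustrate the encoding lengths.
    """
    return [(ord(ch), ch.encode("utf-8")) for ch in s]

for code_point, raw in utf8_breakdown(text):
    hex_bytes = " ".join(f"{b:02x}" for b in raw)
    # Code points above U+FFFF do not fit in 16 bits and need 4 bytes.
    kind = "beyond 16 bits" if code_point > 0xFFFF else "fits in 16 bits"
    print(f"U+{code_point:04X}: {len(raw)} bytes ({hex_bytes}), {kind}")
```

The first two characters encode to 3 bytes each, while the third needs a
4-byte sequence whose lead byte is 0xf0, the byte a pre-8.1 server rejects.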
