Re: About Unicode IVS - Mailing list pgsql-admin

From Tom Lane
Subject Re: About Unicode IVS
Date
Msg-id 1107956.1648549556@sss.pgh.pa.us
In response to Re: About Unicode IVS  (Holger Jakobs <holger@jakobs.com>)
Responses RE: About Unicode IVS  (荒井元成 <n2029@ndensan.co.jp>)
List pgsql-admin
Holger Jakobs <holger@jakobs.com> writes:
> It's totally correct that the two characters are still two characters.
> You would have to normalize the string first, so that the combination 
> becomes one character.

Yeah.  In principle the normalize() function ought to do this for
you.  But it doesn't seem to shorten the given example for me;
I'm not sure if that means the example is incorrect, or if it's
a bug in normalize().

u8=# select octet_length(U&'\+008FBA' || U&'\+0E0102');
 octet_length 
--------------
            7
(1 row)

u8=# select octet_length(normalize(U&'\+008FBA' || U&'\+0E0102'));
 octet_length 
--------------
            7
(1 row)

            regards, tom lane
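
One way to cross-check the result above outside PostgreSQL is to run the
same sequence through an independent normalizer. The sketch below uses
Python's standard unicodedata module, which is an assumption of this note
and not something used in the thread. As far as I know, variation selectors
have no canonical or compatibility mappings, so each form should leave the
sequence at 2 code points and 7 UTF-8 bytes. That would suggest the
unchanged octet_length is expected behavior, though it does not prove
anything about PostgreSQL's normalize() itself.

```python
import unicodedata

# U+8FBA (CJK ideograph) followed by U+E0102 (VARIATION SELECTOR-19),
# the same two code points as in the psql session above.
ivs = "\u8fba\U000E0102"

# Apply every standard normalization form to the sequence.
results = {
    form: unicodedata.normalize(form, ivs)
    for form in ("NFC", "NFD", "NFKC", "NFKD")
}

for form, text in results.items():
    # Print the code point count and the UTF-8 byte length for each form.
    print(form, len(text), len(text.encode("utf-8")))
```

If some form did merge the pair, the code point count would drop to 1 and
the byte length would change with it.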


