Re: Unicode support - Mailing list pgsql-hackers

From Peter Eisentraut
Subject Re: Unicode support
Date
Msg-id 200904200929.45214.peter_e@gmx.net
In response to Re: Unicode support  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Sunday 19 April 2009 18:54:45 Tom Lane wrote:
> Peter Eisentraut <peter_e@gmx.net> writes:
> > On Monday 13 April 2009 20:18:31 - - wrote:
> >> 1) Functions like char_length() or length() do NOT return the number
> >> of characters (the manual says they do), instead they return the
> >> number of code points.
> >
> > I have added a Todo item about possibly fixing this.
>
> I thought the conclusion of the thread was that this wasn't wrong?

The only consensus I saw was that PostgreSQL shouldn't alter the
normalization form of an existing Unicode string.  That's pretty clear.

However, no one was entirely clear on how combining characters are supposed
to be processed.  And even if we think that the current interfaces give the
right answer (a count of code points), there should possibly be other
interfaces that give the other right answer (a count of characters as the
user perceives them).  First of all, it needs more research.
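
To make the two "right answers" concrete, here is a rough sketch in Python
(not PostgreSQL code).  The function names are made up for illustration, and
counting only code points with canonical combining class 0 is a crude
stand-in for real grapheme cluster segmentation (Unicode UAX #29), which it
does not implement.

```python
import unicodedata


def count_code_points(s: str) -> int:
    """Number of Unicode code points in s (what Python's len() reports)."""
    return len(s)


def count_combining_sequences(s: str) -> int:
    """Approximate count of user-perceived characters (hypothetical helper).

    Counts only code points whose canonical combining class is 0, so each
    combining mark is folded into the preceding base character. This is an
    assumption-laden approximation, not UAX #29 grapheme segmentation.
    """
    return sum(1 for ch in s if unicodedata.combining(ch) == 0)


# 'e' followed by U+0301 COMBINING ACUTE ACCENT: one visible character,
# stored as two code points.
decomposed = "e\u0301"

# NFC composes the pair into the single code point U+00E9.
composed = unicodedata.normalize("NFC", decomposed)

print(count_code_points(decomposed))          # 2
print(count_combining_sequences(decomposed))  # 1
print(count_code_points(composed))            # 1
```

The same visible character yields a different code point count depending on
its normalization form, while the second count stays at 1 for both forms.
That is the gap a second interface would have to cover.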

