Re: Unicode support - Mailing list pgsql-hackers

From	Kevin Grittner
Subject	Re: Unicode support
Date	April 14, 2009 15:15:07
Msg-id	49E47D9B.EE98.0025.0@wicourts.gov
In response to	Re: Unicode support (Greg Stark <stark@enterprisedb.com>)
List	pgsql-hackers

Greg Stark <stark@enterprisedb.com> wrote: 
> Peter Eisentraut <peter_e@gmx.net> wrote:
>> SELECT U&'\00E9', char_length(U&'\00E9');
>>  ?column? | char_length
>> ----------+-------------
>>  é        |           1
>> (1 row)
>>
>> SELECT U&'\0065\0301', char_length(U&'\0065\0301');
>>  ?column? | char_length
>> ----------+-------------
>>  é        |           2
>> (1 row)
> 
> What's really at issue is "what is a string?". That is, is it a
> sequence of characters or a sequence of code points?
Doesn't the SQL standard refer to them as "character string literals"?
The function is called character_length or char_length.
I'm curious -- can every multi-code-point character be normalized to a
single-code-point character?
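
One way to probe that question outside the database is a minimal sketch using Python's standard unicodedata module. It is not anything PostgreSQL provides, and the helper name nfc_length is made up for illustration. It applies NFC normalization, which composes a base letter and its combining marks into one code point where a precomposed form exists, to the sequence from the second query above and to 'q' plus a combining tilde, for which Unicode defines no precomposed character:

```python
import unicodedata


def nfc_length(s: str) -> int:
    """Return the number of code points in s after NFC normalization.

    Hypothetical helper for illustration only.
    """
    return len(unicodedata.normalize("NFC", s))


# 'e' + COMBINING ACUTE ACCENT, the sequence used in the second query.
decomposed_e = "e\u0301"

# 'q' + COMBINING TILDE; Unicode has no precomposed code point for this.
q_tilde = "q\u0303"

# Code points before and after normalization.
print(len(decomposed_e), nfc_length(decomposed_e))  # 2 1
print(len(q_tilde), nfc_length(q_tilde))            # 2 2
```

If this sketch is right, the answer is no: some sequences collapse to a single code point, but others stay multi-code-point even after normalization, so counting code points after normalizing would still not count characters in every case.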
-Kevin
