Re: Unicode Normalization - Mailing list pgsql-hackers

From	Andrew Dunstan
Subject	Re: Unicode Normalization
Date	September 24, 2009 12:59:27
Msg-id	4ABB974D.5000104@dunslane.net
In response to	Re: Unicode Normalization ("David E. Wheeler" <david@kineticode.com>)
Responses	Re: Unicode Normalization
List	pgsql-hackers

David E. Wheeler wrote:
> On Sep 24, 2009, at 6:24 AM, pg@thetdh.com wrote:
>
>> In a context using normalization, wouldn't you typically want to 
>> store a normalized-text type that could perhaps (depending on locale) 
>> take advantage of simpler, more-efficient comparison functions?
>
> That might be nice, but I'd be wary of a geometric multiplication of 
> text types. We already have TEXT and CITEXT; what if we had your NTEXT 
> (normalized text) but I wanted it to also be case-insensitive?

Actually, I don't think it's necessarily a good idea at all. If a user 
inputs a perfectly valid piece of UTF8 text, we should be able to give 
it back to them exactly, whether or not it's in normalized form. The 
normalized forms are useful for certain comparison purposes, but they 
don't affect the validity of the text. CITEXT doesn't mangle what is 
stored, just how it's compared.
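
To illustrate the distinction, here is a minimal sketch in Python (not PostgreSQL code). The helper name nfc_equal and the sample strings are assumptions made up for this example; the point is only that normalization can be applied at comparison time while the stored value stays byte-for-byte what the user entered.

```python
import unicodedata


def nfc_equal(a: str, b: str) -> bool:
    """Compare two strings by their NFC forms without modifying either one.

    Hypothetical helper for illustration only.
    """
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# Two valid spellings of the same visible text.
composed = "caf\u00e9"      # e-acute as a single code point
decomposed = "cafe\u0301"   # plain "e" followed by a combining acute accent

# Pretend the user submitted the decomposed form and we stored it untouched.
stored = decomposed

raw_equal = stored == composed               # plain code point comparison
normalized_equal = nfc_equal(stored, composed)  # comparison on normalized forms
unchanged = stored == "cafe\u0301"           # stored value is still what was entered
```

Here the raw comparison fails, the normalized comparison succeeds, and the stored string is never rewritten, which is the same split CITEXT makes between storage and comparison.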

cheers

andrew
