Re: WIP patch: Collation support - Mailing list pgsql-hackers

From Zdenek Kotala
Subject Re: WIP patch: Collation support
Date
Msg-id 48CFCB8F.7050308@sun.com
In response to Re: WIP patch: Collation support  (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
List pgsql-hackers
Heikki Linnakangas wrote:
> Martijn van Oosterhout wrote:
>> On Wed, Sep 10, 2008 at 12:51:02PM +0300, Heikki Linnakangas wrote:
>>>> Since the set of collations isn't exactly denumerable, we need some way
>>>> to allow the user to specify the collation they want. The only
>>>> collation PostgreSQL knows about is the C collation. Anything else is
>>>> user-defined.
>>> Let's just use the name of the OS locale, like we do now. Having a 
>>> pg_collation catalog just moves the problem elsewhere: we'd still 
>>> need something in pg_collation to tie the collation to the OS locale.
>>
>> There's not a one-to-one mapping between collation and locale name. A
>> locale name includes information about the charset and a collation may
>> have parameters like case-sensitivity and pad-attribute which are not
>> present in the locale name. You need a mapping anyway, which is what
>> this table is for.
> 
> Ideally, we would delegate the case-sensitivity and padding to the 
> collation implementation (ie. OS setlocale() or ICU). That said, I don't 
> think operating systems normally ship case-insensitive variants of 
> locales by default, so I agree it would be nice if we could implement 
> that ourselves. Still, we could identify case-insensitive locale names for 
> example by a suffix, like "en_GB.UTF8.case-insensitive".

The idea was to call to_upper (or to_lower) on the strings before the 
case-sensitive collation processing. It is difficult to determine from a 
suffix whether a locale is case-sensitive or not.
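
A minimal sketch of that idea, not the code from the patch: fold both 
strings to lower case, then hand them to the ordinary case-sensitive 
strcoll(). The names fold_lower and ci_collate are hypothetical, and 
tolower() is only adequate for single-byte encodings; a multibyte 
encoding such as UTF-8 would presumably need a wide-character conversion 
first.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: return a malloc'd lower-cased copy of s.
 * Assumes a single-byte encoding; returns NULL if allocation fails.
 */
static char *
fold_lower(const char *s)
{
    size_t len = strlen(s);
    char  *out = malloc(len + 1);
    size_t i;

    if (out == NULL)
        return NULL;
    for (i = 0; i < len; i++)
        out[i] = (char) tolower((unsigned char) s[i]);
    out[len] = '\0';
    return out;
}

/*
 * Hypothetical helper: case-insensitive comparison layered on top of the
 * case-sensitive strcoll(), which uses the current LC_COLLATE setting.
 * Returns <0, 0 or >0 like strcoll().
 */
static int
ci_collate(const char *a, const char *b)
{
    char *fa = fold_lower(a);
    char *fb = fold_lower(b);
    int   result;

    if (fa == NULL || fb == NULL)
    {
        free(fa);
        free(fb);
        abort();            /* out of memory: nothing sensible to return */
    }
    result = strcoll(fa, fb);
    free(fa);
    free(fb);
    return result;
}
```

With this approach the locale name itself carries no case-sensitivity 
information; whether to fold would have to be recorded somewhere else, 
for example alongside the collation definition.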
    Zdenek

PS: We can discuss it in Prato

