Re: [WIP] patch - Collation at database level - Mailing list pgsql-hackers

From Martijn van Oosterhout
Subject Re: [WIP] patch - Collation at database level
Date
Msg-id 20080802140727.GB4098@svana.org
In response to Re: [WIP] patch - Collation at database level  ("Radek Strnad" <radek.strnad@gmail.com>)
List pgsql-hackers
On Sat, Aug 02, 2008 at 03:39:18PM +0200, Radek Strnad wrote:
> >  I also think that the clauses you have attached to your CREATE
> > COLLATION statement (case-insensitive, accent-insensitive) are an
> > oversimplification of reality.  I suggest you look up the Unicode
> > collation algorithm to learn about how collations work in practice.
>
> I already did in the very beginning of the development. The reason why I'm
> not implementing the whole Unicode collation algorithm is that this patch
> should be a sort of framework. You'll be able to use different collation
> functions, not only POSIX locales, so further development towards the full
> Unicode collation algorithm is possible.

Agreed. Of course it's a simplification of reality. POSIX locales are a
simplification of reality too, but they're the only form of collation
currently available to us. And quite frankly, I don't believe PostgreSQL
should be in the business of writing collation algorithms; we don't have
the expertise.

FWIW, I think case-insensitive and accent-insensitive are useful modifiers
that we should aim to support in the future.
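
As a rough illustration of what such modifiers mean, here is a minimal
sketch in Python. It is not taken from the patch and is not the Unicode
collation algorithm; the function name fold_key and its parameters are
made up for this example. It only shows case and accent folding applied
to a comparison key, which is itself a simplification.

```python
import unicodedata

def fold_key(text, case_insensitive=True, accent_insensitive=True):
    """Build a comparison key (hypothetical helper, illustration only)."""
    # Decompose so that accents become separate combining marks.
    key = unicodedata.normalize("NFD", text)
    if accent_insensitive:
        # Drop the combining marks, leaving only the base characters.
        key = "".join(ch for ch in key if not unicodedata.combining(ch))
    if case_insensitive:
        # casefold() is the Unicode-aware way to ignore case.
        key = key.casefold()
    # Recompose whatever marks are left so equal strings get equal keys.
    return unicodedata.normalize("NFC", key)

words = ["b", "A", "á"]
# Stable sort on the folded key: "A" and "á" compare equal, "b" sorts last.
ordered = sorted(words, key=fold_key)
```

A real collation also has to deal with language-specific ordering,
contractions and multi-level weights, which is why this is better left
to the locale or a dedicated library.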

> At the end of the next week I'll publish my bachelor thesis concerning this
> topic where everything will be explained in detail, so stay tuned.

Good luck!

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Please line up in a tree and maintain the heap invariant while
> boarding. Thank you for flying nlogn airlines.
