Re: Patch for collation using ICU - Mailing list pgsql-hackers

From Bruce Momjian
Subject Re: Patch for collation using ICU
Date
Msg-id 200505071406.j47E6h600785@candle.pha.pa.us
In response to Re: Patch for collation using ICU  (Palle Girgensohn <girgen@pingpong.net>)
Responses Re: Patch for collation using ICU
Re: Patch for collation using ICU
List pgsql-hackers
Palle Girgensohn wrote:
> 
> --On Saturday, May 07, 2005 23.15.29 +1000 John Hansen <john@geeknet.com.au> 
> wrote:
> 
> > Btw, I had been planning to propose replacing every single one of the
> > built in charset conversion functions with calls to ICU (thus making pg
> > _depend_ on ICU), as this would seem like a cleaner solution than for us
> > to maintain our own conversion tables.
> >
> > ICU also has a fair few conversions that we do not have at present.

That is a much larger issue, similar to our shipping our own timezone
database.  What does it buy us?

	o  Do we ship it in our tarball?
	o  Is the license compatible?
	o  Does it remove utils/mb conversions?
	o  Does it allow us to index LIKE (next high char)?
	o  Does it allow us to support multiple encodings in a single
	   database easier?
	o  performance?
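
For context on the "next high char" item: a prefix search such as
LIKE 'abc%' can use an ordinary index if it is rewritten as the range
col >= 'abc' AND col < 'abd', where the upper bound is the prefix with
its last byte incremented. The following is only a rough sketch of that
idea, assuming plain byte-wise (C locale) ordering. The helper name
next_greater_string is hypothetical and is not the backend's actual
code. Whether an ICU collation would permit the same trick is exactly
the open question.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: return a malloc'd string that sorts after every
 * string starting with "prefix" under byte-wise comparison, or NULL if
 * no such bound exists (empty prefix, all bytes 0xFF, or out of memory).
 * The caller must free() the result.
 */
char *
next_greater_string(const char *prefix)
{
    size_t      len = strlen(prefix);
    char       *result = malloc(len + 1);

    if (result == NULL)
        return NULL;
    memcpy(result, prefix, len + 1);

    while (len > 0)
    {
        unsigned char c = (unsigned char) result[len - 1];

        if (c < 0xFF)
        {
            /* Increment the last byte and truncate after it. */
            result[len - 1] = (char) (c + 1);
            result[len] = '\0';
            return result;
        }
        /* Last byte is already the maximum: drop it, try the one before. */
        len--;
    }

    free(result);
    return NULL;
}
```

With this sketch, the prefix "abc" yields the bound "abd". Under a
linguistic collation the incremented string is not guaranteed to sort
after every string with that prefix, so the bound is only safe for
byte-wise ordering.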
 

> I just had a similar thought. And why use ICU only for multibyte charsets? 
> If I use LATIN1, I still expect upper('ß') => SS, and I don't get it... 
> Same for the Turkish example.

We assume the native toupper() can handle single-byte character
encodings.  We use towupper() only for wide character sets.
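
As a rough illustration of that split (a sketch only, not the backend's
actual code; the function names upper_single_byte and upper_wide are
made up for this example), the two paths look roughly like this:

```c
#include <ctype.h>
#include <stddef.h>
#include <wchar.h>
#include <wctype.h>

/*
 * Single-byte encodings: upper-case in place, one byte at a time, using
 * the C library's toupper() under the current locale.
 */
void
upper_single_byte(char *s)
{
    for (; *s != '\0'; s++)
        *s = (char) toupper((unsigned char) *s);
}

/*
 * Wide character sets: upper-case in place, one wchar_t at a time, using
 * towupper().  The caller is assumed to have already converted the
 * multibyte input to wchar_t (for example with mbstowcs()).
 */
void
upper_wide(wchar_t *ws)
{
    for (; *ws != L'\0'; ws++)
        *ws = (wchar_t) towupper((wint_t) *ws);
}
```

Both toupper() and towupper() map exactly one character to one
character, so neither path can turn 'ß' into the two-character "SS".
The output is always the same length as the input.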


-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
 

