Re: ORDER BY and Unicode - Mailing list pgsql-novice

From Stephan Szabo
Subject Re: ORDER BY and Unicode
Msg-id 20040512144759.Y82605@megazone.bigpanda.com
In response to Re: ORDER BY and Unicode  ("M. Bastin" <marcbastin@mindspring.com>)
List pgsql-novice
On Wed, 12 May 2004, M. Bastin wrote:

> > > And how can I do an initdb so that sorting on Unicode will work for
> > > French, Greek, Japanese, etc. users of a single database?
> >
> >AFAIK, you can't really at this time.  With an appropriately crafted
> >locale, you could probably get reasonably close, but I've never actually
> >tried to work with creating one so I don't know what's involved. And, if
> >two languages had different rules for two characters you'd not be
> >supporting both.
>
> Thanks Stephan!  I've found my list of locales. It's a pity only one
> language can be used at a time but as you say there are conflicting
> rules anyway.
>
> The docs say there is a speed penalty on using locales.  Does anyone
> have any idea on how severe this is?  I'm wondering whether I should

I'm not really an expert, but since you're already using Unicode I don't
think the penalty is going to be major. The one caveat: if you're doing
LIKE queries, you should look at the Operator Classes section of the
documentation, which describes the *_pattern_ops operator classes. In a
non-C locale an ordinary index can't be used for LIKE; those classes exist
for that case.
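
As a rough sketch of what that section describes (the table, column, and
index names here are made up for illustration, not taken from this thread),
a second index declared with text_pattern_ops sits alongside the ordinary
one:

```sql
-- Hypothetical table and column names, for illustration only.
CREATE TABLE customers (name text);

-- Ordinary index: follows the database locale's ordering, so it serves
-- ORDER BY and the <, <=, >, >= comparisons.
CREATE INDEX customers_name_idx ON customers (name);

-- Extra index using text_pattern_ops: compares strictly character by
-- character, so it can serve left-anchored LIKE patterns in a non-C locale.
CREATE INDEX customers_name_like_idx ON customers (name text_pattern_ops);

-- A left-anchored pattern is the kind of query the second index is for.
SELECT name FROM customers WHERE name LIKE 'Bas%';
```

Both indexes are kept because each supports a different kind of comparison;
whether the planner actually picks one depends on the data and the query.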

> use the translate() function after all because of this.  It would
> solve multilingual issues to a certain level and there wouldn't be a
> speed penalty since the indexes would be built on the translate()
> function too.

The translate version would presumably work for cases where you want
multiple characters to sort to the same position, but if you want, say, an
accented A to follow a regular A, I think that might be difficult to
formulate.
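
To make the idea concrete, here is a minimal sketch of the translate()
approach. The table, column, and index names are hypothetical, and the
character mapping is only a small sample covering a few accented forms of
"a" and "e"; a real mapping would have to list every character of interest.

```sql
-- Hypothetical table and column names, for illustration only.
CREATE TABLE customers (name text);

-- Index on the folded value. The two translate() lists must line up
-- position by position: each character in the second argument is replaced
-- by the character at the same position in the third.
CREATE INDEX customers_name_folded_idx
    ON customers (translate(name, 'àáâäèéêë', 'aaaaeeee'));

-- Sorting on the same expression lets the index above be considered.
SELECT name
FROM customers
ORDER BY translate(name, 'àáâäèéêë', 'aaaaeeee');
```

With this mapping 'à' and 'a' produce exactly the same sort key, which is
the "multiple characters sort to the same position" case. Nothing in the
key distinguishes them, so their order relative to each other is not
defined by this expression.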
