Re: langauges, locales, regex, LIKE - Mailing list pgsql-general

From Dennis Gearon
Subject Re: langauges, locales, regex, LIKE
Msg-id 40DAFF53.8080703@fireserve.net
In response to Re: langauges, locales, regex, LIKE  (Richard Huxton <dev@archonet.com>)
List pgsql-general
Richard Huxton wrote:

> Dennis Gearon wrote:
>
>> If I've read everything right, in order to get:
>>
>>     multiple languages on a site
>>
>> with the functionality of ALL of:
>>     REGEX
>>     LIKE
>>     Correctly sorted text
>>
>> A site would have to:
>>
>>     create a cluster for every language needed
>>     run a separate database instance for every language
>>     and have the database instances each have their own port
>>     and use 8 bit encoding for that specific language
>
>
> You'd need a separate database, not a separate cluster. Each database
> can then have their own encoding and locale.

If I want all the languages to be running concurrently, I can't switch the
cluster that a database is connected to on the fly, right? A database stays
in the cluster it was started in, right? So, if that's true, then I need
separate database instances if I want truly accurate sorting.
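
To make that layout concrete, here is a rough sketch of how an application
might route each language to its own instance. The port numbers, locale
names, database name and the dsn_for() helper are all made up for
illustration; nothing in this thread prescribes them.

```python
# Hypothetical layout: one PostgreSQL instance (cluster) per language,
# each initialised with its own locale and listening on its own port.
INSTANCES = {
    "en": {"port": 5432, "locale": "en_US"},
    "fr": {"port": 5433, "locale": "fr_FR"},
    "de": {"port": 5434, "locale": "de_DE"},
}


def dsn_for(language, dbname="site", host="localhost"):
    """Build a keyword/value connection string for the instance serving `language`."""
    instance = INSTANCES[language]
    return f"host={host} port={instance['port']} dbname={dbname}"


if __name__ == "__main__":
    for lang in INSTANCES:
        print(lang, "->", dsn_for(lang))
```

The application would pick the connection string from the user's language,
so all instances can run side by side.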

>
>> because:
>>
>>     Sorting is fixed at cluster/directory creation per single
>>         database instance
>
>
> To clarify, a cluster is a group of databases that share user logins and
> can all be accessed via the same server.
>
>>     And LIKE only works on C Locale with an eight bit encoding
>>     and sorting (MAYBE?) works only on 8 bit encoding
>>     when using C Locale.
>
>
> You can sort, and I believe use LIKE on UTF etc. However, index use is a
> different matter.

Yup, there is no facility to declare character sets for indexes.

>
>> If anyone can correct me on this, I'd love to hear it.
>>
>> Boy, the old LOCALE system has really got to go someday.
>
>
> The issue isn't so much the difficulty of supporting multiple locales
> (AFAIK). I believe it's more to do with interactions. If you have a
> table containing multiple languages in the same column, what does it
> mean to sort that table? Do you sort by language-name then by languages?
>   If you don't, what rules do you follow?
>
> What happens if we compare different languages?
> Does fr/fr:"a" == en/gb:"a"?
> Does en/gb:"hello" == en/us:"hello"?
>
> Messy, isn't it?
>
Without language-specific characters, they will sort exactly the same.
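
A small sketch of what I mean, using two toy sort keys rather than real
locale rules. c_locale_key() and accent_folding_key() are names I made up:
the first compares raw UTF-8 bytes, which is roughly what the C locale does;
the second is only a stand-in for a language-aware collation and is not how
any actual locale is implemented.

```python
import unicodedata


def c_locale_key(s):
    # Compare by the bytes of the UTF-8 encoding, roughly like the C locale.
    return s.encode("utf-8")


def accent_folding_key(s):
    # Toy stand-in for a language-aware collation: compare base letters
    # first, then fall back to the original string as a tiebreaker.
    decomposed = unicodedata.normalize("NFD", s)
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return (base.lower(), s)


plain = ["hello", "apple", "zebra", "mango"]
accented = ["zebra", "été", "eta", "apple"]

# Plain lowercase ASCII words come out in the same order under both keys.
same_for_plain = sorted(plain, key=c_locale_key) == sorted(plain, key=accent_folding_key)

# With an accented word in the list, the two orders diverge.
same_for_accented = sorted(accented, key=c_locale_key) == sorted(accented, key=accent_folding_key)

if __name__ == "__main__":
    print(same_for_plain, same_for_accented)
```

With the byte-order key, "été" lands after "zebra"; with the folding key it
lands between "eta" and "zebra". The plain list is unaffected either way.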
