Thread: locale and character set

locale and character set

From
Tsirkin Evgeny
Date:
Hi list!
Is there any relation between the locale and the character set?
For example, if I store the data as SQL_ASCII, can I still use a UTF-8
locale?
In my case the data is stored as SQL_ASCII, but I know it is actually
UTF-8, and I am doing an upgrade.
So I would like to keep SQL_ASCII as the internal format in case there are
any illegal characters in the database,
but still allow text search on non-English characters.
Note that I don't want any server-client data conversion; I just want
the server to assume that it
is dealing with UTF-8.
Thanks.
Evgeny.

Re: locale and character set

From
Peter Eisentraut
Date:
On Thursday, 31 March 2005 15:49, Tsirkin Evgeny wrote:
> Is there any relation between the locale and character set?

Every locale expects a certain character set to be used.  You can find that
out using

$ LC_ALL=foo locale charmap

If you want things to function correctly, you have to use a character set that
matches the one the locale expects.
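
The same lookup can be sketched from Python. This is only an illustration, assuming a POSIX system where locale.nl_langinfo is available; the helper name expected_charmap is made up here, and it returns None when the C library rejects the locale name.

```python
import locale


def expected_charmap(locale_name):
    """Return the character set a locale expects, or None if it cannot be set."""
    # Query the current LC_CTYPE setting so it can be restored afterwards.
    saved = locale.setlocale(locale.LC_CTYPE)
    try:
        locale.setlocale(locale.LC_CTYPE, locale_name)
        # CODESET is the same value that `locale charmap` reports.
        return locale.nl_langinfo(locale.CODESET)
    except locale.Error:
        return None
    finally:
        locale.setlocale(locale.LC_CTYPE, saved)


# "C" is used because it exists everywhere; substitute the locale you plan to use.
print(expected_charmap("C"))
```

The printed name differs between C libraries, so compare it against the database encoding rather than against a fixed string.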

> For example, if I store the data as SQL_ASCII, can I still use a UTF-8
> locale?
> In my case the data is stored as SQL_ASCII, but I know it is actually
> UTF-8, and I am doing an upgrade.

That should work, but of course you have no guarantees that the UTF-8 is
valid, so the sorting routines and others may behave erratically if they find
an error.
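
Since SQL_ASCII performs no validation, it may be worth checking the stored bytes before relying on a UTF-8 locale. A minimal sketch, assuming the column values have already been fetched as raw bytes; find_invalid_utf8 and the sample rows are hypothetical and not part of PostgreSQL.

```python
def find_invalid_utf8(rows):
    """Return (row index, byte offset) for each value that is not valid UTF-8."""
    bad = []
    for i, raw in enumerate(rows):
        try:
            raw.decode("utf-8")  # strict decoding rejects malformed sequences
        except UnicodeDecodeError as exc:
            bad.append((i, exc.start))  # offset of the first offending byte
    return bad


# Hypothetical sample: the last value is Latin-1, not UTF-8.
sample = [b"plain ascii", "Grüße".encode("utf-8"), b"caf\xe9"]
print(find_invalid_utf8(sample))  # -> [(2, 3)]
```

Rows reported this way are the ones where locale-aware sorting could misbehave.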

--
Peter Eisentraut
http://developer.postgresql.org/~petere/


Re: locale and character set

From
Tsirkin Evgeny
Date:
Peter Eisentraut wrote:

>On Thursday, 31 March 2005 15:49, Tsirkin Evgeny wrote:
>
>>Is there any relation between the locale and character set?
>
>Every locale expects a certain character set to be used.  You can find that
>out using
>
>$ LC_ALL=foo locale charmap
>
>If you want things to function correctly, you have to use a character set that
>matches the one the locale expects.

Of course. I understand that; I was just interested in cases where no
particular charset is enforced, like SQL_ASCII,
but I still want the sorting and searching to work.
The question is also what the server will do in such a case if it finds a
character that is not UTF-8.
I understand from the manual that it just shows its hex value. Is that right?

>>For example, if I store the data as SQL_ASCII, can I still use a UTF-8
>>locale?
>>In my case the data is stored as SQL_ASCII, but I know it is actually
>>UTF-8, and I am doing an upgrade.
>
>That should work, but of course you have no guarantees that the UTF-8 is
>valid, so the sorting routines and others may behave erratically if they find
>an error.

The question is how it will behave: will this raise an error, or will it
just not work correctly?



Re: locale and character set

From
Peter Eisentraut
Date:
On Thursday, 31 March 2005 16:12, Tsirkin Evgeny wrote:
> The question is how it will behave: will this raise an error, or will it
> just not work correctly?

That depends entirely on your operating system's C library.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/


