On Saturday, 29.10.2005, at 13:11 +0400, Zet wrote:
> Hi
>
> Which charset needs to be set in the database for Cyrillic?
>
> Until now I've used WIN, but today I found a problem
win?
>
> for example:
>
> SELECT *
> FROM table
> WHERE a = 'слово'
>
> returns a record where a = 'фраза'
>
> After that I tried UNICODE,
> but for most Cyrillic words PG gives an error like
> "invalid byte sequence for encoding "UNICODE":..."

Well, for Cyrillic you have the following options:

  cp-1251 (Windows codepage)
  koi8-r  (traditional charset)
  utf-8   (universal, if you want Latin characters to coexist
           with Cyrillic; this is also what you get with the
           UNICODE setting in PG)
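
To make the difference between these three options concrete, here is a small Python sketch (nothing PG-specific; the helper name is only illustrative). It encodes the same Cyrillic word in each charset and checks which byte strings are valid UTF-8. Single-byte cp1251 or KOI8-R data is generally not valid UTF-8, which is presumably the kind of input behind the "invalid byte sequence" error quoted above.

```python
WORD = "слово"  # sample word taken from the query quoted above

def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

# The same text, stored in the three charsets discussed above.
encoded = {name: WORD.encode(name) for name in ("cp1251", "koi8_r", "utf-8")}

for name, raw in encoded.items():
    # Print the size, whether it would pass as UTF-8, and the raw bytes in hex.
    print(f"{name:7} {len(raw):2d} bytes  valid UTF-8: {is_valid_utf8(raw)}  {raw.hex(' ')}")
```

The two single-byte charsets use one byte per letter but different byte values, while UTF-8 needs two bytes per Cyrillic letter, so the three are not interchangeable on the wire.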

You should use the same encoding in the database as
in your application, to make things easier.

Now, you already have some data in your database.
So if you want to change the encoding, you need
to recode your char, varchar and text columns.

First, find out the encoding of your database:

  SHOW server_encoding;

If this matches what you want, you are done with
this step.

If you get something like SQL_ASCII, then you don't
know which charset actually got used. In that case,
inspect your application to see which encoding it
used to store text.
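
If you end up with SQL_ASCII and cannot tell from the application which charset went in, a rough heuristic may help. The sketch below is Python, not anything PG provides; the function names and the sample text are made up for illustration. It relies on cp1251 and KOI8-R keeping upper and lower case letters in swapped byte ranges, so decoding with the wrong one turns normal text into mostly capitals. Treat the result as a hint only.

```python
def count_lowercase_cyrillic(text: str) -> int:
    """Count lowercase Russian letters in the decoded text."""
    return sum(1 for ch in text if "а" <= ch <= "я" or ch == "ё")

def guess_cyrillic_encoding(raw: bytes) -> str:
    """Heuristic guess: 'utf-8', 'cp1251' or 'koi8_r'."""
    try:
        raw.decode("utf-8")
        return "utf-8"  # every multi-byte sequence was well-formed
    except UnicodeDecodeError:
        pass
    # Decode with both single-byte candidates and prefer the one that
    # yields more lowercase letters (ordinary text is mostly lowercase).
    scores = {
        name: count_lowercase_cyrillic(raw.decode(name, errors="replace"))
        for name in ("cp1251", "koi8_r")
    }
    return max(scores, key=scores.get)

sample = "пример текста из приложения"  # hypothetical application text
for name in ("utf-8", "cp1251", "koi8_r"):
    print(name, "->", guess_cyrillic_encoding(sample.encode(name)))
```

Very short strings, or text written entirely in capitals, will fool this, so check a few rows by eye before trusting it.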

Make a complete backup, then check your
lc_* variables:

  SHOW lc_collate;
  SHOW lc_ctype;    (and so on)

If it's not something like

  ru_RU.UTF-8 (if it's UNICODE you want to use)

then you had better run initdb again with the
correct locale setting. This is important
for lower(), upper(), ILIKE, ORDER BY, etc.
to work.

Recreate your DB with encoding UNICODE (or
whatever you want to use; it has to match the
locale).

Create a text dump out of your backup via
pg_restore (it's recommended to back up using pg_dump -Fc).

In the text dump, replace the occurrences of

  SET CLIENT_ENCODING TO '...'; (this is what your
  original database had)

with the encoding your text is actually stored in,
if that differs (for example after SQL_ASCII):

  SET CLIENT_ENCODING TO 'WIN';

(This can be done with sed if you don't want
to load the whole dump into your editor.)

Restore the database with the new script.
Postgres will take care of the charset conversion.
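
For the replacement step, here is a minimal Python sketch as an alternative to sed. The function name and the three-line dump excerpt are hypothetical, and the encoding name passed in is only a placeholder: use whichever name the step above calls for. The pattern accepts both the `SET client_encoding = '...';` and the `SET CLIENT_ENCODING TO '...';` spelling.

```python
import re

# Matches either spelling of the statement at the start of a line.
CLIENT_ENCODING_RE = re.compile(
    r"^SET\s+client_encoding\s*(?:=|TO)\s*'[^']*'\s*;",
    re.IGNORECASE | re.MULTILINE,
)

def set_client_encoding(dump_sql: str, encoding: str) -> str:
    """Return the dump text with every SET client_encoding line rewritten."""
    return CLIENT_ENCODING_RE.sub(f"SET client_encoding = '{encoding}';", dump_sql)

# Hypothetical excerpt of a text dump, for illustration only.
dump = (
    "SET client_encoding = 'SQL_ASCII';\n"
    "SET check_function_bodies = false;\n"
    "CREATE TABLE t (a text);\n"
)
print(set_client_encoding(dump, "WIN"))
```

For a real dump file you would want to read and write it so that the data bytes pass through untouched, for instance by opening it with encoding="latin-1" in both directions, since only the one SET line should change.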