Thread: cyrillic and sort order (ORDER BY)

cyrillic and sort order (ORDER BY)

From
Mihail Marinov
Date:
Hello,

My problem is that ORDER BY does not alphabetically order strings in
proper way. Worst - it does sometimes work right but I can't find out
exactly why. But after a while it stops until I rebuild and
reconfigure.

I have read *everything* I managed to find in the archives and the
docs. I am using RedHat 6.2, 2.2.14; PostgreSQL 7.0.2 (at the start),
now (hoping it would have magically save me from this nightmare)
7.0.3. Built with --enable-locale and --enable-multibyte=WIN (does
this mean more than the default encoding choice for createdb?; I have
used KOI8 with the same "success"). I have LC_ALL set to ru_RU.KOI8
and a database with encoding KOI8. The postmaster is started with
these locale settings. I use charset.conf with the shipped recoding
tables. All my clients are win. I hope I haven't missed something
important. Can someone, please, help me with a little. I have spent
huge amounts of time on this and all get is an erratic behaviour
which drives me crazy. I would have preferred if it was not working
at all.

Thank you for your time.



Re: cyrillic and sort order (ORDER BY)

From
Peter Eisentraut
Date:
Mihail Marinov writes:

> My problem is that ORDER BY does not alphabetically order strings in
> proper way. Worst - it does sometimes work right but I can't find out
> exactly why. But after a while it stops until I rebuild and
> reconfigure.

You must have the same collation locale setting (LC_COLLATE, or LC_ALL as
default) when running initdb and when running postmaster.  Changing the
setting in between will result in random and incorrect results and
database corruption.

--
Peter Eisentraut      peter_e@gmx.net       http://yi.org/peter-e/


Re: cyrillic and sort order (ORDER BY)

From
Tom Lane
Date:
Peter Eisentraut <peter_e@gmx.net> writes:
> Mihail Marinov writes:
>> My problem is that ORDER BY does not alphabetically order strings in
>> proper way. Worst - it does sometimes work right but I can't find out
>> exactly why. But after a while it stops until I rebuild and
>> reconfigure.

> You must have the same collation locale setting (LC_COLLATE, or LC_ALL as
> default) when running initdb and when running postmaster.  Changing the
> setting in between will result in random and incorrect results and
> database corruption.

To enlarge a little more: the most common way to get burnt by this is
to have different LC_xxx environment variables when starting the
postmaster from a boot script as you do when running initdb or starting
the postmaster from an interactive shell.  Best to explicitly set the
desired locale in the script that fires up the postmaster.

Postgres 7.1 solves this problem by locking down a database's locale at
initdb time.  Subsequent postmaster runs will adopt the LC_COLLATE value
that was prevalent when initdb ran, no matter what their environment is.

            regards, tom lane

Re[2]: cyrillic and sort order (ORDER BY)

From
Mihail Marinov
Date:
Thanx for the help. Actually it has happened the way you have
described. ORDER BY works ok now. It all would have been great if I
didn't find out that LOWER and UPPER don't do what they are supposed
to do on cyrillic characters. But that is LC_COLLATE issue I am
dealing with now.



Re: cyrillic and sort order (ORDER BY)

From
Radoslaw Stachowiak
Date:
*** Tom Lane <tgl@sss.pgh.pa.us> [Tuesday, 02.January.2001, 18:38 -0500]:
> To enlarge a little more: the most common way to get burnt by this is
> to have different LC_xxx environment variables when starting the
> postmaster from a boot script as you do when running initdb or starting
> the postmaster from an interactive shell.  Best to explicitly set the
> desired locale in the script that fires up the postmaster.
>
> Postgres 7.1 solves this problem by locking down a database's locale at
> initdb time.  Subsequent postmaster runs will adopt the LC_COLLATE value
> that was prevalent when initdb ran, no matter what their environment is.
>

Could You write what is suggested path to change encoding from SQLASCII to
another (ISO88592) when I'd upgrade to 7.1 from 7.03 ?

I know have SQLASCII, but I'll need to change it to sth more suitable.

--
radoslaw.stachowiak.........................................http://alter.pl/

Re: cyrillic and sort order (ORDER BY)

From
Tom Lane
Date:
Radoslaw Stachowiak <radek@alter.pl> writes:
> Could You write what is suggested path to change encoding from SQLASCII to
> another (ISO88592) when I'd upgrade to 7.1 from 7.03 ?
> I know have SQLASCII, but I'll need to change it to sth more suitable.

Should be sufficient to make sure you have the LOCALE environment
variables set correctly when you run initdb for 7.1.

            regards, tom lane

Re[2]: cyrillic and sort order (ORDER BY)

From
Mihail Marinov
Date:
RS> *** Tom Lane <tgl@sss.pgh.pa.us> [Tuesday, 02.January.2001, 18:38 -0500]:
>> To enlarge a little more: the most common way to get burnt by this is
>> to have different LC_xxx environment variables when starting the
>> postmaster from a boot script as you do when running initdb or starting
>> the postmaster from an interactive shell.  Best to explicitly set the
>> desired locale in the script that fires up the postmaster.
>>
>> Postgres 7.1 solves this problem by locking down a database's locale at
>> initdb time.  Subsequent postmaster runs will adopt the LC_COLLATE value
>> that was prevalent when initdb ran, no matter what their environment is.
>>

RS> Could You write what is suggested path to change encoding from SQLASCII to
RS> another (ISO88592) when I'd upgrade to 7.1 from 7.03 ?

RS> I know have SQLASCII, but I'll need to change it to sth more suitable.

Wouldn't it be enough to dump the database out, recreate a new one with the
required encoding, than dump it in ?