Thread: resend: Chinese sort order problem

resend: Chinese sort order problem

From
Ren Weili
Date:
Hi all,
    I am ashamed to disturb people here again, but I have not received an
effective answer to my question.
    It has been troubling me for 3 weeks.

    The problem is: how does PostgreSQL treat a Chinese text field in a
sort (ORDER BY)?
    It always seems to give the wrong sort order, but MySQL does this very well.

    On my machine "select getdatabaseencoding();" returns "EUC_CN"
    and "\encoding" returns "EUC_CN" too.
    I have tested "createdb -E" with EUC_CN, SQL_ASCII and LATIN1;
    the options "EUC_CN" and "SQL_ASCII" give the same result.

    I have also tried recompiling the source with "configure
--enable-locale --enable-multibyte=EUC_CN", but it does not help.

    So I want to know where the source code for SQL "ORDER BY" is. If
nobody else can help me, my only choice is to
study the source and solve it myself.


Any suggestion would be welcome.
Thanks.
Malix

Re: resend: Chinese sort order problem

From
Jean-Michel POURE
Date:
Hello Malix,

There are some limitations to multi-byte support. I had the same problems;
see the other mail I forwarded.
I am new to the list, but I suggest you send this message to
pgsql-hackers@postgresql.org.

Regards,
Jean-Michel POURE

At 15:45 29/10/01 +0800, you wrote:
>Hi all,
>    I am ashamed to disturb people here again, but I have not received an
>effective answer to my question.
>    It has been troubling me for 3 weeks.
>
>    The problem is: how does PostgreSQL treat a Chinese text field in a
>sort (ORDER BY)?
>    It always seems to give the wrong sort order, but MySQL does this very well.
>
>    On my machine "select getdatabaseencoding();" returns "EUC_CN"
>    and "\encoding" returns "EUC_CN" too.
>    I have tested "createdb -E" with EUC_CN, SQL_ASCII and LATIN1;
>    the options "EUC_CN" and "SQL_ASCII" give the same result.
>
>    I have also tried recompiling the source with "configure
>--enable-locale --enable-multibyte=EUC_CN", but it does not help.
>
>    So I want to know where the source code for SQL "ORDER BY" is. If
>nobody else can help me, my only choice is to
>study the source and solve it myself.
>
>
>Any suggestion would be welcome.
>Thanks.
>Malix


Re: resend: Chinese sort order problem

From
Tatsuo Ishii
Date:
> Hi all,
>     I am ashamed to disturb people here again, but I have not received an
> effective answer to my question.
>     It has been troubling me for 3 weeks.
>
>     The problem is: how does PostgreSQL treat a Chinese text field in a
> sort (ORDER BY)?
>     It always seems to give the wrong sort order, but MySQL does this very well.
>
>     On my machine "select getdatabaseencoding();" returns "EUC_CN"
>     and "\encoding" returns "EUC_CN" too.
>     I have tested "createdb -E" with EUC_CN, SQL_ASCII and LATIN1;
>     the options "EUC_CN" and "SQL_ASCII" give the same result.
>
>     I have also tried recompiling the source with "configure
> --enable-locale --enable-multibyte=EUC_CN", but it does not help.

You did not mention what the "false" and the "correct" sort orders are,
so I'm not sure I understand your problem, but I have experienced similar
ones with Japanese on some Linux platforms. In my case the source of
the problem was a broken locale database shipped with Linux. After
rebuilding PostgreSQL WITHOUT --enable-locale, all the problems were
gone.

Hope this helps.
--
Tatsuo Ishii

Re: resend: Chinese sort order problem

From
Ren Weili
Date:
Hello Tatsuo,
Thank you very much, your suggestion has solved my problem.
"ORDER BY" now sorts in the correct Chinese Pinyin order. I am really
very happy.
Would you consider changing the description in the Reference Manual for
Installation?
My problem came from following the information in that chapter: I always
recompiled with "--enable-locale" to get support for my locale.
Maybe I have misunderstood the following:
     --enable-locale :: for locale-aware sort order
         and --enable-multibyte :: for storing multibyte characters?


So, in conclusion, people who want Chinese support in Postgres must
recompile the source
with --enable-multibyte=EUC_CN, but without --enable-locale.

Thanks
Malix


Re: resend: Chinese sort order problem

From
Tatsuo Ishii
Date:
> Hello Tatsuo,
> Thank you very much, your suggestion has solved my problem.
> "ORDER BY" now sorts in the correct Chinese Pinyin order. I am really
> very happy.
> Would you consider changing the description in the Reference Manual for
> Installation?
> My problem came from following the information in that chapter: I always
> recompiled with "--enable-locale" to get support for my locale.
> Maybe I have misunderstood the following:
>      --enable-locale :: for locale-aware sort order
>          and --enable-multibyte :: for storing multibyte characters?

I would say that sorting multibyte character sets using the locale will
never work, or at best will return the same result as sorting without it.

> So, in conclusion, people who want Chinese support in Postgres must
> recompile the source
> with --enable-multibyte=EUC_CN, but without --enable-locale.

That might be a good idea. The locale support is useless anyway for
multibyte character sets such as Chinese and Japanese. I'm not sure
about traditional Chinese, Korean and Unicode, though.

Another workaround would be to always use the "C" locale. But the question
is: if you always use the "C" locale, why do you need the locale support? :-)
--
Tatsuo Ishii
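
The "C" locale remark above can be illustrated outside PostgreSQL. What
follows is only a minimal sketch, assuming that sorting without locale
support amounts to comparing the stored EUC_CN bytes. It relies on the fact
that GB2312 arranges its level-1 characters by Pinyin, so a plain byte
comparison of EUC_CN text already yields Pinyin order for common characters.
The sample surnames and the helper names are illustrative and do not come
from this thread; Python's "gb2312" codec stands in for EUC_CN.

```python
# Illustration only: the sample data and helper names are hypothetical.
SURNAMES = ["张", "李", "王", "陈"]  # zhang, li, wang, chen


def sort_by_euc_cn_bytes(values):
    """Sort strings by their EUC_CN (GB2312) byte representation.

    This mimics a byte-wise comparison of text stored in an EUC_CN
    database, which is assumed to be what happens without locale support.
    """
    return sorted(values, key=lambda s: s.encode("gb2312"))


def sort_by_code_point(values):
    """Sort strings by Unicode code point, for contrast."""
    return sorted(values)


if __name__ == "__main__":
    # Byte order of EUC_CN follows Pinyin: chen, li, wang, zhang.
    print(sort_by_euc_cn_bytes(SURNAMES))
    # Code-point order is unrelated to Pinyin: zhang, li, wang, chen.
    print(sort_by_code_point(SURNAMES))
```

This only covers characters in the Pinyin-ordered part of GB2312; rarer
characters are arranged differently there, so the byte order is not a
complete Pinyin collation.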

Re: resend: Chinese sort order problem

From
Jean-Michel POURE
Date:
At 12:39 30/10/01 +0800, you wrote:
>     --enable-locale :: for locale-aware sort order
>         and --enable-multibyte :: for storing multibyte characters?

Does --enable-multibyte set --enable-locale automatically in 7.2?
Am I missing something?

Regards,
Jean-Michel



Re: resend: Chinese sort order problem

From
Tatsuo Ishii
Date:
> At 12:39 30/10/01 +0800, you wrote:
> >     --enable-locale :: for locale-aware sort order
> >         and --enable-multibyte :: for storing multibyte characters?
>
> Does --enable-multibyte set --enable-locale automatically in 7.2?

No. --enable-multibyte just implies --enable-unicode-conversion.
--
Tatsuo Ishii