Thread: Very weird problem of "order by" in postgresql

Very weird problem of "order by" in postgresql

From
"Peter Cai"
Date:
Hi all,

I have 2 postgresql running on linux on 2 different physical machines.

Then I create 2 identical database on them, both using utf8 as server encoding and GBK as client encoding.

But when I try to order by some query result with a column containing Chinese characters, the result is different!!!!

One db return result in correct Chinese alphabet order but the other doesn't!

Is the problem of the configure of the operating system or the postgresql???

Thanks a lot!


PS : I read some solutions on this problem.  It seems that many people met the same order by problem and have to convert the strings to "GBK" in there sql to get correct sorted result.
I don't think this is the correct solution.

Re: Very weird problem of "order by" in postgresql

From
"Justin Graf"
Date:
This is the same problem postgresql has when doing sorting when run on  windows vs. linux.  Postgresql relies on the OS to handle collating aka sort orders.
 
to Quote 
PostgreSQL uses the standard ISO C and POSIX locale facilities provided by the server operating system 

 


---- Message from "Peter Cai" <newptcai@gmail.com> at 09-08-2008 11:51:32 AM ------
Hi all,

I have 2 postgresql running on linux on 2 different physical machines.

Then I create 2 identical database on them, both using utf8 as server encoding and GBK as client encoding.

But when I try to order by some query result with a column containing Chinese characters, the result is different!!!!

One db return result in correct Chinese alphabet order but the other doesn't!

Is the problem of the configure of the operating system or the postgresql???

Thanks a lot!


PS : I read some solutions on this problem.  It seems that many people met the same order by problem and have to convert the strings to "GBK" in there sql to get correct sorted result.
I don't think this is the correct solution.

 

Re: Very weird problem of "order by" in postgresql

From
"Peter Cai"
Date:
Still the same problem……

What I did is:

1. set locale to "zh_cn.UTF8"
2. reboot
3. use "locale" command to see if locale is set.
4. delete everything under /lib/var/pgsql/data to make sure the database will be completely re-initialed
5. restart postgresql

I check pgstartup.log and I am sure the db is initialed with "zh_cn.UTF8".
But nothing changed!  The sorted result is still incorrect.

The locale is the same of both OS.
The 2 db is both initialized with "zh_cn"
The server encodes are all "utf8"

The only difference is that one OS is centos and the other is ubuntu
I've almost got crazy……
 

On Mon, Sep 8, 2008 at 12:39 PM, Justin Graf <justin@emproshunts.com> wrote:
This is the same problem postgresql has when doing sorting when run on  windows vs. linux.  Postgresql relies on the OS to handle collating aka sort orders.
 
to Quote 
PostgreSQL uses the standard ISO C and POSIX locale facilities provided by the server operating system 

 


---- Message from "Peter Cai" <newptcai@gmail.com> at 09-08-2008 11:51:32 AM ------
Hi all,

I have 2 postgresql running on linux on 2 different physical machines.

Then I create 2 identical database on them, both using utf8 as server encoding and GBK as client encoding.

But when I try to order by some query result with a column containing Chinese characters, the result is different!!!!

One db return result in correct Chinese alphabet order but the other doesn't!

Is the problem of the configure of the operating system or the postgresql???

Thanks a lot!


PS : I read some solutions on this problem.  It seems that many people met the same order by problem and have to convert the strings to "GBK" in there sql to get correct sorted result.
I don't think this is the correct solution.

 



--
科幻小说可能在哲学上是天真的,在道德上是简单的,在美学上是有些主观的,或粗糙的,但是就它最好的方面而言,它似乎触及了人类集体梦想的神经中枢,解放出我们人类这具机器中深藏的某些幻想。

Re: Very weird problem of "order by" in postgresql

From
"Peter Cai"
Date:
Here is my locale settings:

LANG=zh_CN.UTF-8
LC_CTYPE="zh_CN.UTF-8"
LC_NUMERIC="zh_CN.UTF-8"
LC_TIME="zh_CN.UTF-8"
LC_COLLATE="zh_CN.UTF-8"
LC_MONETARY="zh_CN.UTF-8"
LC_MESSAGES="zh_CN.UTF-8"
LC_PAPER="zh_CN.UTF-8"
LC_NAME="zh_CN.UTF-8"
LC_ADDRESS="zh_CN.UTF-8"
LC_TELEPHONE="zh_CN.UTF-8"
LC_MEASUREMENT="zh_CN.UTF-8"
LC_IDENTIFICATION="zh_CN.UTF-8"
LC_ALL=

2008/9/8 Peter Cai <newptcai@gmail.com>
Still the same problem……

What I did is:

1. set locale to "zh_cn.UTF8"
2. reboot
3. use "locale" command to see if locale is set.
4. delete everything under /lib/var/pgsql/data to make sure the database will be completely re-initialed
5. restart postgresql

I check pgstartup.log and I am sure the db is initialed with "zh_cn.UTF8".
But nothing changed!  The sorted result is still incorrect.

The locale is the same of both OS.
The 2 db is both initialized with "zh_cn"
The server encodes are all "utf8"

The only difference is that one OS is centos and the other is ubuntu
I've almost got crazy……
 

On Mon, Sep 8, 2008 at 12:39 PM, Justin Graf <justin@emproshunts.com> wrote:
This is the same problem postgresql has when doing sorting when run on  windows vs. linux.  Postgresql relies on the OS to handle collating aka sort orders.
 
to Quote 
PostgreSQL uses the standard ISO C and POSIX locale facilities provided by the server operating system 

 


---- Message from "Peter Cai" <newptcai@gmail.com> at 09-08-2008 11:51:32 AM ------
Hi all,

I have 2 postgresql running on linux on 2 different physical machines.

Then I create 2 identical database on them, both using utf8 as server encoding and GBK as client encoding.

But when I try to order by some query result with a column containing Chinese characters, the result is different!!!!

One db return result in correct Chinese alphabet order but the other doesn't!

Is the problem of the configure of the operating system or the postgresql???

Thanks a lot!


PS : I read some solutions on this problem.  It seems that many people met the same order by problem and have to convert the strings to "GBK" in there sql to get correct sorted result.
I don't think this is the correct solution.

 



--
科幻小说可能在哲学上是天真的,在道德上是简单的,在美学上是有些主观的,或粗糙的,但是就它最好的方面而言,它似乎触及了人类集体梦想的神经中枢,解放出我们人类这具机器中深藏的某些幻想。



--
科幻小说可能在哲学上是天真的,在道德上是简单的,在美学上是有些主观的,或粗糙的,但是就它最好的方面而言,它似乎触及了人类集体梦想的神经中枢,解放出我们人类这具机器中深藏的某些幻想。

Re: Very weird problem of "order by" in postgresql

From
Tom Lane
Date:
"Peter Cai" <newptcai@gmail.com> writes:
> The only difference is that one OS is centos and the other is ubuntu

It's hardly impossible that those two contain different locale
definitions.  Have you tried comparing the results of sort(1)
under the same locale settings?

            regards, tom lane

Re: Very weird problem of "order by" in postgresql

From
"Peter Cai"
Date:
Oh, you are right!  Although the 'locale' command show that the 2 has the same locale settings.
The sort result of the same file is different.
Ubuntu is right, centos is wrong……

Maybe I should as this question on some Linux mail lists or forums.

Thanks a lot!

On Mon, Sep 8, 2008 at 8:34 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
"Peter Cai" <newptcai@gmail.com> writes:
> The only difference is that one OS is centos and the other is ubuntu

It's hardly impossible that those two contain different locale
definitions.  Have you tried comparing the results of sort(1)
under the same locale settings?

                       regards, tom lane



--
科幻小说可能在哲学上是天真的,在道德上是简单的,在美学上是有些主观的,或粗糙的,但是就它最好的方面而言,它似乎触及了人类集体梦想的神经中枢,解放出我们人类这具机器中深藏的某些幻想。

Re: Very weird problem of "order by" in postgresql

From
Martijn van Oosterhout
Date:
On Tue, Sep 09, 2008 at 09:04:02AM +0800, Peter Cai wrote:
> Oh, you are right!  Although the 'locale' command show that the 2 has the
> same locale settings.
> The sort result of the same file is different.
> Ubuntu is right, centos is wrong??????

Note that glibc uses the information in /usr/share/i18n/locales to
build the relevent tables. Maybe comparing those directories might
help.

There is also a patch to use ICU for collation instead of the OS. That
way you're not tied to any changes your OS might do.

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Please line up in a tree and maintain the heap invariant while
> boarding. Thank you for flying nlogn airlines.

Attachment