Re: order by different on mac vs linux - Mailing list pgsql-sql

From Wes James
Subject Re: order by different on mac vs linux
Date
Msg-id CAFjCMHuZq3ZxdpDKPd25HtgGwk9Q3uSaqTUjHaLKykhGQZEPcA@mail.gmail.com
Whole thread Raw
In response to Re: order by different on mac vs linux  (Samuel Gendler <sgendler@ideasculptor.com>)
Responses Re: order by different on mac vs linux
List pgsql-sql


On Wed, May 16, 2012 at 5:00 PM, Samuel Gendler <sgendler@ideasculptor.com> wrote:


On Wed, May 16, 2012 at 3:46 PM, Wes James <comptekki@gmail.com> wrote:


On Mon, May 14, 2012 at 5:00 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Wes James <comptekki@gmail.com> writes:
> Why is there a different order on the different platforms.

This is not exactly unusual.  You should first check to see if
lc_collate is set differently in the two installations --- but even if
it's the same, there are often platform-specific interpretations of
the sorting rules.  (Not to mention that OS X is flat out broken when
it comes to sorting UTF8 data ...)


I just ran these:

linux:

on linux

# SELECT CASE WHEN 'apache' > '!yada' THEN 'TRUE' ELSE 'FALSE' END FROM pg_user;
 case
-------
 FALSE
(1 row)

# show lc_collate;

 lc_collate
-------------
 en_US.UTF-8
(1 row)

------------------------

on mac os x:

# SELECT CASE WHEN 'apache' > '!yada' THEN 'TRUE' ELSE 'FALSE' END FROM pg_user;
 case 
------
 TRUE
(1 row)

# show lc_collate;
 lc_collate  
-------------
 en_US.UTF-8
(1 row) 


-----------------------

Why is the linux postgres saying false with the lc_collage set the way it is?

That's the point - UTF-8 collation is just completely broken under OS X.  There's much previous discussion of the topic on this list and elsewhere.  If you're developing on OS X but running linux and you are mostly using an ascii character set in your test dataset, set your development OS X boxes to use C collation, which will basically do what you expect it do do until you start throwing multibyte characters at it.  If you can't constrain your testing/development dataset in such a manner and collation order really matters during development, then you probably shouldn't develop on OS X.  I spent a fair amount of time investigating how to define a new charset in what proved to ultimately be a futile attempt to redefine UTF-8 on OSX to behave just like ti does on Linux.  I just gave it up after wasting a few too many hours on it. It may be possible to do it, but the return on invested time was non-existent for me so I abandoned my effort.


Why are people saying os x is broken in my case?  Looking at  http://www.utf8-chartable.de/  and  http://www.asciitable.com/  for sort order, ! should come before apache.  On os x it is correct, on ubuntu linux it is not.  In the order by output per my previous emails, it is correct on os x, but no on linux.  Why do people keep saying os x is broken, when it is doing the correct thing from what I've seen?

-wes 

pgsql-sql by date:

Previous
From: Samuel Gendler
Date:
Subject: Re: order by different on mac vs linux
Next
From: Scott Marlowe
Date:
Subject: Re: order by different on mac vs linux