Re: Is this a BUG? Is there anyone has the same problem? - Mailing list pgsql-sql

From David Stanaway
Subject Re: Is this a BUG? Is there anyone has the same problem?
Date
Msg-id 1019496962.24241.36.camel@ciderbox
In response to Re: Is this a BUG? Is there anyone has the same problem?  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Is this a BUG? Is there anyone has the same problem?
List pgsql-sql
Hi Tom.

This seems to be a bit of a FAQ at the moment. Do you think it should
be added to the release notes? Maybe the people who build the packages
for the different Linux distributions need some kind of warning in
their installation scripts, one that lets users pick the locale
Postgres uses by default.


Cheers...

--
David Stanaway


On Mon, 2002-04-22 at 11:35, Tom Lane wrote:
> "jack" <datactrl@tpg.com.au> writes:
> >> What locale is your database running in?
>
> > My locale is en_AU (on Redhat 7.2)
>
> Hmph.  It seems to be a peculiarity of the locale sorting rules for
> English.  Using RedHat 7.2, I made a file containing 3 lines, the last
> of which has one trailing blank:
>
> [tgl@rh1 tgl]$ cat test
> AAB
> AA B
> AAB
>
> -- hmm, can't see the spaces very well, so do this:
>
> [tgl@rh1 tgl]$ sed 's/ /_/g' test
> AAB
> AA_B
> AAB_
>
> -- Now sort under Aussie rules:
>
> [tgl@rh1 tgl]$ LANG=en_AU sort test
> AAB
> AA B
> AAB
>
> -- uh, let's try looking to see where the spaces are:
>
> [tgl@rh1 tgl]$ LANG=en_AU sort test | sed 's/ /_/g'
> AAB
> AA_B
> AAB_
>
> -- Not too consistent, eh?  I get the same results with en_US though:
>
> [tgl@rh1 tgl]$ LANG=en_US sort test | sed 's/ /_/g'
> AAB
> AA_B
> AAB_
>
> -- but traditional "C" locale does this:
>
> [tgl@rh1 tgl]$ LANG=C sort test | sed 's/ /_/g'
> AA_B
> AAB
> AAB_
>
>
> The reason that your SQL tests reflect this is that comparisons for type
> CHAR(n) remove any trailing blanks before comparing; but the result of
> substr() is of type TEXT, so it assumes trailing blanks are significant.
> So the data you were sorting were in the one case effectively
>
> 'AA B'
> 'AAB'
> 'BB 123'
> 'BB123'
>
> and in the other case
>
> 'AA B'
> 'AAB '
> 'BB 1'
> 'BB12'
>
> and the locale sort rules treat 'AAB' differently from 'AAB '.
>
> If you think that's a bug, you can take it up with whoever maintains
> Linux's locale rules.  It ain't our bug though.  (You might prefer
> to initdb under C locale if you'd rather sort according to C rules.)
>
>             regards, tom lane
>
>
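
For anyone who wants to reproduce the byte-order half of Tom's test without depending on which locales happen to be installed, here is a minimal sketch. It assumes a POSIX shell with the standard sort and sed; the variable names `sorted` and `stripped` are only illustrative. The second pipeline strips trailing blanks before sorting, which roughly imitates the CHAR(n) comparison described above, while the first leaves them significant, as for TEXT.

```shell
# The three test strings from the quoted message; the last has one trailing blank.
# LC_ALL overrides LANG and the other LC_* variables, so sort uses plain
# byte-order ("C") collation whatever the ambient locale is.

# Trailing blanks kept (TEXT-like): spaces are shown as underscores afterwards.
sorted=$(printf 'AAB\nAA B\nAAB \n' | LC_ALL=C sort | sed 's/ /_/g')
printf '%s\n' "$sorted"

# Trailing blanks stripped before sorting (roughly what CHAR(n) comparison does).
stripped=$(printf 'AAB\nAA B\nAAB \n' | sed 's/ *$//' | LC_ALL=C sort | sed 's/ /_/g')
printf '%s\n' "$stripped"
```

Under C rules the string with the embedded blank sorts first in both cases, so the relative order does not depend on whether the trailing blank is present. Replacing `LC_ALL=C` with `LC_ALL=en_US` or `LC_ALL=en_AU` (if those locales are installed) should show the locale-dependent ordering Tom describes.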

