Thread: BUG #6056: sorting issues

BUG #6056: sorting issues

From
"Mike Hepworth"
Date:
The following bug has been logged online:

Bug reference:      6056
Logged by:          Mike Hepworth
Email address:      michael.w.hepworth@gmail.com
PostgreSQL version: 8.4
Operating system:   Ubuntu 10.4
Description:        sorting issues
Details:

Have database that is utf-8 encoding so that I can support both english,
french, and spanish.  Having problems with the ordering of data.

I execute the following sql statment:

select * from (values('HALL, ANDY'),('HALLBERG, FRANK'),('HALLDEN,
DOUGLAS'),('HALL, DOUGLAS') ) x ORDER BY 1;

and get the following results

HALL, ANDY
HALLBERG, FRANK
HALLDEN, DOUGLAS
HALL, DOUGLAS

Please Help...

Re: BUG #6056: sorting issues

From
Itagaki Takahiro
Date:
On Tue, Jun 14, 2011 at 06:52, Mike Hepworth
<michael.w.hepworth@gmail.com> wrote:
> Bug reference: =C2=A0 =C2=A0 =C2=A06056
> Logged by: =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0Mike Hepworth
> Email address: =C2=A0 =C2=A0 =C2=A0michael.w.hepworth@gmail.com
> PostgreSQL version: 8.4
> Operating system: =C2=A0 Ubuntu 10.4
> Description: =C2=A0 =C2=A0 =C2=A0 =C2=A0sorting issues
> Details:
>
> Have database that is utf-8 encoding so that I can support both english,
> french, and spanish. =C2=A0Having problems with the ordering of data.
>
> I execute the following sql statment:
>
> select * from (values('HALL, ANDY'),('HALLBERG, FRANK'),('HALLDEN,
> DOUGLAS'),('HALL, DOUGLAS') ) x ORDER BY 1;
>
> and get the following results
>
> HALL, ANDY
> HALLBERG, FRANK
> HALLDEN, DOUGLAS
> HALL, DOUGLAS

What locale did you use? Sort order depends on not only
encoding but also collation.  Please check "SHOW lc_collate".
Your locale seems to just ignore symbol characters.
I have an expected result at least in C locale.

--=20
Itagaki Takahiro

Re: BUG #6056: sorting issues

From
hubert depesz lubaczewski
Date:
On Mon, Jun 13, 2011 at 09:52:06PM +0000, Mike Hepworth wrote:
> I execute the following sql statment:
>
> select * from (values('HALL, ANDY'),('HALLBERG, FRANK'),('HALLDEN,
> DOUGLAS'),('HALL, DOUGLAS') ) x ORDER BY 1;
>
> and get the following results
>
> HALL, ANDY
> HALLBERG, FRANK
> HALLDEN, DOUGLAS
> HALL, DOUGLAS
> Please Help...

it would help if you would say what do you think is wrong with it.
because, the data is well sorted - prefix hall, and then, the next
sortable character is:
A
B
D
D
in 3rd and 4th string 2nd character is also taken into consideration
because 1st is the same, so it becomes:

A
B
DE
DO

so the sort order is correct.

if you'd expect to get output:
> HALL, ANDY
> HALL, DOUGLAS
> HALLBERG, FRANK
> HALLDEN, DOUGLAS

then the proper solution is to use separate fields for lastname and
firstname, and not put them both in the same column.

In Pg 9.1 you will be able to sort it the way you want using COLLATE
support.

with current situation, you can try to split every string using ", ",
and sort splitted elements separately.

Best regards,

depesz

--
The best thing about modern society is how easy it is to avoid contact with it.
                                                             http://depesz.com/

Re: BUG #6056: sorting issues

From
Itagaki Takahiro
Date:
On Tue, Jun 14, 2011 at 21:28, Michael Hepworth
<michael.w.hepworth@gmail.com> wrote:
> Thanks for the response.=C2=A0 I have created a database with lc_collate =
=3D
> en_US.utf8.=C2=A0 This doesn't seem to work.=C2=A0 I does however allow m=
e to sort my
> non english names correctly.=C2=A0 If I drop the database and recreate it=
 with
> lc_collate =3D C then my english names are ordered correctly but my
> non-english names are sorted incorrectly.

I'm not sure you can use non english names in english locale.
Instead, should you use another locale that covers all characters you use?

I guess locales in libc on Ubuntu 10.4 is just broken or behaves
undesirable as a feature. When I created english locale on Windows,
it returnes expected results in both cases.

Personally, I don't trust locale functions in libc on Linux platforms...

--=20
Itagaki Takahiro