Thread: BUG #6056: sorting issues
The following bug has been logged online: Bug reference: 6056 Logged by: Mike Hepworth Email address: michael.w.hepworth@gmail.com PostgreSQL version: 8.4 Operating system: Ubuntu 10.4 Description: sorting issues Details: Have database that is utf-8 encoding so that I can support both english, french, and spanish. Having problems with the ordering of data. I execute the following sql statment: select * from (values('HALL, ANDY'),('HALLBERG, FRANK'),('HALLDEN, DOUGLAS'),('HALL, DOUGLAS') ) x ORDER BY 1; and get the following results HALL, ANDY HALLBERG, FRANK HALLDEN, DOUGLAS HALL, DOUGLAS Please Help...
On Tue, Jun 14, 2011 at 06:52, Mike Hepworth <michael.w.hepworth@gmail.com> wrote: > Bug reference: =C2=A0 =C2=A0 =C2=A06056 > Logged by: =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0Mike Hepworth > Email address: =C2=A0 =C2=A0 =C2=A0michael.w.hepworth@gmail.com > PostgreSQL version: 8.4 > Operating system: =C2=A0 Ubuntu 10.4 > Description: =C2=A0 =C2=A0 =C2=A0 =C2=A0sorting issues > Details: > > Have database that is utf-8 encoding so that I can support both english, > french, and spanish. =C2=A0Having problems with the ordering of data. > > I execute the following sql statment: > > select * from (values('HALL, ANDY'),('HALLBERG, FRANK'),('HALLDEN, > DOUGLAS'),('HALL, DOUGLAS') ) x ORDER BY 1; > > and get the following results > > HALL, ANDY > HALLBERG, FRANK > HALLDEN, DOUGLAS > HALL, DOUGLAS What locale did you use? Sort order depends on not only encoding but also collation. Please check "SHOW lc_collate". Your locale seems to just ignore symbol characters. I have an expected result at least in C locale. --=20 Itagaki Takahiro
On Mon, Jun 13, 2011 at 09:52:06PM +0000, Mike Hepworth wrote: > I execute the following sql statment: > > select * from (values('HALL, ANDY'),('HALLBERG, FRANK'),('HALLDEN, > DOUGLAS'),('HALL, DOUGLAS') ) x ORDER BY 1; > > and get the following results > > HALL, ANDY > HALLBERG, FRANK > HALLDEN, DOUGLAS > HALL, DOUGLAS > Please Help... it would help if you would say what do you think is wrong with it. because, the data is well sorted - prefix hall, and then, the next sortable character is: A B D D in 3rd and 4th string 2nd character is also taken into consideration because 1st is the same, so it becomes: A B DE DO so the sort order is correct. if you'd expect to get output: > HALL, ANDY > HALL, DOUGLAS > HALLBERG, FRANK > HALLDEN, DOUGLAS then the proper solution is to use separate fields for lastname and firstname, and not put them both in the same column. In Pg 9.1 you will be able to sort it the way you want using COLLATE support. with current situation, you can try to split every string using ", ", and sort splitted elements separately. Best regards, depesz -- The best thing about modern society is how easy it is to avoid contact with it. http://depesz.com/
On Tue, Jun 14, 2011 at 21:28, Michael Hepworth <michael.w.hepworth@gmail.com> wrote: > Thanks for the response.=C2=A0 I have created a database with lc_collate = =3D > en_US.utf8.=C2=A0 This doesn't seem to work.=C2=A0 I does however allow m= e to sort my > non english names correctly.=C2=A0 If I drop the database and recreate it= with > lc_collate =3D C then my english names are ordered correctly but my > non-english names are sorted incorrectly. I'm not sure you can use non english names in english locale. Instead, should you use another locale that covers all characters you use? I guess locales in libc on Ubuntu 10.4 is just broken or behaves undesirable as a feature. When I created english locale on Windows, it returnes expected results in both cases. Personally, I don't trust locale functions in libc on Linux platforms... --=20 Itagaki Takahiro