Re: Question on Bizarre Sorting (ORDER BY in 7.1) (fwd) - Mailing list pgsql-general
From | Barry Lind |
---|---|
Subject | Re: Question on Bizarre Sorting (ORDER BY in 7.1) (fwd) |
Date | |
Msg-id | 3AE707AB.7090107@xythos.com Whole thread Raw |
In response to | Re: Question on Bizarre Sorting (ORDER BY in 7.1) (fwd) (<pgsql-general@commandprompt.com>) |
List | pgsql-general |
The behavior seen here seems correct although not intuitive. My guess is that your database executables where created with locale support enabled and your database was inited with a US locale. The sorting logic for the english locales only sort on alpha-numeric characters. All other characters are ignored for sorting purposes. So these values: 000-0987877374-00313 ___-0987877410-00316 ___-0987877430-00317 100-0987877381-00314 100-0987877395-00315 200-0987877461-00318 get sorted as: 000098787737400313 098787741000316 098787743000317 100098787738100314 100098787739500315 200098787746100318 thus you see it is sorted correctly. If you look at all of the other examples you will see this explains all of the results you are seeing. What I think you really want here is a binary sort order, not a locale specific sort order. So you should initdb your database with a locale of C, or rebuild postgresql without locale support enabled. I think there is a contrib program that you can use to see what locale your database was created with. Once your database is created you can't change the locale, you will need to re-initdb. thanks, --Barry Justin Clift wrote: > Dunno if this is helpful, but there is a pattern. > > After stripping off the leading non-numeric characters, from the results > in the previous email, this leaves : > > lxp=# SELECT threadid FROM test ORDER BY threadid; > threadid > ---------------------- > 000-0987877374-00313 > 0987877410-00316 > 0987877430-00317 > 100-0987877381-00314 > 100-0987877395-00315 > 200-0987877461-00318 > (6 rows) > > lxp=# SELECT substr(threadid,1,5) FROM test ORDER BY substr(threadid, 1, > 5); > substr > -------- > 0 > 0 > 000-0 > 100-0 > 100-0 > 200-0 > (6 rows) > > lxp=# SELECT substr(threadid,1,6) FROM test ORDER BY substr(threadid, > 1,6); > substr > --------- > 000-09 > 09 > 09 > 100-09 > 100-09 > 200-09 > (6 rows) > > As for why? Don't know. > > But the sorting is consistent. > > Regards and best wishes, > > Justin Clift > > > pgsql-general@commandprompt.com wrote: > >> On Tue, 24 Apr 2001, Clayton Vernon wrote: >> >>> seems to be treating 'threadid' as numeric, not alpha. >>> did you run any macros that needed to guess the format of the field? many of >>> these are very poor in terms of looking down the table. sometimes the first >>> row can inadvertently define things for you. >> >> Nope, to my knowledge nothing is going on except for normal INSERTs >> and SELECTs. What confuses me is that 'threadid' is of type text, and even >> if it were being treated as if it were of type numeric, its results are >> still inconsistent, and somewhat inscrutable. >> >> I can't see any reasonable way that this sorting: >> >>>> lxp=# SELECT threadid FROM test ORDER BY threadid; >>>> threadid >>>> ---------------------- >>>> 000-0987877374-00313 >>>> ___-0987877410-00316 >>>> ___-0987877430-00317 >>>> 100-0987877381-00314 >>>> 100-0987877395-00315 >>>> 200-0987877461-00318 >>>> (6 rows) >>> >> ...would be occuring, unless it's going out of its way to try to strip >> non-numeric characters and treat the remaining assembled numbers as a >> whole number to sort by, possibly? >> >> But if that were the case, that still doesn't explain why the substrings >> (below) behave the way that they do, in re-ordering the underscores. >> >>> lxp=# SELECT substr(threadid,1,5) FROM test ORDER BY substr(threadid, 1, 5); >>> substr >>> -------- >>> ___-0 >>> ___-0 >>> 000-0 >>> 100-0 >>> 100-0 >>> 200-0 >>> (6 rows) >>> >>>> Now, the underscores appear to PRECEDE the 0's. This seems at least a >>> >>> little more sane, however this is completely the opposite of where the >>> underscore would be sorted with 7.0.3. Now consider the next substring, of >>> six characters instead of five. >>> >>> lxp=# SELECT substr(threadid,1,6) FROM test ORDER BY substr(threadid, 1,6); >>> substr >>> --------- >>> 000-09 >>> ___-09 >>> ___-09 >>> 100-09 >>> 100-09 >>> 200-09 >>> (6 rows) >> >> Thanks, though still puzzled, >> Jw @ Command Prompt. >> -- >> By way of pgsql-general@commandprompt.com >> >> ---------------------------(end of broadcast)--------------------------- >> TIP 2: you can get off all lists at once with the unregister command >> (send "unregister YourEmailAddressHere" to majordomo@postgresql.org) >
pgsql-general by date: