Dear Tom,
Thank you for your explanation. It's very helpful, although I was
extremely surprised! I agree, it's not a postgresql bug.
Can I suggest it might be worth a mention on the "Order By" part of the
documentation.
i.e. this page:
http://www.postgresql.org/docs/7.3/static/sql-select.html#SQL-ORDERBY
could possibly use a little more emphasis of this last paragraph:
Data of character types is sorted according to the locale-specific
collation order that was established when the database cluster was
initialized.
or perhaps a link to here:
http://www.postgresql.org/docs/7.3/static/charset.html#AEN21582
I did realise that the sort would be locale dependent, but failed to
realise it wasn't byte-at-a-time.
Best wishes
Richard
Tom Lane wrote:
> Richard Neill <rn214@hermes.cam.ac.uk> writes:
>
>>This ordering is perverse!
>
>
> No kidding.
>
>
>>No matter what the priority is of the
>>different characters, I cannot understand how the above can arise.
>
>
> You are assuming that it's a byte-at-a-time process. It's not. I
> believe the first pass considers only letters and digits.
>
> You can easily prove to yourself that it's not just Postgres. Here's
> an example on my Linux laptop:
>
> [tgl@g3 tgl]$ cat zzz
> Cymbal #1
> Cymbal - 18 inch
> Cymbal #2
> [tgl@g3 tgl]$ LC_ALL=C sort zzz
> Cymbal #1
> Cymbal #2
> Cymbal - 18 inch
> [tgl@g3 tgl]$ LC_ALL=en_GB sort zzz
> Cymbal #1
> Cymbal - 18 inch
> Cymbal #2
> [tgl@g3 tgl]$
>
> regards, tom lane
>
--
rn214@hermes.cam.ac.uk ** http://www.richardneill.org
Richard Neill, Trinity College, Cambridge, CB21TQ, U.K.