Giles,
On Thu, 22 Jun 2000 11:12:54 +1000, Giles Lean wrote:
>Yes. Some locales want strings to be ordered first by ignoring any
>accents on characters, then using a tie-break on equal strings by doing
>a comparison that includes the accents.
I guess I don't see how this is really any different. Why order first by
the character and second by the accent? For instance, if you know the
relative order of the various forms of "o", then just give them all
successive numbers and do a single-pass sort. You just have to make sure
that all the numbers in that set are greater than the number you assign
to "m" and less than the number you assign to "p".
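
To make that concrete, here is a rough sketch of the single-pass scheme I
have in mind. The table, the function names, and the choice of accented
forms are all made up for illustration; a real locale would supply its own
ordering.

```python
# Hypothetical collation table: every name and weight here is an
# illustrative assumption, not taken from any real locale definition.
ORDERED_CHARS = (
    "abcdefghijklmn"
    "o\u00f2\u00f3\u00f4\u00f6"  # o and four accented forms, successive weights
    "pqrstuvwxyz"
)

# Each character's weight is simply its position in the ordering above,
# so every form of "o" lands above "m" and below "p".
WEIGHTS = {ch: i for i, ch in enumerate(ORDERED_CHARS)}


def single_pass_key(s):
    """Map each character to its weight; unknown characters sort last."""
    unknown = len(ORDERED_CHARS)
    return [WEIGHTS.get(ch, unknown) for ch in s]


def collate(words):
    """Sort words in one pass using only the per-character weights."""
    return sorted(words, key=single_pass_key)
```

Calling collate() on a list of words then needs only one comparison pass
over the weight lists, with no separate tie-break step.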
>To take another of your points out of order: this is an obstacle that
>Unicode doesn't resolve. Unicode gives you a character set capable of
>representing characters from many different locales, but collation
>order will remain locale specific.
With Unicode you have to have a collation order that cuts across what
used to be separate character sets in separate code pages.
>... but due to the increased memory/disk space, this is likely not an
>optimisation. Measurements needed, I'd suggest.
But why is there increased memory and disk space? Aren't the fields that
go into an index already stored twice? Or does the index contain nothing
but a series of references to records?