Peter Eisentraut wrote:
> > If the Postgres default was bytewise sorting+locale-agnostic
> > ctype functions directly derived from Unicode data files,
> > as opposed to libc/$LANG at initdb time, the main
> > annoyance would be that "ORDER BY textcol" would no
> > longer be the human-favored sort.
>
> I think that would be a terrible direction to take, because it would
> regress the default sort order from "correct" to "useless". Aside from
> the overall message this sends about how PostgreSQL cares about
> locales and Unicode and such.

Well, offering a viable way to avoid, as much as possible, the dreaded

  "WARNING: collation "xyz" has version mismatch
  ... HINT: Rebuild all objects affected by this collation..."

doesn't sound like a bad message to send.

Currently, to keep the indexes that don't need a linguistic order in
codepoint order, you're supposed to use collate "C", which then means
that upper(), lower(), etc. don't work beyond ASCII.
Here our Unicode support is not good enough, and the proposal
addresses that.
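
As a rough illustration of that trade-off outside of Postgres, here is
a small Python sketch. It is not Postgres code: ascii_upper() is a
hypothetical helper that only approximates what upper() does under
collate "C", and Python's str.upper() stands in for case mapping
derived from the Unicode data files.

```python
def ascii_upper(s: str) -> str:
    """Uppercase only a-z and leave every other character untouched.

    Assumption: this approximates upper() under collate "C".
    """
    return "".join(chr(ord(c) - 32) if "a" <= c <= "z" else c for c in s)


ete = "\u00e9t\u00e9"  # the French word for "summer", with precomposed e-acute
words = ["zebra", "Zoo", ete, "eta"]

# Python compares strings by codepoint, which matches UTF-8 bytewise order.
# The result does not depend on any libc or ICU version, but it is not a
# linguistic order: uppercase sorts first and accented letters sort last.
codepoint_sorted = sorted(words)

# ASCII-only case mapping leaves the accented letters in lowercase.
ascii_result = ascii_upper(ete)

# Case mapping taken from Unicode data handles them, with no locale involved.
unicode_result = ete.upper()

print(codepoint_sorted)
print(ascii_result, unicode_result)
```

The point of the sketch is that the sort order and the case mapping
are independent choices: codepoint ordering does not have to imply
ASCII-only upper() and lower().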

Best regards,
--
Daniel Vérité
https://postgresql.verite.pro/
Twitter: @DanielVerite