Tom Lane writes:
> I think that a more general solution would be the ability to select a
> locale (and hence a sort order) per-column, as the SQL spec envisions.
It is a general solution, but not for this problem. The problem was to
make all locales equally suitable for certain optimizations, not to make
locales available in more places. I won't pretend to anyone that this
little change will bring us anywhere closer to a solution for that other
problem.
> Then you'd just select C locale for columns you wanted to do pattern
> matching for.
That's wrong, for a number of reasons:
First of all, I don't agree at all that cases where you want both pattern
matching and collation are rare; in fact, I rarely see a case where you
don't want both. Designing a system on that assumption is not sound,
because all operations should be equally possible in all situations.
Second, we will eventually want pattern matching operations to be locale
aware. Case-insensitive matching needs this, because case mappings depend
on the locale. The character class features of POSIX regexps also need
this. So you cannot make locales and well-performing pattern matching
mutually exclusive.
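
To illustrate the point about case mappings, here is a minimal Python
sketch. It is not how any database implements this: the functions
fold_case and matches_ignoring_case and the locale tags "C" and "tr_TR"
are hypothetical names chosen for the example, and only the Turkish
dotted/dotless i rule is modelled by hand.

```python
# Hypothetical illustration: case folding depends on the locale.
# Only the Turkish dotted/dotless i rule is modelled here; real locale
# data covers far more than this single special case.

def fold_case(text: str, locale_tag: str) -> str:
    """Lower-case `text` using hand-written rules for `locale_tag`."""
    if locale_tag == "tr_TR":
        # Turkish: 'I' lowers to dotless U+0131, and dotted capital
        # U+0130 lowers to plain 'i'.
        text = text.replace("I", "\u0131").replace("\u0130", "i")
    return text.lower()


def matches_ignoring_case(value: str, pattern: str, locale_tag: str) -> bool:
    """Case-insensitive equality under the given locale's folding rules."""
    return fold_case(value, locale_tag) == fold_case(pattern, locale_tag)


if __name__ == "__main__":
    # Same strings, different answers depending on the locale.
    print(matches_ignoring_case("TITLE", "title", "C"))
    print(matches_ignoring_case("TITLE", "title", "tr_TR"))
```

The same pair of strings matches under one set of rules and not under
the other, so a matcher that ignores the locale gives wrong answers for
some data.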
Third, keep in mind that datums with different locales cannot be combined
liberally. So systems built the way you propose become crippled in ways
that will be hard to understand and justify.
Finally, the locale of a datum should be a property that describes the
language of the stored data and that can be used for that specific purpose
without having to weigh tradeoffs against the internal workings of the
optimization engine.
--
Peter Eisentraut peter_e@gmx.net