On fre, 2010-05-28 at 20:22 +0300, Heikki Linnakangas wrote:
> It's also fundamentally wrong, collation is not a property of a datum
> but of the operation.
> One way to approach this is to realize that it's already possible to
> use
> multiple collations in a database. You just have to define separate <
> =
> > operators and operator classes for every collation, and change all
> your queries to use the right operator depending on the desired
> collation everywhere where you use < = > (including ORDER BYs, with
> the
> USING <operator> syntax). The behavior is exactly what we want, it's
> just completely inpractical, so we need something to do the same in a
> less cumbersome way.
Well, maybe we should step back a little and work out what sort of
feature we actually want, if any. The feature I'm thinking of is what
people might call "per-column locale", and the SQL standard defines
that. It would look like this:
CREATE TABLE test ( a varchar COLLATE de, b varchar COLLATE fr
);
SELECT * FROM test WHERE a > 'baz' ORDER BY b;
So while it's true that the collation is used by the operations (> and
ORDER BY), the information which collation to use comes with the data
values. It's basically saying, a is in language "de", so sort it like
that unless told otherwise. There is also an override syntax available,
like this:
SELECT * FROM test WHERE a COLLATE en > 'baz' ORDER BY b COLLATE sv;
But here again the collation is attached to a data value, and only from
there it is passed to the operator. What is actually happening is
SELECT * FROM test WHERE (a COLLATE en) > 'baz' ORDER BY (b COLLATE sv);
What you appear to be describing is a "per-operation locale", which also
sounds valid, but it would be a different thing. It might be thought of
as this:
SELECT * FROM test WHERE a (> COLLATE en) 'baz' ORDER BY COLLATE sv b;
with some suitable global default.
So which one of these should it be?