Re: FuncExpr.collid/OpExpr.collid unworkably serving double duty - Mailing list pgsql-hackers

From Martijn van Oosterhout
Subject Re: FuncExpr.collid/OpExpr.collid unworkably serving double duty
Date
Msg-id 20110311212015.GA27117@svana.org
Whole thread Raw
In response to Re: FuncExpr.collid/OpExpr.collid unworkably serving double duty  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: FuncExpr.collid/OpExpr.collid unworkably serving double duty
List pgsql-hackers
On Thu, Mar 10, 2011 at 05:51:31PM -0500, Tom Lane wrote:
> No, that's not what I'm on about.  Consider
>
>     (((A COLLATE X) || B) || (C COLLATE Y)) < (D COLLATE Z)
>
> (I've spelled out the parenthesization in full for clarity, but most
> of these parens could be omitted.)  Is this expression legal, or
> should the "<" operator be throwing an error for conflicting
> explicitly-derived collations?  Our code as it stands will take it,
> because no individual operator sees more than one COLLATE among its
> arguments.  But I'm not sure this is right.  The only text I can find
> in SQL2008 that seems to bear on the point is in 4.2.2:

The rules are essentially as described here:

http://msdn.microsoft.com/en-us/library/ms179886.aspx

So:

(A COLLATE X)     => collation X
((A COLLATE X) || B)   => collation X
(((A COLLATE X) || B) || (C COLLATE Y))  => error

If we aren't erroring on this then we're doing it wrong. The whole
point of going through the parse tree and assigning a collation to each
node is to catch these things.

> As I read this, the collation attached to any Var clause is implicit
> (because it came from the Var's data type), and the collation attached
> to a CollateClause is presumably explicit, but where does it say what
> happens at higher levels in the expression tree?  It's at least arguable
> that the result collation of an expression is explicit if its input
> collation was explicit.  The fact that the default in case of doubt
> apparently is supposed to be "explicit" doesn't give any aid or comfort
> to your position either.  If explicitness comes only from the immediate
> use of COLLATE, why don't they say that?  This is worded to make one
> think that most cases will have explicit derivation, not only COLLATE.

See 9.3 "Data types of results of aggregations" clause (ii). It
contains essentially the rules outlined by the Transact-SQL page above.
   The collation derivation and declared type collation of the result are   determined as follows.   Case:   1) If some
datatype in DTS has an explicit collation derivation and   declared type collation   EC1, then every data type in DTS
thathas an explicit collation   derivation shall have a declared   type collation that is EC1. The collation derivation
isexplicit and   the collation is EC1.   2) If every data type in DTS has an implicit collation derivation, then
Case:  A) If every data type in DTS has the same declared type collation IC1,   then the collation   derivation is
implicitand the declared type collation is IC1.   B) Otherwise, the collation derivation is none.   3) Otherwise, the
collationderivation is none. 

In my implementation I needed to expand this to the general set of
operators postgresql supported and relaxed this to only consider
arguments to the function/operator that had the same type as the
resulting type of the function/operator, since that's the only thing
that makes sense.

A concatination then requires its arguments to be compatible. A substr
has the collation of its sole string argument.

I hope this helps,

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patriotism is when love of your own people comes first; nationalism,
> when hate for people other than your own comes first.
>                                       - Charles de Gaulle

pgsql-hackers by date:

Previous
From: Andrew Dunstan
Date:
Subject: Re: Re: [COMMITTERS] pgsql: Add missing keywords to gram.y's unreserved_keywords list.
Next
From: Tom Lane
Date:
Subject: Re: FuncExpr.collid/OpExpr.collid unworkably serving double duty