Re: FuncExpr.collid/OpExpr.collid unworkably serving double duty - Mailing list pgsql-hackers

From Tom Lane
Subject Re: FuncExpr.collid/OpExpr.collid unworkably serving double duty
Date
Msg-id 25342.1299797491@sss.pgh.pa.us
Whole thread Raw
In response to Re: FuncExpr.collid/OpExpr.collid unworkably serving double duty  (Martijn van Oosterhout <kleptog@svana.org>)
Responses Re: FuncExpr.collid/OpExpr.collid unworkably serving double duty
List pgsql-hackers
Martijn van Oosterhout <kleptog@svana.org> writes:
> On Thu, Mar 10, 2011 at 05:16:52PM -0500, Tom Lane wrote:
>> On the other hand ... one thing that's been bothering me is that
>> select_common_collation assumes that "explicit" collation derivation
>> doesn't bubble up in the tree, ie a COLLATE is only a forcing function
>> for the immediate parent expression node.  It's not at all clear to me
>> that that's a correct reading of the spec.

> I beleive the current interpretation, that is the COLLATE only applies
> to levels above, is the correct interpretation. COLLATE binds tightly,
> so

> A op B COLLATE C  parses as  A op (B COLLATE C)

> which is why it works.

No, that's not what I'm on about.  Consider
(((A COLLATE X) || B) || (C COLLATE Y)) < (D COLLATE Z)

(I've spelled out the parenthesization in full for clarity, but most
of these parens could be omitted.)  Is this expression legal, or
should the "<" operator be throwing an error for conflicting
explicitly-derived collations?  Our code as it stands will take it,
because no individual operator sees more than one COLLATE among its
arguments.  But I'm not sure this is right.  The only text I can find
in SQL2008 that seems to bear on the point is in 4.2.2:
Anything that has a declared type can, if that type is acharacter string type, be associated with a collation
applicabletoits character set; this is known as a declared typecollation. Every declared type that is a character
stringtypehas a collation derivation, this being either none, implicit, orexplicit. The collation derivation of a
declaredtype with adeclared type collation that is explicitly or implicitlyspecified by a <data type> is implicit. If
thecollationderivation of a declared type that has a declared type collationis not implicit, then it is explicit. The
collationderivationof an expression of character string type that has no declaredtype collation is none.
 

As I read this, the collation attached to any Var clause is implicit
(because it came from the Var's data type), and the collation attached
to a CollateClause is presumably explicit, but where does it say what
happens at higher levels in the expression tree?  It's at least arguable
that the result collation of an expression is explicit if its input
collation was explicit.  The fact that the default in case of doubt
apparently is supposed to be "explicit" doesn't give any aid or comfort
to your position either.  If explicitness comes only from the immediate
use of COLLATE, why don't they say that?  This is worded to make one
think that most cases will have explicit derivation, not only COLLATE.

I wonder if anyone can check the behavior of nested collate clauses in
DB2 or some other probably-spec-conforming database.
        regards, tom lane


pgsql-hackers by date:

Previous
From: Martijn van Oosterhout
Date:
Subject: Re: FuncExpr.collid/OpExpr.collid unworkably serving double duty
Next
From: Tom Lane
Date:
Subject: Can't specify default collation?