Re: Collations versus user-defined functions - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Collations versus user-defined functions
Date
Msg-id 4220.1299971193@sss.pgh.pa.us
In response to Re: Collations versus user-defined functions  (Martijn van Oosterhout <kleptog@svana.org>)
Responses Re: Collations versus user-defined functions
List pgsql-hackers
Martijn van Oosterhout <kleptog@svana.org> writes:
> On Sat, Mar 12, 2011 at 02:46:19PM -0500, Tom Lane wrote:
>> This would actually seem more sensible if we went with something even
>> simpler than the current patch's behavior, namely that COLLATE only
>> affects the operator it is an *immediate* input of, and nothing
>> propagates upward in expressions ever.  I remain unconvinced that the
>> SQL spec is calling for propagation ...

> Well, it doesn't say in the general case, but there is under 6.29
> <string value function> Syntax rule 4b 

> 4) If <character substring function> CSF is specified, then let DTCVE
> be the declared type of the <character value expression> immediately
> contained in CSF. The maximum length, character set, and collation of
> the declared type DTCSF of CSF are determined as follows:

> b) The character set and collation of the <character substring
> function> are those of DTCVE.

> Similar wording applies to the trim function. While it obviously doesn't
> cover all user-defined functions, it seems obvious that once you do
> propagation for a few builtins you may as well do it for all of them.
> The concatenation operator has something similar, though written
> in a way only a spec committee could come up with.

> Frankly, without propagation the feature seems entirely useless.

I remain unconvinced, because there are too many corner cases.  Should
collation propagate up out of a subselect?  How about a CTE?  You're
starting to get into some pretty weird action-at-a-distance situations
if so, analogous to the function-input-arguments case that you were just
saying should NOT propagate collation.  And I still don't see anything
in the text of the spec to justify it.

My feeling is that the feature would be simple, explainable, and useful
if COLLATE only affected the immediately syntactically-containing
operator.  The rest of this stuff requires a huge amount of mechanism
whose behavior will be nothing but surprising, even though it's
inflexible as can be (cf. Greg's point about not being able to select
collation at runtime).  I'm not going to say it's the worst piece of
language design that's ever come out of the SQL committee, but I'm
starting to feel like it's in the top ten.
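
To make the difference between the two rules concrete, here is a toy
sketch in Python. It is not PostgreSQL parser code: the classes Col,
Collate, and Call and the functions propagated() and immediate_only()
are hypothetical names invented for this illustration, and the model
ignores the spec's distinction between explicit and implicit collation
derivation. It only shows which operator ends up seeing the collation
in an expression such as substring(a COLLATE "C") < b.

```python
from dataclasses import dataclass
from typing import Optional, Tuple, Union


@dataclass(frozen=True)
class Col:
    """A column reference; carries no collation of its own in this toy model."""
    name: str


@dataclass(frozen=True)
class Collate:
    """Models `arg COLLATE collation`."""
    arg: "Expr"
    collation: str


@dataclass(frozen=True)
class Call:
    """A function or operator applied to arguments, e.g. substring(x) or x < y."""
    op: str
    args: Tuple["Expr", ...]


Expr = Union[Col, Collate, Call]


def _single(found: set, op: str) -> Optional[str]:
    """Return the one collation in `found`, None if empty, error on conflict."""
    if len(found) > 1:
        raise ValueError(f"conflicting collations for {op}: {sorted(found)}")
    return next(iter(found), None)


def propagated(e: Expr) -> Optional[str]:
    """Collation seen at `e` if COLLATE propagates upward through expressions."""
    if isinstance(e, Collate):
        return e.collation
    if isinstance(e, Call):
        found = {c for c in map(propagated, e.args) if c is not None}
        return _single(found, e.op)
    return None


def immediate_only(e: Call) -> Optional[str]:
    """Collation seen at `e` if COLLATE affects only the operator directly containing it."""
    found = {a.collation for a in e.args if isinstance(a, Collate)}
    return _single(found, e.op)


# substring(a COLLATE "C") < b
inner = Call("substring", (Collate(Col("a"), "C"),))
comparison = Call("<", (inner, Col("b")))

print(propagated(comparison))      # the "<" sees "C" via substring()
print(immediate_only(comparison))  # the "<" sees no collation
print(immediate_only(inner))       # only substring() sees "C"
```

Under the immediate-only rule, making the comparison use "C" would
require writing the COLLATE clause directly on one of its inputs, e.g.
substring(a) COLLATE "C" < b.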
        regards, tom lane

