Re: Collations versus user-defined functions - Mailing list pgsql-hackers

From Martijn van Oosterhout
Subject Re: Collations versus user-defined functions
Date
Msg-id 20110312214058.GD4380@svana.org
Whole thread Raw
In response to Re: Collations versus user-defined functions  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Collations versus user-defined functions
Re: Collations versus user-defined functions
List pgsql-hackers
On Sat, Mar 12, 2011 at 02:46:19PM -0500, Tom Lane wrote:
> > Similarly, inside the function the parameters should be considered to
> > be IMPLICIT collation, to avoid strange errors depending on how its
> > called.
>
> Not convinced by this.  If we say that that's how it works, then no
> user-defined function should react to COLLATE in its arguments at all,
> which seems pretty weird and restrictive --- especially if the COLLATE
> property is expected to propagate up through the function call so
> far as the calling expression is concerned.  It seems just bizarre to
> me to say that a function's internal operations don't react to an
> input collation spec but then its result is thought to still be affected
> by that.

I think I didn't explain myself well. The *state* should be implicit,
the actual collation should be whatever the query says. What I was
thinking of is the following:

CREATE FUNCTION my_english_lt(text, text) RETURNS boolean AS $$  return $1 < $2 COLLATE "en_US"
$$;

(not sure about the syntax but you get the idea).

If you just propegate naively you would get:

my_english_lt(x COLLATE "de_DE", y)   -> error, conflicting collation
my_english_ly(x, y COLLATE "de_DE")   -> would work fine

Hence my suggestion that on input to the function the parameters would
be considered collation "de_DE" state IMPLICIT, so the collation in the
function overrides, but if the COLLATE in the function is removed, the
implicit collation takes hold.

> This would actually seem more sensible if we went with something even
> simpler than the current patch's behavior, namely that COLLATE only
> affects the operator it is an *immediate* input of, and nothing
> propagates upward in expressions ever.  I remain unconvinced that the
> SQL spec is calling for propagation ...

Well, it doesn't say in the general case, but there is under 6.29
<string value function> Syntax rule 4b

4) If <character substring function> CSF is specified, then let DTCVE
be the declared type of the <character value expression> immediately
contained in CSF. The maximum length, character set, and collation of
the declared type DTCSF of CSF are determined as follows:

b) The character set and collation of the <character substring
function> are those of DTCVE.

A similar wording is for the trim function. While obviously it doesn't
cover all user defined functions, it seem obviously that once you do
propegation for a few builtins you may as well do it for all of them.
For the concatination operator is has something similar, though written
in a way only a spec committe could come up with.

Frankly, without propegation the feature seems entirely useless. Almost
all collations are going to be defined by implicit collations attached
to columns. If

ORDER BY x

and

ORDER BY x || 'foo'

Don't use the same collation then that is a first grade violation of
the POLA.

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patriotism is when love of your own people comes first; nationalism,
> when hate for people other than your own comes first.
>                                       - Charles de Gaulle

pgsql-hackers by date:

Previous
From: Greg Stark
Date:
Subject: Re: template0 database comment
Next
From: Andrew Dunstan
Date:
Subject: Re: template0 database comment