Re: Collations versus user-defined functions - Mailing list pgsql-hackers

From Martijn van Oosterhout
Subject Re: Collations versus user-defined functions
Date
Msg-id 20110313122522.GA16472@svana.org
In response to Re: Collations versus user-defined functions  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Sat, Mar 12, 2011 at 06:06:33PM -0500, Tom Lane wrote:
> I remain unconvinced, because there are too many corner cases.  Should
> collation propagate up out of a subselect?  How about a CTE?  You're
> starting to get into some pretty weird action-at-a-distance situations
> if so, analogous to the function-input-arguments case that you were just
> saying should NOT propagate collation.  And I still don't see anything
> in the text of the spec to justify it.

I said don't propagate the collation *state*; the collation itself
should be propagated.

We propagate type information out of subqueries, we propagate
field names, so why not collation information? Once you consider the
collation a property of the type it becomes pretty obvious. I'll agree
the function-input-arguments case is a bit odd, but the issue there is
not the collation at all, but the collation *state*, which is something
quite different. But that's primarily (I think) because the SQL standard
doesn't have user-defined functions (well, there's PSM, but it doesn't
consider the issue AFAICS).
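
To make the collation/state distinction concrete, here is a rough
Python sketch of how two operands' derivations might combine. It is a
simplified model and not PostgreSQL code: the names Derivation and
combine are hypothetical, the three states (explicit, implicit, none)
follow the SQL standard's terminology, and the default collation is
ignored entirely.

```python
from collections import namedtuple

# Hypothetical model: the collation is a property carried with the type,
# the state records how that collation was arrived at.
Derivation = namedtuple("Derivation", ["collation", "state"])

def combine(a, b):
    """Combine the derivations of two operands (simplified assumption)."""
    # An explicit COLLATE clause overrides anything implicit.
    if a.state == "explicit" or b.state == "explicit":
        if a.state == b.state and a.collation != b.collation:
            raise ValueError("conflicting explicit collations")
        return a if a.state == "explicit" else b
    # Two implicit operands must agree, otherwise nothing can be derived.
    if a.state == "implicit" and b.state == "implicit":
        if a.collation == b.collation:
            return a
    # Disagreement, or an operand that already had state "none".
    return Derivation(None, "none")
```

In this model a column's collation always travels with the value, while
the state only decides which collation wins when operands are combined.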

If you feel that it shouldn't propagate into functions at all, that's a
solution, but I bet we'll get bug reports about it, because it's totally
non-obvious. We still get complaints about not propagating typmod.

> My feeling is that the feature would be simple, explainable, and useful
> if COLLATE only affected the immediately syntactically-containing
> operator.  The rest of this stuff requires a huge amount of mechanism
> whose behavior will be nothing but surprising, even though it's
> inflexible as can be (cf Greg's point about not being able to select
> collation at runtime).  I'm not going to say it's the worst piece of
> language design that's ever come out of the SQL committee, but I'm
> starting to feel like it's in the top ten.

I'm going to have to disagree. I think that the solution they've come
up with using collations and collation state is quite neat and actually
does what people want. I've experimented with it and I haven't found
any situation where the results would be surprising. It is also easy to
implement, compared to the planner changes.

We don't let people change types at runtime, why would collations be
any different? Runtime sorting can be achieved with strxfrm.
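
As a minimal illustration of the strxfrm approach, the Python sketch
below sorts a list under a locale picked at run time. The function name
sort_with_runtime_collation is made up for this example, and which
locale names are usable depends on what is installed on the system;
only "C" can be assumed to exist.

```python
import locale

def sort_with_runtime_collation(words, locale_name):
    """Sort words under a collation chosen at run time via strxfrm."""
    # Remember the current LC_COLLATE setting so it can be restored.
    previous = locale.setlocale(locale.LC_COLLATE)
    try:
        # Raises locale.Error if locale_name is not available.
        locale.setlocale(locale.LC_COLLATE, locale_name)
        # strxfrm maps each string to a key that compares in locale order.
        return sorted(words, key=locale.strxfrm)
    finally:
        locale.setlocale(locale.LC_COLLATE, previous)
```

A call such as sort_with_runtime_collation(names, "de_DE.UTF-8") would
then order names by German rules, provided that locale is installed.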

In any case, you don't need the propagation for COLLATE expressions,
because they will be rare. You primarily need it for implicit collation
propagation. ISTM that doing collation state propagation for everything
except explicit COLLATE expressions is about the most surprising
solution of all.

What you're suggesting is going to lead to situations where the user
sets a non-default collation on every field in every table in the
database and depending on the query they will sometimes get the default
collation anyway.

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patriotism is when love of your own people comes first; nationalism,
> when hate for people other than your own comes first.
>                                       - Charles de Gaulle
