On 05.06.25 21:56, Jeff Davis wrote:
> On Thu, 2025-06-05 at 10:12 +0200, Peter Eisentraut wrote:
>> The reason we don't do it at parse time is that we don't have the
>> information which functions care about collations, which is exactly
>> what
>> you are proposing here to add.
>
> Currently, we have:
>
> create table c(x text collate "C", y text collate "en_US");
> insert into c values ('x', 'y');
> select x < y from c; -- fails (runtime check)
> select x || y from c; -- succeeds
>
> Surely, "<" would be marked as ordering-sensitive, and we could move
> the error to parse-time.
>
> But what about UDFs? If we assume that all UDFs are ordering-sensitive
> unless marked otherwise, then a user-defined version of "||" that
> previously worked would now start failing, until they add the ordering-
> insensitive mark.

I think no matter how we slice it, there is going to be some case that
will be degraded until some update is applied. I would be content to
accept this particular variant, because it doesn't seem very realistic.
Why would a user define their own concatenation function? There already
is one. Unless your concatenation function does something special, in
which case you should probably think about this collations topic. More
generally, I think there are only so many operations on character
strings that you can do without considering the collation/ctype/etc.
These are essentially all the operations that you can do without
looking at the characters, like length(), ||, repeat(). Everything
beyond that looks at the characters and needs to take
collation/ctype/etc. into account.

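
As a rough sketch of that split, using the table c from the example
quoted above, and assuming the current behavior where a collation
conflict is only reported at run time by functions that actually need a
collation:

```sql
-- Assumes the table c(x text collate "C", y text collate "en_US")
-- from the quoted example.

-- These do not look at the characters, so the conflicting implicit
-- collations of x and y should not matter.
select length(x), repeat(x, 2), x || y from c;

-- upper() depends on the ctype, so the indeterminate collation of
-- (x || y) is expected to be rejected here.
select upper(x || y) from c;

-- An explicit COLLATE clause resolves the conflict.
select upper((x || y) collate "en_US") from c;
```
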
> We'd need some kind of migration path where we could retain the runtime
> checks and disable the parse time checks until people have a chance to
> add the right marks to their UDFs. Migration paths like that are not
> great because they take several releases to work out, and we're never
> quite sure when to finally remove the deprecated behavior.

Perhaps pg_dump can apply some properties during upgrades?

> If we make the opposite assumption, that none are ordering-sensitive
> unless we mark them so, that would allow properly-marked functions to
> fail at parse time, and the rest to fail at runtime. But this
> assumption doesn't work as well for recording dependencies, because
> we'd miss the dependencies for UDFs that aren't properly marked.

That feels like the worst of both worlds.