Thread: Re: Support regular expressions with nondeterministic collations

Re: Support regular expressions with nondeterministic collations

From: Tom Lane

Peter Eisentraut <peter@eisentraut.org> writes:
> This patch allows using regular expression functions and operators with 
> nondeterministic collations.
> ...
> In summary, this patch doesn't change any functionality that currently 
> works.  It just removes one error message and lets regular expressions 
> just run, independent of whether the collation is nondeterministic.

I kind of wonder if we really want to do this.  It adds no
functionality, and it forecloses the possibility of changing
the definition later.  I understand and agree with your conclusion
that it's pretty much impossible to do what the SQL standard suggests
should happen --- but maybe we're both missing something that would
make it feasible.  (Have you asked your committee colleagues if
anyone's actually implemented what they wrote about SIMILAR TO?
If they've written something unimplementable, it seems like there
is work for them to do in any case.)

On the whole I'm content with our status quo here.

If we do push forward with this, I doubt that it's okay to throw
the error for SIMILAR TO from where you have it --- it will leak
the partially-built compiled regex, and that will be a
session-lifespan leak.  The way forward is illustrated by code
just above: it'd have to look more like

    if (!collation-is-allowed)
        return freev(v, REG_ECOLLATION);

where you'd need to invent a new regex error code REG_ECOLLATION
and plug that into the appropriate places.

            regards, tom lane



Re: Support regular expressions with nondeterministic collations

From: Peter Eisentraut

On 22.10.24 16:40, Tom Lane wrote:
> Peter Eisentraut <peter@eisentraut.org> writes:
>> This patch allows using regular expression functions and operators with
>> nondeterministic collations.
>> ...
>> In summary, this patch doesn't change any functionality that currently
>> works.  It just removes one error message and lets regular expressions
>> just run, independent of whether the collation is nondeterministic.
> 
> I kind of wonder if we really want to do this.  It adds no
> functionality, and it forecloses the possibility of changing
> the definition later.  I understand and agree with your conclusion
> that it's pretty much impossible to do what the SQL standard suggests
> should happen --- but maybe we're both missing something that would
> make it feasible.  (Have you asked your committee colleagues if
> anyone's actually implemented what they wrote about SIMILAR TO?
> If they've written something unimplementable, it seems like there
> is work for them to do in any case.)

Good idea; I'll go ask there too.

Btw., one end goal here is to be able to run with a nondeterministic 
collation as the global locale.  So for example you could make the whole 
system insensitive to Unicode normalization forms.  But if that 
effectively globally disables regular expressions, then people will be 
sad, and also most of psql breaks, and so on.  So some positive solution 
here would be useful.




Re: Support regular expressions with nondeterministic collations

From: Tom Lane

Peter Eisentraut <peter@eisentraut.org> writes:
> On 22.10.24 16:40, Tom Lane wrote:
>> Peter Eisentraut <peter@eisentraut.org> writes:
>>> In summary, this patch doesn't change any functionality that currently
>>> works.  It just removes one error message and lets regular expressions
>>> just run, independent of whether the collation is nondeterministic.

>> I kind of wonder if we really want to do this.  It adds no
>> functionality, and it forecloses the possibility of changing
>> the definition later.

> Btw., one end goal here is to be able to run with a nondeterministic 
> collation as the global locale.  So for example you could make the whole 
> system insensitive to Unicode normalization forms.  But if that 
> effectively globally disables regular expressions, then people will be 
> sad, and also most of psql breaks, and so on.  So some positive solution 
> here would be useful.

Sure, and I'll support this patch once we're sure that no better
functionality is possible.  I just want to look into whether the
SQL committee knows something we don't.

            regards, tom lane