Thread: BUG #16333: position() function not equivalent to strpos() function when comparing citext

BUG #16333: position() function not equivalent to strpos() function when comparing citext

From
PG Bug reporting form
Date:
The following bug has been logged on the website:

Bug reference:      16333
Logged by:          Tyler Reynolds
Email address:      me@reynolds.tj
PostgreSQL version: 9.6.17
Operating system:   Windows 10
Description:

To reproduce:

CREATE DATABASE test;
CREATE EXTENSION citext;

SELECT position('foo'::citext IN 'Foobar'::citext) =
strpos('Foobar'::citext, 'foo'::citext) as "positionEqualsStrpos";

Expected: "positionEqualsStrpos" returns TRUE.
Actual: "positionEqualsStrpos" returns FALSE.

Citext does not create an overload for position() supporting citext
parameters, therefore position(a in b) always runs case-sensitive.


On Wednesday, April 1, 2020, PG Bug reporting form <noreply@postgresql.org> wrote:
The following bug has been logged on the website:

Bug reference:      16333
Logged by:          Tyler Reynolds
Email address:      me@reynolds.tj
PostgreSQL version: 9.6.17
Operating system:   Windows 10
Description:       

To reproduce:

CREATE DATABASE test;
CREATE EXTENSION citext;

SELECT position('foo'::citext IN 'Foobar'::citext) =
strpos('Foobar'::citext, 'foo'::citext) as "positionEqualsStrpos";

Expected: "positionEqualsStrpos" returns TRUE.
Actual: "positionEqualsStrpos" returns FALSE.

Citext does not create an overload for position() supporting citext
parameters, therefore position(a in b) always runs case-sensitive.

Nor does it claim to so this isn’t a bug.  The fact that citext doesn’t work with the SQL functions seems intentional.

David J.

PG Bug reporting form <noreply@postgresql.org> writes:
> SELECT position('foo'::citext IN 'Foobar'::citext) =
> strpos('Foobar'::citext, 'foo'::citext) as "positionEqualsStrpos";

> Citext does not create an overload for position() supporting citext
> parameters, therefore position(a in b) always runs case-sensitive.

Well, the citext documentation specifies which functions have
case-insensitive mappings.  strpos() is listed, position() is not,
so I'd say it's acting precisely as documented.

We could consider adding position() of course, but there's a
backwards-compatibility issue --- at this point, people may have
queries that depend on the current behavior.

            regards, tom lane



On 2020-04-02 03:04, Tom Lane wrote:
> PG Bug reporting form <noreply@postgresql.org> writes:
>> SELECT position('foo'::citext IN 'Foobar'::citext) =
>> strpos('Foobar'::citext, 'foo'::citext) as "positionEqualsStrpos";
> 
>> Citext does not create an overload for position() supporting citext
>> parameters, therefore position(a in b) always runs case-sensitive.
> 
> Well, the citext documentation specifies which functions have
> case-insensitive mappings.  strpos() is listed, position() is not,
> so I'd say it's acting precisely as documented.

Arguably, there is a misdesign here, however.  Any function that does 
some kind of text-in-text search where citext could plausbily offer 
case-insensitive behavior will automatically fall back to the 
case-sensitive version if citext doesn't offer its own variant.  The fix 
would technically need to be that citext offers its own variant of every 
potential such function, which is clearly not possible, or that casts 
between text and citext are more restricted, which would make citext 
nearly unusable.

Doesn't seem fixable.  Collations are probably a better way of dealing 
with this.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:
> Arguably, there is a misdesign here, however.  Any function that does 
> some kind of text-in-text search where citext could plausbily offer 
> case-insensitive behavior will automatically fall back to the 
> case-sensitive version if citext doesn't offer its own variant.  The fix 
> would technically need to be that citext offers its own variant of every 
> potential such function, which is clearly not possible, or that casts 
> between text and citext are more restricted, which would make citext 
> nearly unusable.

Indeed.  There are some hundreds of built-in functions that take one
or more text arguments; how many of them would need citext variants?

> Doesn't seem fixable.  Collations are probably a better way of dealing 
> with this.

Yeah, now that we can do non-deterministic collations, it seems like
citext is on the road to obsolescence.  Do we have a documentation
example of how to build a simple CI collation?

            regards, tom lane



On 2020-04-02 23:46, Tom Lane wrote:
>> Doesn't seem fixable.  Collations are probably a better way of dealing
>> with this.
> 
> Yeah, now that we can do non-deterministic collations, it seems like
> citext is on the road to obsolescence.  Do we have a documentation
> example of how to build a simple CI collation?

Yes, in PG12 there is a "tip" box in the citext chapter with a link to 
the collation documentation which contains a command to create a 
case-insensitive collation.  So this should be pretty easy to find even 
for casual users.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services