Peter Eisentraut wrote:
> Another patch.
+ <literal>ks</literal> key), in order for such such collations to act in
a
s/such such/such/
+ <para>
+ The pattern matching operators of all three kinds do not support
+ nondeterministic collations. If required, apply a different collation
to
+ the expression to work around this limitation.
+ </para>
It's an important point of comparison between CI collations and
contrib/citext, since the latter diverts a bunch of functions/operators
to make them do case-insensitive pattern matching.
The doc for citext explains the rationale for using it versus text,
maybe it would need now to be expanded a bit with pros/cons of
choosing citext versus non-deterministic collations.
The current patch doesn't alter a few string functions that could
potentially implement collation-aware string search, such as
replace(), strpos(), starts_with().
ISTM that we should not let these functions ignore the collation: they
ought to error out until we get their implementation to use the ICU
collation-aware string search.
FWIW I've been experimenting with usearch_openFromCollator() and
other usearch_* functions, and it looks doable to implement at least the
3 above functions based on that, even though the UTF16-ness of the API
does not favor us.
ICU also provides regexp matching, but not collation-aware, since
character-based patterns don't play well with the concept of collation.
About a potential collation-aware LIKE, it looks hard to implement,
since the algorithm currently used in like_match.c seems purely
character-based. AFAICS there's no way to plug calls to usearch_*
functions into it, it would need a separate redesign from scratch.
Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite