pgsql: Support text position search functions with nondeterministic col - Mailing list pgsql-committers

From: Peter Eisentraut
Subject: pgsql: Support text position search functions with nondeterministic col
Msg-id: E1tlREg-000SRA-0A@gemulon.postgresql.org
List: pgsql-committers
Support text position search functions with nondeterministic collations

This allows using text position search functions with nondeterministic
collations.  These functions are

- position, strpos
- replace
- split_part
- string_to_array
- string_to_table

which all use common internal infrastructure.

There was previously no internal implementation of this, so calling
these functions with a nondeterministic collation raised a
not-supported error.  This adds the internal implementation and
removes the error.

Unlike with deterministic collations, the search cannot use any
byte-by-byte optimized techniques but has to go substring by
substring.  We also need to consider that the found match could have a
different length than the needle and that there could be substrings of
different length matching at a position.  In most cases, we need to
find the longest such substring (greedy semantics), but this can be
configured by each caller.
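
The search strategy described above can be illustrated with a small
Python sketch.  This is not the code in varlena.c.  It is a naive model
under stated assumptions: collation_key() is a hypothetical stand-in for
a nondeterministic collation (it ignores case and accents), and
find_match() and replace_all() are illustrative names.  Offsets are
0-based here, unlike the 1-based result of position().

```python
import unicodedata


def collation_key(s: str) -> str:
    """Hypothetical stand-in for a nondeterministic collation.

    Two strings are treated as equal when they match after removing
    combining marks and case differences.
    """
    decomposed = unicodedata.normalize("NFD", s)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


def find_match(haystack: str, needle: str, start: int = 0, greedy: bool = True):
    """Return (offset, length) of the first match at or after start, or None.

    The search goes substring by substring: for each start offset, every
    candidate length is compared under the collation.  With greedy=True
    the longest matching substring at that offset wins, otherwise the
    shortest one does.
    """
    if needle == "":
        return (start, 0)
    target = collation_key(needle)
    for pos in range(start, len(haystack)):
        remaining = len(haystack) - pos
        if greedy:
            lengths = range(remaining, 0, -1)   # longest candidate first
        else:
            lengths = range(1, remaining + 1)   # shortest candidate first
        for length in lengths:
            if collation_key(haystack[pos:pos + length]) == target:
                return (pos, length)
    return None


def replace_all(haystack: str, needle: str, replacement: str) -> str:
    """Example caller: replace every match, using greedy semantics.

    The length of each match comes from the search result, not from the
    needle, because the two can differ.
    """
    if needle == "":
        return haystack
    out = []
    pos = 0
    while True:
        found = find_match(haystack, needle, pos, greedy=True)
        if found is None:
            out.append(haystack[pos:])
            return "".join(out)
        offset, length = found
        out.append(haystack[pos:offset])
        out.append(replacement)
        pos = offset + length
```

With the haystack "cafe\u0301 au lait" (an "e" followed by a combining
acute accent) and the needle "CAFE", both the 4-character and the
5-character substring at offset 0 compare equal to the needle under
this stand-in collation.  The greedy search reports length 5, so a
caller such as replace_all() also consumes the combining accent instead
of leaving it behind.  The sketch recomputes keys for every candidate
substring and is meant only to show the semantics, not an efficient
implementation.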

Reviewed-by: Euler Taveira <euler@eulerto.com>
Discussion: https://www.postgresql.org/message-id/flat/582b2613-0900-48ca-8b0d-340c06f4d400@eisentraut.org

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/329304c9012b2ac6d906afeb18062f9080dceef9

Modified Files
--------------
src/backend/utils/adt/varlena.c                | 104 ++++++++++++++---
src/test/regress/expected/collate.icu.utf8.out | 154 ++++++++++++++++++++-----
src/test/regress/sql/collate.icu.utf8.sql      |  36 +++++-
3 files changed, 246 insertions(+), 48 deletions(-)

