Re: Add CASEFOLD() function. - Mailing list pgsql-hackers

From	Ian Lawrence Barwick
Subject	Re: Add CASEFOLD() function.
Date	December 12 18:52:31
Msg-id	CAB8KJ=jgnpbDTJS+jOQMzo7b1mPTwM0hCqQBBRgoX_kvyxWVog@mail.gmail.com
Responses	Re: Add CASEFOLD() function.
List	pgsql-hackers

Hi

On Thu, 12 Dec 2024 at 18:00, Jeff Davis <pgsql@j-davis.com> wrote:
>
> Unicode case folding is a way to convert a string to a canonical case
> for the purpose of case-insensitive matching.
>
> Users have long used LOWER() for that purpose, but there are a few edge
> case problems:
>
> * Some characters have more than two cased forms, such as "Σ" (U+03A3),
> which can be lowercased as "σ" (U+03C3) or "ς" (U+03C2). The CASEFOLD()
> function converts all cased forms of the character to "σ".
>
> * The character "İ" (U+0130, capital I with dot) is lowercased to "i",
> which can be a problem in locales that don't expect that.
>
> * If new lower case characters are added to Unicode, the results of
> LOWER() may change.
>
> The CASEFOLD() function solves these problems.
>
> Patch attached.

I took a quick look at this, as it sounds useful for the described issue,
and it seems to work as advertised. However, the function is named
"FOLDCASE()" in the patch, so I'm wondering which name is intended? A quick
search indicates there are no functions of either name in other databases;
Python has a "casefold()" function [1] and PHP a "foldCase()" function [2],
so there doesn't seem to be a de facto standard for this.

[1] https://docs.python.org/3/library/stdtypes.html#str.casefold
[2] https://www.php.net/manual/en/intlchar.foldcase.php
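
For illustration, here is a minimal sketch of the sigma case described above,
using Python's str.casefold() [1] as a stand-in. It is not the patch's
implementation, and the helper names (lower_keys, fold_keys) are made up for
this example; it only shows why folding gives a single matching key where
lowercasing leaves two.

```python
# Three cased forms of Greek sigma: capital (U+03A3), small (U+03C3),
# and small final (U+03C2).
SIGMA_FORMS = ["\u03a3", "\u03c3", "\u03c2"]


def lower_keys(forms):
    """Distinct keys obtained by lowercasing each form."""
    return {s.lower() for s in forms}


def fold_keys(forms):
    """Distinct keys obtained by Unicode case folding each form."""
    return {s.casefold() for s in forms}


# Lowercasing leaves the final sigma as a separate form, so the three
# inputs do not all compare equal; case folding maps them to one key.
print(sorted(lower_keys(SIGMA_FORMS)))
print(sorted(fold_keys(SIGMA_FORMS)))
```

With lowercasing, a case-insensitive comparison of "σ" and "ς" fails; with
folding, all three forms compare equal.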

Regards


Ian Barwick
