Re: Remaining dependency on setlocale() - Mailing list pgsql-hackers

From Jeff Davis
Subject Re: Remaining dependency on setlocale()
Date
Msg-id ac83c2376d4ce2ad423f48c4fe2c416fc5d2b346.camel@j-davis.com
In response to Re: Remaining dependency on setlocale()  (Peter Eisentraut <peter@eisentraut.org>)
List pgsql-hackers
On Wed, 2025-12-17 at 11:39 +0100, Peter Eisentraut wrote:
> For Metaphone, I found the reference implementation linked from its
> Wikipedia page, and it looks like our implementation is pretty closely
> aligned to that.  That reference implementation also contains the
> C-with-cedilla case explicitly.  The correct fix here would probably be
> to change the implementation to work on wide characters.  But I think
> for the moment you could try a shortcut like, use pg_ascii_toupper(),
> but if the encoding is LATIN1 (or LATIN9 or whichever other encodings
> also contain C-with-cedilla at that code point), then explicitly
> uppercase that one as well.  This would preserve the existing
> behavior.

Done, attached new patches.

Interestingly, WIN1256 encodes only the SMALL LETTER C WITH CEDILLA, not
the capital. I think, for the purposes here, we can still consider it to
"uppercase" to \xc7, so that it is treated as the same sound.
Technically, I think that would be an improvement over the current code
in this edge case, and it suggests that case folding would be a better
approach than uppercasing.
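
As a rough illustration of the shortcut discussed above (not the attached
patch itself), the idea can be sketched as below. The enum
sketch_encoding and the functions has_c_cedilla_at_e7() and
metaphone_toupper() are hypothetical names invented for this example;
ascii_toupper() is a local stand-in that mirrors the ASCII-only behavior
of pg_ascii_toupper(). The encoding list is an assumption covering only
the encodings named in this thread.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the server's encoding identifiers. */
typedef enum
{
    ENC_UTF8,
    ENC_LATIN1,
    ENC_LATIN9,
    ENC_WIN1256
} sketch_encoding;

/* ASCII-only uppercasing: bytes outside 'a'..'z' pass through unchanged. */
static unsigned char
ascii_toupper(unsigned char ch)
{
    if (ch >= 'a' && ch <= 'z')
        ch += 'A' - 'a';
    return ch;
}

/*
 * Assumption: these single-byte encodings place SMALL LETTER C WITH
 * CEDILLA at 0xE7.  A real patch would need to audit every supported
 * encoding rather than rely on this short list.
 */
static bool
has_c_cedilla_at_e7(sketch_encoding enc)
{
    return enc == ENC_LATIN1 || enc == ENC_LATIN9 || enc == ENC_WIN1256;
}

/*
 * Uppercase for Metaphone purposes: ASCII letters are uppercased, and
 * 0xE7 is mapped to 0xC7 in the encodings above so that both forms are
 * treated as the same sound.  For WIN1256 this is a "fold" rather than
 * a true uppercase, since that encoding has no capital C-with-cedilla.
 */
static unsigned char
metaphone_toupper(unsigned char ch, sketch_encoding enc)
{
    if (ch == 0xE7 && has_c_cedilla_at_e7(enc))
        return 0xC7;
    return ascii_toupper(ch);
}
```

In this sketch, a 0xE7 byte in a multibyte encoding such as UTF8 is left
alone, since there it is not a complete character.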

Regards,
    Jeff Davis


