Re: tiny step toward threading: reduce dependence on setlocale() - Mailing list pgsql-hackers

From Peter Eisentraut
Subject Re: tiny step toward threading: reduce dependence on setlocale()
Date
Msg-id b961acfc-341c-4693-b944-2dddc9dcfdc9@eisentraut.org
In response to Re: tiny step toward threading: reduce dependence on setlocale()  (Peter Eisentraut <peter@eisentraut.org>)
Responses Re: tiny step toward threading: reduce dependence on setlocale()
List pgsql-hackers
On 07.08.24 22:44, Peter Eisentraut wrote:
> (Now that I look at it, pg_tolower() has some short-circuiting for ASCII 
> letters, so it would not handle Turkish-i correctly if that had been the 
> global locale.  By removing the use of pg_tolower(), we fix that issue 
> in passing.)

It occurred to me that this issue also surfaces in a more prominent 
place.  Before commit e9931bfb751, the normal SQL lower() and upper() 
functions also used these arguably-wrong pg_tolower() and pg_toupper() 
calls if you used a single-byte encoding.

For example, in PG17, multi-byte encoding:

initdb --locale=tr_TR.utf8

select upper('hij'); --> HİJ

PG17, single-byte encoding:

initdb --locale=tr_TR  # uses LATIN5

select upper('hij'); --> HIJ

With current master, after commit e9931bfb751, you get the first result 
in both cases.
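
To illustrate why the PG17 single-byte case returns HIJ, here is a minimal 
sketch in C of the ASCII short-circuit mentioned in the quoted text.  The 
name sketch_toupper() is hypothetical, and the body is only modeled on the 
behavior described above, not copied from the tree:

```c
#include <ctype.h>

/*
 * Hypothetical helper illustrating an ASCII short-circuit in a
 * byte-wise upper-casing routine.  Not the actual PostgreSQL source.
 */
unsigned char
sketch_toupper(unsigned char ch)
{
	if (ch >= 'a' && ch <= 'z')
		ch += 'A' - 'a';	/* ASCII fast path: the locale is never consulted */
	else if ((ch & 0x80) && islower(ch))
		ch = (unsigned char) toupper(ch);	/* only high-bit bytes reach the C library */
	return ch;
}
```

Because 'i' is an ASCII letter, it takes the first branch and always becomes 
'I', regardless of the locale in effect, so a Turkish locale never gets the 
chance to map it to the dotted capital.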

So this could break indexes across pg_upgrade in such configurations.



