Thread: BUG #12542: Incorrect behaviour of lower and upper on accented vocals in UTF8

From: orsini@unive.it
The following bug has been logged on the website:

Bug reference:      12542
Logged by:          Renzo Orsini
Email address:      orsini@unive.it
PostgreSQL version: 9.3.5
Operating system:   Mac OS X
Description:

When lower and upper are applied to UTF8 strings with accented letters, they
behave incorrectly. For instance, upper('Autorità') returns 'AUTORITà' and
not 'AUTORITÀ' as it should. Similarly, lower('AUTORITÀ') returns 'autoritÀ'
instead of 'autorità'.
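
The reported results look like what a case mapping that only knows the ASCII letters would produce. The Python sketch below is only an illustration of that symptom: it does not call PostgreSQL or the OS X locale functions, and the helper names ascii_only_upper and ascii_only_lower are hypothetical. It compares a Unicode-aware mapping with one that touches only ASCII bytes of the UTF-8 encoding.

```python
# Illustration only: this does not reproduce PostgreSQL or OS X internals.
# It shows that an ASCII-only case mapping leaves accented letters unchanged.

def ascii_only_upper(text: str) -> str:
    """Uppercase only ASCII letters, working on the UTF-8 bytes (hypothetical helper)."""
    # bytes.upper() converts only the ASCII letters a-z; other bytes are untouched.
    return text.encode("utf-8").upper().decode("utf-8")


def ascii_only_lower(text: str) -> str:
    """Lowercase only ASCII letters, working on the UTF-8 bytes (hypothetical helper)."""
    return text.encode("utf-8").lower().decode("utf-8")


# Escapes keep the accented letters precomposed: U+00E0 is a-grave, U+00C0 is A-grave.
mixed = "Autorit\u00e0"
shouting = "AUTORIT\u00c0"

print("Unicode-aware upper:", mixed.upper())
print("ASCII-only upper:   ", ascii_only_upper(mixed))
print("Unicode-aware lower:", shouting.lower())
print("ASCII-only lower:   ", ascii_only_lower(shouting))
```

The ASCII-only variants give the same strings as the ones reported above, while the Unicode-aware calls give the expected ones.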
orsini@unive.it writes:
> The following bug has been logged on the website:
> Bug reference:      12542
> Logged by:          Renzo Orsini
> Email address:      orsini@unive.it
> PostgreSQL version: 9.3.5
> Operating system:   Mac OS X
> Description:

> When lower and upper are applied to UTF8 strings with accented letters, they
> behave incorrectly. For instance, upper('Autorità') returns 'AUTORITà' and
> not 'AUTORITÀ' as it should. Similarly, lower('AUTORITÀ') returns 'autoritÀ'
> instead of 'autorità'.

Yeah, unfortunately, this is a bug in Mac OS X itself: the UTF8 locales
don't really work right.  You might have better luck if you can adopt an
ISO8859 encoding.
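
As a rough illustration of how the two kinds of encoding differ for these letters, the Python sketch below prints the bytes of 'à' and 'À' in UTF-8 and in ISO 8859-1. The choice of ISO 8859-1 is an assumption made for the example (no specific ISO8859 variant is named above), and the helper encoded_hex is hypothetical. It says nothing about what OS X does internally; it only shows that each letter is a single byte in ISO 8859-1 and two bytes in UTF-8.

```python
# Illustration only: compares byte representations, not OS locale behaviour.

def encoded_hex(char: str, encoding: str) -> str:
    """Return the hex form of one character's bytes in the given encoding (hypothetical helper)."""
    return char.encode(encoding).hex()


lower_a_grave = "\u00e0"  # a-grave
upper_a_grave = "\u00c0"  # A-grave

# "iso8859-1" is used here as one example of an ISO8859 encoding.
for encoding in ("utf-8", "iso8859-1"):
    print(
        encoding,
        "lower:", encoded_hex(lower_a_grave, encoding),
        "upper:", encoded_hex(upper_a_grave, encoding),
    )
```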

There has been some discussion of working around OS X's deficiencies
in this area, but it's a significant bit of work and hasn't been
done yet.

            regards, tom lane