Re: lower and upper not UTF-8 safe - Mailing list pgsql-hackers

From Tom Lane
Subject Re: lower and upper not UTF-8 safe
Date
Msg-id 11538.1060030982@sss.pgh.pa.us
In response to lower and upper not UTF-8 safe  (Julian Satchell <j.satchell@eris.qinetiq.com>)
List pgsql-hackers
Julian Satchell <j.satchell@eris.qinetiq.com> writes:
> The implementations of lower and upper in
> src/backend/utils/adt/oracle_compat.c use the single byte macros from
> ctype.h to alter individual bytes in the text string. 

> If the text is UTF-8 encoded this is totally wrong, and will result in
> an invalid string that is no longer UTF-8.

Only if you use a locale that assumes a character set other than UTF-8
and that has characters with the high bit set.  I'm not sure that we
can do anything to defend against a locale/charset mismatch.
        regards, tom lane
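
To make the mismatch case concrete, here is a minimal sketch in C. It is
not the code from oracle_compat.c. The helpers ascii_toupper,
latin1_toupper, bytewise_upper and utf8_is_well_formed are hypothetical
names invented for this illustration, and latin1_toupper is only a
stand-in for what toupper() from ctype.h would plausibly do under an
ISO-8859-1 locale. The validity check is a simplified structural test,
not a full UTF-8 validator.

```c
#include <stddef.h>

/* Like toupper() in the "C" locale: only ASCII letters change,
 * bytes with the high bit set pass through untouched. */
static unsigned char ascii_toupper(unsigned char c)
{
    return (c >= 'a' && c <= 'z') ? (unsigned char) (c - 0x20) : c;
}

/* Assumed stand-in for toupper() in an ISO-8859-1 locale:
 * 0xE0..0xFE (except 0xF7, the division sign) are lowercase letters
 * whose uppercase forms sit 0x20 below them. */
static unsigned char latin1_toupper(unsigned char c)
{
    if (c >= 'a' && c <= 'z')
        return (unsigned char) (c - 0x20);
    if (c >= 0xE0 && c <= 0xFE && c != 0xF7)
        return (unsigned char) (c - 0x20);
    return c;
}

/* Byte-at-a-time case conversion, the approach the thread is about. */
static void bytewise_upper(char *s, unsigned char (*map)(unsigned char))
{
    for (; *s != '\0'; s++)
        *s = (char) map((unsigned char) *s);
}

/* Simplified structural check: each lead byte must be followed by the
 * right number of continuation bytes (10xxxxxx).  It does not reject
 * every overlong form or surrogate. */
static int utf8_is_well_formed(const char *s)
{
    const unsigned char *p = (const unsigned char *) s;

    while (*p != 0)
    {
        int need;

        if (*p < 0x80)
            need = 0;
        else if (*p >= 0xC2 && *p <= 0xDF)
            need = 1;
        else if (*p >= 0xE0 && *p <= 0xEF)
            need = 2;
        else if (*p >= 0xF0 && *p <= 0xF4)
            need = 3;
        else
            return 0;           /* stray continuation or invalid lead byte */
        p++;
        while (need-- > 0)
        {
            if ((*p & 0xC0) != 0x80)
                return 0;       /* sequence is truncated */
            p++;
        }
    }
    return 1;
}
```

Take the euro sign, encoded in UTF-8 as the bytes E2 82 AC. Passing it
through bytewise_upper with ascii_toupper leaves those bytes alone, so
the string stays well formed. With latin1_toupper the lead byte E2 is
treated as a lowercase letter and becomes C2, which announces a two-byte
sequence and leaves AC as a stray continuation byte.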

