Re: invalid multibyte character for locale - Mailing list pgsql-admin

From Bjoern Metzdorf
Subject Re: invalid multibyte character for locale
Msg-id 421E0D6B.6010303@turtle-entertainment.de
In response to Re: invalid multibyte character for locale  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-admin
Tom Lane wrote:
> I don't know what behavior you thought you were getting from upper/lower
> on UTF-8 data in 7.4, but it was surely not correct.  If you want to
> duplicate that misbehavior, try SQL_ASCII with C locale.  This does not
> stop you from storing UTF-8 in your database, mind you --- it just
> loses validation of encoding sequences and conversion to other schemes.

> But having said that, upper() should work if the locale matches the
> encoding.  You might take the trouble to trace down exactly what data
> value it's barfing on.

I want to keep UNICODE encoding in any case.

So you are saying that 7.x simply did not cope with multibyte characters,
and that upper() and lower() just returned whatever the C functions
toupper() and tolower() produced?

I also want to stay with locale C because of the speed. My data is in
several different languages, not just one, so switching to a
language-specific locale would not help at all.

I assume I could just remove

#define USE_WIDE_UPPER_LOWER

from oracle_compat.c to emulate the old behaviour. A cleaner fix,
though, would be to check whether the encoding is UNICODE and the
locale is C or POSIX, and to skip the USE_WIDE_UPPER_LOWER code path
only in that case.
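
Roughly, the condition I have in mind could be sketched as below. This
is only an illustration under my own assumptions: the function names
and the encoding_is_unicode flag are made up, and it expresses the test
as a runtime check, whereas USE_WIDE_UPPER_LOWER is a compile-time
define, so the real change would look different.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: is this LC_CTYPE locale name "C" or "POSIX"? */
static int
locale_name_is_c(const char *name)
{
    assert(name != NULL);
    return strcmp(name, "C") == 0 || strcmp(name, "POSIX") == 0;
}

/*
 * Hypothetical decision function: returns 0 (skip the wide-character
 * path) only when the encoding is UNICODE and the locale is C or
 * POSIX; returns 1 in every other case.
 */
static int
use_wide_upper_lower(int encoding_is_unicode, const char *ctype_name)
{
    return !(encoding_is_unicode && locale_name_is_c(ctype_name));
}
```

The locale name could be obtained with setlocale(LC_CTYPE, NULL), which
queries the current setting without changing it.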

Comments?

Regards,
Bjoern



