Re: Invalid byte sequence for encoding "UTF8", caused due to non wide-char-aware downcase_truncate_identifier() function on WINDOWS - Mailing list pgsql-hackers

From	Robert Haas
Subject	Re: Invalid byte sequence for encoding "UTF8", caused due to non wide-char-aware downcase_truncate_identifier() function on WINDOWS
Date	June 9, 2011 11:11:46
Msg-id	BANLkTinP9XBcPuEK=7XPqh=NcOVFBKYDUw@mail.gmail.com
In response to	Re: Invalid byte sequence for encoding "UTF8", caused due to non wide-char-aware downcase_truncate_identifier() function on WINDOWS (Tom Lane <tgl@sss.pgh.pa.us>)
Responses	Re: Invalid byte sequence for encoding "UTF8", caused due to non wide-char-aware downcase_truncate_identifier() function on WINDOWS
List	pgsql-hackers

On Thu, Jun 9, 2011 at 10:07 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> But now that I re-think about it, I guess what I'm confused about is
>> this code here:
>
>>                 if (ch >= 'A' && ch <= 'Z')
>>                         ch += 'a' - 'A';
>>                 else if (IS_HIGHBIT_SET(ch) && isupper(ch))
>>                         ch = tolower(ch);
>>                 result[i] = (char) ch;
>
> The expected behavior there is that case-folding of non-ASCII characters
> will occur in single-byte encodings but nothing will happen to
> multi-byte characters.  We are relying on isupper() to not return true
> when presented with a character fragment in a multibyte locale.

Based on Jeevan's original message, it seems like that's not always
the case, at least on Windows.
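
To make the failure mode concrete, here is a minimal sketch of the quoted loop with the classification functions passed in as parameters. Everything named here is hypothetical and for illustration only: downcase_bytes() is not the real downcase_truncate_identifier(), IS_HIGHBIT_SET is redefined locally, and upper_latin1()/lower_latin1() are a Latin-1-style stand-in for an isupper()/tolower() pair that returns true for a byte fragment. It is not a claim about what the Windows C runtime actually does for any particular byte.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Local stand-in for the macro used in the quoted code. */
#define IS_HIGHBIT_SET(ch) ((unsigned char) (ch) & 0x80)

/* Hypothetical hook type standing in for isupper()/tolower(). */
typedef int (*classify_fn) (int);

/* Models the behavior the code relies on: no high-bit byte is "upper". */
int
upper_never(int ch)
{
	(void) ch;
	return 0;
}

/*
 * Latin-1-style classification (an assumption for illustration):
 * 0xC0..0xDE except 0xD7 are uppercase, lowercase is +0x20.
 */
int
upper_latin1(int ch)
{
	return ch >= 0xC0 && ch <= 0xDE && ch != 0xD7;
}

int
lower_latin1(int ch)
{
	return upper_latin1(ch) ? ch + 0x20 : ch;
}

/*
 * Byte-wise downcasing in the shape of the quoted loop.
 * result must have room for len + 1 bytes.
 */
void
downcase_bytes(const char *ident, char *result, size_t len,
			   classify_fn is_upper, classify_fn to_lower)
{
	size_t		i;

	assert(ident != NULL && result != NULL);

	for (i = 0; i < len; i++)
	{
		unsigned char ch = (unsigned char) ident[i];

		if (ch >= 'A' && ch <= 'Z')
			ch += 'a' - 'A';
		else if (IS_HIGHBIT_SET(ch) && is_upper(ch))
			ch = (unsigned char) to_lower(ch);
		result[i] = (char) ch;
	}
	result[len] = '\0';
}
```

With upper_never, the UTF-8 bytes 0xC3 0x89 pass through untouched. With the Latin-1-style pair, the lead byte 0xC3 is folded to 0xE3 while 0x89 is left alone; 0xE3 announces a three-byte sequence but only one continuation byte follows, so the result is no longer valid UTF-8.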

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
