
Re: Bug in UTF8-Validation Code? - Mailing list pgsql-hackers

From	Mark Dilger
Subject	Re: Bug in UTF8-Validation Code?
Date	April 3, 2007 13:55:34
Msg-id	46127702.8060100@markdilger.com
In response to	Re: Bug in UTF8-Validation Code? (Martijn van Oosterhout <kleptog@svana.org>)
Responses	Re: Bug in UTF8-Validation Code?
List	pgsql-hackers


Martijn van Oosterhout wrote:
> On Tue, Apr 03, 2007 at 11:43:21AM +0200, Albe Laurenz wrote:
>> IMHO this is the only good and intuitive way for CHR() and ASCII().
> 
> Hardly. The comment earlier about mbtowc was much closer to the mark.
> And wide characters are defined as Unicode points.
> 
> Basically, CHR() takes a unicode point and returns that character
> in a string appropriately encoded. ASCII() does the reverse.
> 
> Just about every multibyte encoding other than Unicode has the problem
> of not distinguishing between the code point and the encoding of it.
> Unicode is a collection of encodings based on the same set.
> 
> Have a nice day,

Thanks for the feedback.  Would you say that the way I implemented things in the
example code would be correct for multibyte non-Unicode encodings?  I don't see
how to avoid the endianness issue for those encodings.

mark
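
To make the distinction in this exchange concrete, here is a minimal sketch in
Python, not the example code referred to above and not PostgreSQL's
implementation. The function names chr_codepoint, ascii_codepoint and
packed_bytes are hypothetical, and the choice of EUC-JP as the non-Unicode
multibyte encoding is an assumption made for illustration. It shows CHR() and
ASCII() read as code point conversions, and then the byte-order choice that
appears if the raw bytes of a multibyte character are packed into an integer
instead.

```python
def chr_codepoint(code_point: int, encoding: str = "utf-8") -> bytes:
    """CHR()-like: Unicode code point -> that character's bytes in `encoding`."""
    return chr(code_point).encode(encoding)


def ascii_codepoint(data: bytes, encoding: str = "utf-8") -> int:
    """ASCII()-like: first character of the encoded bytes -> Unicode code point."""
    return ord(data.decode(encoding)[0])


def packed_bytes(data: bytes, byteorder: str) -> int:
    """Alternative reading: treat the raw encoded bytes as one integer.

    The result depends on `byteorder` ("big" or "little"), which is the
    endianness question raised in the message.
    """
    return int.from_bytes(data, byteorder)


# Code point view: the same number works regardless of the target encoding.
euro_utf8 = chr_codepoint(0x20AC)              # EURO SIGN encoded as UTF-8
hira_a_eucjp = chr_codepoint(0x3042, "euc_jp")  # HIRAGANA LETTER A in EUC-JP

# Round trip back to the code point.
assert ascii_codepoint(euro_utf8) == 0x20AC
assert ascii_codepoint(hira_a_eucjp, "euc_jp") == 0x3042

# Raw byte view: the integer changes with the chosen byte order.
print(hex(packed_bytes(hira_a_eucjp, "big")))
print(hex(packed_bytes(hira_a_eucjp, "little")))
```

Under the code point reading, no byte order is involved because the integer
never represents the stored bytes. The ambiguity only arises when the integer
is defined as the packed encoded bytes, in which case a fixed order (for
example, first byte most significant) would have to be chosen by convention.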
