Re: Bug in UTF8-Validation Code? - Mailing list pgsql-hackers

From	Martijn van Oosterhout
Subject	Re: Bug in UTF8-Validation Code?
Date	April 5, 2007 09:00:10
Msg-id	20070405115848.GB17587@svana.org
In response to	Re: Bug in UTF8-Validation Code? (Tatsuo Ishii <ishii@postgresql.org>)
Responses	Re: Bug in UTF8-Validation Code?
List	pgsql-hackers

On Thu, Apr 05, 2007 at 09:34:25AM +0900, Tatsuo Ishii wrote:
> I'm not sure what kind of use case for unicode_char() you are thinking
> about. Anyway if you want a "code point" from a character, we could
> easily add such functions for all the backend encodings we currently
> support. Probably it would look like:

I think the problem is that most encodings do not have the concept of a
code point anyway, so implementing it for them is fairly useless.

> Example outputs are:
>
> ASCII - 41
> ISO 10646 - U+0041
> ISO 10646 - U+29E3D
> ISO 8859-1 - a5
> JIS X 0208 - 4141

In every case other than Unicode, the result is just the character's
encoded bytes in hex, which is the same thing encode/decode gives you.
Since we already have those functions, there's no need to get chr/ascii
to duplicate them. In the case of UTF-8, however, the code point differs
from the encoded bytes, so it is something encode/decode does not do;
hence the proposal to simply extend chr/ascii to do that.
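
To make the distinction concrete, here is a minimal sketch in Python
(not PostgreSQL code; the helper names hex_bytes and code_point are
hypothetical and exist only for this illustration). It assumes that
Python's "latin-1" codec stands in for ISO 8859-1 and that a hex dump
of the encoded bytes is roughly what encode(..., 'hex') would show.

```python
def hex_bytes(ch: str, encoding: str) -> str:
    """Hex dump of the bytes that encode `ch` in `encoding`."""
    return ch.encode(encoding).hex()


def code_point(ch: str) -> str:
    """Unicode code point of `ch` in U+XXXX notation."""
    return f"U+{ord(ch):04X}"


# Single-byte encodings: the "code point" is just the byte value,
# so a hex dump of the encoded bytes already tells you everything.
print(hex_bytes("A", "ascii"))         # 41
print(hex_bytes("\u00a5", "latin-1"))  # a5

# UTF-8: the encoded bytes and the code point are different values,
# so a hex dump alone does not give you the code point.
ch = "\U00029E3D"
print(hex_bytes(ch, "utf-8"))          # f0a9b8bd
print(code_point(ch))                  # U+29E3D
```

For ASCII and ISO 8859-1 the hex dump and the character number agree;
only for UTF-8 is a separate decoding step needed to get from the bytes
to the code point.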

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.
