
Re: Bug in UTF8-Validation Code? - Mailing list pgsql-hackers

From	Peter Eisentraut
Subject	Re: Bug in UTF8-Validation Code?
Date	March 14, 2007 09:05:43
Msg-id	200703141005.33119.peter_e@gmx.net
In response to	Re: Bug in UTF8-Validation Code? (Michael Paesold <mpaesold@gmx.at>)
List	pgsql-hackers


On Wednesday, 14 March 2007 08:01, Michael Paesold wrote:
> Is there anything in the SQL spec that asks for such a behaviour? I guess
> not.

I think that the octal escapes are a holdover from the single-byte days, when 
they were simply a way to enter characters that are difficult to find on a 
keyboard.  In today's multi-encoding world, it would make more sense if there 
were an escape sequence for a *codepoint* which is then converted to the 
actual encoding (if possible and valid) in the server.  The meaning of 
codepoint is, however, character set dependent as well.
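
To make the contrast concrete, here is a minimal Python sketch. It is not PostgreSQL code, both function names are hypothetical, and it assumes "codepoint" means a Unicode codepoint, which is only one possible reading. An octal escape spells out raw bytes, so the person typing it has to know the encoding; a codepoint escape names the character and leaves the conversion (and the failure case) to the server.

```python
def octal_escape_bytes(ch: str, encoding: str) -> str:
    """Spell out one character bytewise as octal escapes in the given encoding."""
    return "".join(f"\\{b:03o}" for b in ch.encode(encoding))


def codepoint_to_encoding(codepoint: int, encoding: str) -> bytes:
    """Convert a Unicode codepoint to bytes in the target encoding.

    Raises UnicodeEncodeError when the target encoding cannot represent
    the character (the "if possible and valid" case).
    """
    return chr(codepoint).encode(encoding)


# The same character needs different octal escapes per encoding ...
utf8_form = octal_escape_bytes("\u00e4", "utf-8")      # two bytes
latin1_form = octal_escape_bytes("\u00e4", "latin-1")  # one byte

# ... while one codepoint (U+20AC, the euro sign) covers every encoding
# that has the character, and fails cleanly where it does not exist.
euro_utf8 = codepoint_to_encoding(0x20AC, "utf-8")
try:
    codepoint_to_encoding(0x20AC, "latin-1")
    euro_in_latin1 = True
except UnicodeEncodeError:
    euro_in_latin1 = False
```

With octal escapes the input has to change whenever the encoding changes; with a codepoint escape only the conversion result changes.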

The SQL standard supports escape sequences for Unicode codepoints, which I 
think would be a very useful feature (try entering a UTF-8 character 
bytewise ...), but it's a bit weird to implement and it's not clear how to 
handle character sets other than Unicode.
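
For illustration, the standard's syntax as I understand it looks like U&'d\0061t\+000061': a backslash followed by four hex digits, or by a plus sign and six hex digits. The Python sketch below decodes the body of such a literal. The function name is hypothetical, only the default backslash escape character is handled (no UESCAPE clause), and the conversion to the server encoding is left out entirely.

```python
import re

# A backslash followed by: another backslash, "+" and six hex digits,
# or four hex digits. A backslash followed by anything else is an error.
_ESCAPE = re.compile(r"\\(\\|\+[0-9A-Fa-f]{6}|[0-9A-Fa-f]{4})?")


def decode_unicode_escapes(body: str) -> str:
    """Decode \\XXXX and \\+XXXXXX escapes in the body of a U&'...' literal."""

    def replace(match: re.Match) -> str:
        token = match.group(1)
        if token is None:
            raise ValueError(f"invalid escape at offset {match.start()}")
        if token == "\\":
            return "\\"  # doubled backslash stands for a literal one
        # chr() raises ValueError for values beyond U+10FFFF
        return chr(int(token.lstrip("+"), 16))

    return _ESCAPE.sub(replace, body)


decoded = decode_unicode_escapes(r"d\0061t\+000061")
```

Decoding is the easy half. The result is still a Unicode string, so the question of what to do when the server encoding is not Unicode remains open.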

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/
