Re: Bug in UTF8-Validation Code? - Mailing list pgsql-hackers

From	Martijn van Oosterhout
Subject	Re: Bug in UTF8-Validation Code?
Date	March 18, 2007 08:38:59
Msg-id	20070318113622.GA5722@svana.org
In response to	Re: Bug in UTF8-Validation Code? (Andrew Dunstan <andrew@dunslane.net>)
Responses	Re: Bug in UTF8-Validation Code? Re: Bug in UTF8-Validation Code?
List	pgsql-hackers

On Sat, Mar 17, 2007 at 11:46:01AM -0400, Andrew Dunstan wrote:
> How can we fix this? Frankly, the statement in the docs warning about
> making sure that escaped sequences are valid in the server encoding is a
> cop-out. We don't accept invalid data elsewhere, and this should be no
> different IMNSHO. I don't see why this should be any different from,
> say, date or numeric data. For years people have sneered at MySQL
> because it accepted dates like Feb 31st, and rightly so. But this seems
> to me to be like our own version of the same problem.

It seems to me that the easiest solution would be to forbid \x?? escape
sequences with values greater than \x7F for UTF-8 server encodings, and
instead introduce a \u escape for specifying the Unicode character
directly. The basic principle is that any escape sequence still has to
represent a single character. The result can be multiple bytes, but you
no longer have to check it for consistency.
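
To make the rule concrete, here is a rough sketch in Python (not the
server's C code). The helper name decode_escapes and the four-hex-digit
\uXXXX form are assumptions for illustration only; the proposal above
does not pin down the exact syntax.

```python
import string


def _parse_hex(digits: str, width: int) -> int:
    """Parse exactly `width` hex digits, rejecting anything else."""
    if len(digits) != width or any(c not in string.hexdigits for c in digits):
        raise ValueError(f"expected {width} hex digits, got {digits!r}")
    return int(digits, 16)


def decode_escapes(literal: str) -> bytes:
    """Hypothetical helper: expand \\xHH and \\uXXXX escapes to UTF-8 bytes.

    Assumed rules, following the proposal:
      * \\xHH is accepted only up to \\x7F (a single ASCII character);
      * \\uXXXX names one Unicode code point and may expand to several bytes.
    """
    out = bytearray()
    i = 0
    while i < len(literal):
        ch = literal[i]
        if ch != "\\":
            out += ch.encode("utf-8")
            i += 1
            continue
        kind = literal[i + 1 : i + 2]
        if kind == "x":
            value = _parse_hex(literal[i + 2 : i + 4], 2)
            if value > 0x7F:
                # A lone high byte is not a complete UTF-8 character.
                raise ValueError("\\x escapes above \\x7F are rejected; use \\u")
            out.append(value)
            i += 4
        elif kind == "u":
            code_point = _parse_hex(literal[i + 2 : i + 6], 4)
            if 0xD800 <= code_point <= 0xDFFF:
                # Surrogates are not characters and have no UTF-8 encoding.
                raise ValueError("surrogate code points are not characters")
            out += chr(code_point).encode("utf-8")
            i += 6
        else:
            raise ValueError("unsupported escape in this sketch")
    return bytes(out)


print(decode_escapes(r"caf\u00e9"))  # b'caf\xc3\xa9'
```

Because every accepted escape expands to exactly one complete character,
the output is valid UTF-8 by construction and needs no second validation
pass over the bytes.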

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.
