Re: BUG #3697: utf8 issue: can not reimport a table that was successfully exported. - Mailing list pgsql-bugs

From Tom Lane
Subject Re: BUG #3697: utf8 issue: can not reimport a table that was successfully exported.
Date
Msg-id 17114.1193328451@sss.pgh.pa.us
In response to BUG #3697: utf8 issue: can not reimport a table that was successfully exported.  ("Marc Mamin" <m.mamin@intershop.de>)
Responses Re: BUG #3697: utf8 issue: can not reimport a table that was successfully exported.
List pgsql-bugs
"Marc Mamin" <m.mamin@intershop.de> writes:
> I didn't check if all characters are valid UTF8...

They aren't ...

> select f_utf8_test('(Mozilla/4.0 (compatible; MSIE 6.0; Wind
> \xE0\xF0\xF1\xF2\xE2\xE5\xED\xED\xFB\xE9 \xE2\xFB\xF1\xF8\9
> \xE3\xEE\xF1\xF3\xE4
> xE4\xE6 \xCD\xC1 \xD0\xC1")');

In 8.3 that will throw an error:

utf8=# select f_utf8_test('(Mozilla/4.0 (compatible; MSIE 6.0; Wind
utf8'# \xE0\xF0\xF1\xF2\xE2\xE5\xED\xED\xFB\xE9 \xE2\xFB\xF1\xF8\9
utf8'# \xE3\xEE\xF1\xF3\xE4
utf8'# xE4\xE6 \xCD\xC1 \xD0\xC1")');
WARNING:  nonstandard use of escape in a string literal
LINE 1: select f_utf8_test('(Mozilla/4.0 (compatible; MSIE 6.0; Wind
                           ^
HINT:  Use the escape string syntax for escapes, e.g., E'\r\n'.
ERROR:  invalid byte sequence for encoding "UTF8": 0xe0f0f1
HINT:  This error can also happen if the byte sequence does not match the encoding expected by the server, which is controlled by "client_encoding".
utf8=#

However, since this behavior isn't backwards-compatible, there's not
much appetite for back-patching it.
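
To illustrate why the server rejects 0xe0f0f1, here is a minimal Python sketch, not PostgreSQL's own validation code. It assumes the first three escapes in the literal (\xE0\xF0\xF1) end up as those three raw bytes; the helper names are hypothetical. In UTF-8, 0xE0 opens a three-byte sequence, so each of the next two bytes must have the bit pattern 10xxxxxx, and 0xF0 does not.

```python
# Raw bytes produced by the first three escapes in the quoted literal.
data = bytes([0xE0, 0xF0, 0xF1])


def is_continuation(byte: int) -> bool:
    """UTF-8 continuation bytes have the bit pattern 10xxxxxx."""
    return (byte & 0xC0) == 0x80


def is_valid_utf8(raw: bytes) -> bool:
    """Return True if raw decodes cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# 0xE0 announces a three-byte character, but neither following byte
# is a continuation byte, so the sequence as a whole is invalid.
lead_ok = (data[0] & 0xF0) == 0xE0
tail_ok = all(is_continuation(b) for b in data[1:])
sequence_ok = is_valid_utf8(data)
```

Here lead_ok is true while tail_ok and sequence_ok are both false, which matches the error reported above.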

I don't think this is a security issue --- if you don't quote
backslashes in untrusted input you'll have problems far worse
than this one.
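
As a rough sketch of what "quoting backslashes" means here: when the server treats backslashes in ordinary string literals as escapes (as in the session above, which warns about nonstandard escape use), untrusted text needs its backslashes and single quotes doubled before it is placed in a literal. The helper name quote_backslashes below is hypothetical, and in practice a driver's parameter binding is usually the safer route.

```python
def quote_backslashes(text: str) -> str:
    """Double backslashes and single quotes so the text is taken literally.

    Assumes the server interprets backslash escapes in ordinary
    string literals; this is an illustration, not a complete quoting
    routine.
    """
    return text.replace("\\", "\\\\").replace("'", "''")


# Untrusted input containing backslash sequences that would otherwise
# be interpreted as byte escapes by the server.
untrusted = r"Wind \xE0\xF0\xF1"
literal = "'" + quote_backslashes(untrusted) + "'"
```

With the backslashes doubled, the server stores the characters backslash, x, E, 0 and so on as plain text instead of turning them into raw bytes that may not be valid UTF-8.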

            regards, tom lane
