Mario Weilguni wrote:
>>> Steps to reproduce:
>>> create database testdb with encoding='UTF8';
>>> \c testdb
>>> create table test(x text);
>>> insert into test values ('\244'); ==> Is akzepted, even if not UTF8.
>>
>> This is working as expected, see the remark in
>>
>> http://www.postgresql.org/docs/current/static/sql-syntax-lexical.html#SQL-SYNTAX-STRINGS
>>
>> "It is your responsibility that the byte sequences you create
>> are valid characters in the server character set encoding."
>
> In that case, pg_dump is doing wrong here and should quote the output. IMO it
> cannot be defined as working as expected, when this makes any database dumps
> worthless, without any warnings at dump-time.
>
> pg_dump should output \244 itself in that case.
True. Here is a test case on 8.2.3
(OS, database and client all use UTF8):
test=> CREATE TABLE test(x text);
CREATE TABLE
test=> INSERT INTO test VALUES ('correct: ä');
INSERT 0 1
test=> INSERT INTO test VALUES (E'incorrect: \244');
INSERT 0 1
test=> \q
laurenz:~> pg_dump -d -t test -f test.sql
Here is an excerpt from 'od -c test.sql':
0001040 e n z \n - - \n \n I N S E R T I
0001060 N T O t e s t V A L U E S
0001100 ( ' c o r r e c t : 303 244 ' ) ;
0001120 \n I N S E R T I N T O t e s
0001140 t V A L U E S ( ' i n c o r
0001160 r e c t : 244 ' ) ; \n \n \n - - \n
The invalid character (octal 244) is in the INSERT statement!
This makes psql gag:
test=> DROP TABLE test;
DROP TABLE
test=> \i test.sql
SET
SET
SET
SET
SET
SET
SET
SET
CREATE TABLE
ALTER TABLE
INSERT 0 1
psql:test.sql:33: ERROR: invalid byte sequence for encoding "UTF8": 0xa4
HINT: This error can also happen if the byte sequence does not match the encoding expected by the server, which is
controlledby "client_encoding".
A fix could be either that the server checks escape sequences for validity
or that pg_dump outputs invalid bytes as escape sequences.
Or pg_dump could stop with an error.
I think that the cleanest way would be the first.
Yours,
Laurenz Albe