Re: Bug in UTF8-Validation Code? - Mailing list pgsql-hackers

From Andrew Dunstan
Subject Re: Bug in UTF8-Validation Code?
Date
Msg-id 45F6B167.8070401@dunslane.net
In response to Re: Bug in UTF8-Validation Code?  (Mario Weilguni <mweilguni@sime.com>)
Responses Re: Bug in UTF8-Validation Code?  (Mario Weilguni <mweilguni@sime.com>)
List pgsql-hackers
Mario Weilguni wrote:
> On Tuesday, 13 March 2007 14:46, Albe Laurenz wrote:
>   
>> Mario Weilguni wrote:
>>     
>>> Steps to reproduce:
>>> create database testdb with encoding='UTF8';
>>> \c testdb
>>> create table test(x text);
>>> insert into test values ('\244'); ==> Is accepted, even though it is not valid UTF8.
>>>       
>> This is working as expected, see the remark in
>> http://www.postgresql.org/docs/current/static/sql-syntax-lexical.html#SQL-SYNTAX-STRINGS
>>
>> "It is your responsibility that the byte sequences you create
>>  are valid characters in the server character set encoding."
>>     
>
> In that case, pg_dump is doing the wrong thing here and should quote the
> output. IMO this cannot be described as working as expected when it makes
> database dumps worthless, with no warning at dump time.
>
> pg_dump should output \244 itself in that case.
>
>   

The sentence quoted from the docs is perhaps less than a model of 
clarity. I would take it to mean that no client-encoding -> 
server-encoding translation will take place. Does it really mean that 
the server will happily accept any escaped byte sequence, whether or not 
it is valid for the server encoding? If so that seems ... odd.

cheers

andrew
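
For reference, the byte written as '\244' in the reproduction steps above is
octal 244, i.e. 0xA4. The sketch below only illustrates why that single byte
is not well-formed UTF-8; it uses Python's strict decoder as a stand-in and is
not PostgreSQL's validation code. The helper name is_valid_utf8 is
hypothetical.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes as strict UTF-8, False otherwise."""
    try:
        data.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    return True


# The escaped byte from the example: octal 244 is 0xA4, which has the bit
# pattern 10100100. In UTF-8 that pattern marks a continuation byte, so it
# cannot start a character on its own.
escaped_byte = bytes([0o244])

# U+00A4 encoded properly in UTF-8 is the two-byte sequence C2 A4.
proper_encoding = "\u00a4".encode("utf-8")

print(escaped_byte.hex(), is_valid_utf8(escaped_byte))
print(proper_encoding.hex(), is_valid_utf8(proper_encoding))
```

A lone 0xA4 is a legal character in single-byte encodings such as LATIN1,
which is one way such a byte can end up in a literal sent to a UTF8 database.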

