Re: invalidly encoded strings - Mailing list pgsql-hackers

From	Andrew Dunstan
Subject	Re: invalidly encoded strings
Date	September 9, 2007 14:18:29
Msg-id	46E42ADF.7050007@dunslane.net
In response to	Re: invalidly encoded strings (Tom Lane <tgl@sss.pgh.pa.us>)
Responses	Re: invalidly encoded strings Re: invalidly encoded strings
List	pgsql-hackers

Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>   
>> Is that going to cover data coming in via COPY? and parameters for 
>> prepared statements?
>>     
>
> Those should be checked already --- if not, the right fix is still to
> fix it there, not in per-datatype code.  I think we are OK though,
> eg see "need_transcoding" logic in copy.c.
>   

Well, a little experimentation shows that we currently are not OK:

in foo.data:
\366\66

utf8test=# \copy xx from foo.data
utf8test=# select encode(t::bytea,'hex') from xx;
 encode
--------
 f636
(1 row)

utf8test=# \copy xx to bb.data
utf8test=# \copy xx from bb.data
ERROR:  invalid byte sequence for encoding "UTF8": 0xf636
HINT:  This error can also happen if the byte sequence does not match 
the encoding expected by the server, which is controlled by 
"client_encoding".
CONTEXT:  COPY xx, line 1
utf8test=#
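
For anyone who wants to check the bytes outside the server, here is a minimal Python sketch. It is not PostgreSQL code, and the helper names decode_copy_octal and is_valid_utf8 are made up for illustration. It expands the backslash-octal escapes from foo.data (COPY's text format reads a backslash followed by one to three octal digits as a single byte) and then tests the result against UTF-8. It assumes the input contains only ASCII and octal escapes, and ignores the other COPY escapes such as \n or \\.

```python
import re


def decode_copy_octal(text: str) -> bytes:
    """Expand backslash-octal escapes (1 to 3 digits) into raw bytes.

    Hypothetical helper: handles only octal escapes, not \\n, \\t, \\\\ etc.
    """
    out = bytearray()
    pos = 0
    for m in re.finditer(r"\\([0-7]{1,3})", text):
        # Copy the literal ASCII text that precedes this escape.
        out += text[pos:m.start()].encode("ascii")
        # Convert the octal digits to one byte value.
        out.append(int(m.group(1), 8) & 0xFF)
        pos = m.end()
    out += text[pos:].encode("ascii")
    return bytes(out)


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


raw = decode_copy_octal(r"\366\66")   # the single line in foo.data
print(raw.hex())                      # f636
print(is_valid_utf8(raw))             # False
```

The result matches the hex dump above: 0xf6 can never start a UTF-8 sequence (valid lead bytes stop at 0xf4), and 0x36 is not a continuation byte, so the pair fails validation either way.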

BTW, all the foo_recv functions that call pq_getmsgtext or 
pq_getmsgstring are thereby calling pg_verify_mbstr already (via 
pg_client_to_server). So I am still not 100% convinced that doing the 
same directly in the corresponding foo_in functions is such a bad idea.

cheers

andrew
