Re: Why don't I get a LATIN1 encoding here with SET ENCODING? - Mailing list pgsql-sql

From Craig Ringer
Subject Re: Why don't I get a LATIN1 encoding here with SET ENCODING?
Date
Msg-id 4AF0E70F.1030201@postnewspapers.com.au
In response to Why don't I get a LATIN1 encoding here with SET ENCODING?  (Bryce Nesbitt <bryce2@obviously.com>)
Responses Re: Why don't I get a LATIN1 encoding here with SET ENCODING?  (Bryce Nesbitt <bryce2@obviously.com>)
List pgsql-sql

Bryce Nesbitt wrote:
> I'm tracking another bug, but wanted to verify stuff on the command line.  I 
> can't figure out why this did not work:

> dblack3-deleteme=> insert into bryce1 values(1,2,'test\375');
> ERROR:  invalid byte sequence for encoding "UTF8": 0xfd

I'd say the server is interpreting your query text as Latin-1 and
converting it to the server encoding, UTF-8, as it should. The query
text is pure ASCII at that point, so the conversion leaves it unchanged:

 insert into bryce1 values(1,2,'test\375');

Only *then* does it process the escapes in that string. As test\xfd
(0x74 0x65 0x73 0x74 0xfd) isn't valid UTF-8, the server rejects it.
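
As a rough illustration of that last point (plain Python, nothing to do
with how the server is implemented), the byte sequence can be checked
directly. The helper name is_valid_utf8 is made up for this sketch.

```python
# The four ASCII bytes of "test" followed by the raw byte 0xfd (octal 375).
raw = b"test\xfd"


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(raw.hex(" "))           # 74 65 73 74 fd
print(is_valid_utf8(raw))     # False
print(raw.decode("latin-1"))  # testý
```

The same five bytes are a perfectly good Latin-1 string, which is why the
value looks harmless from the client side.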

If my understanding is right then the trouble is that the
client_encoding setting doesn't affect string escapes in SQL queries.
The conversion of the query text from client to server encoding is done
before string escapes are processed.
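
A toy model of that ordering, assuming the two steps really do happen in
this sequence. All function names here are invented for the sketch, and
the regex only handles three-digit octal escapes; it is not the server's
parser.

```python
import re


def convert_client_to_server(query: bytes, client_encoding: str) -> bytes:
    """Step 1 (toy): transcode the raw query text to UTF-8."""
    return query.decode(client_encoding).encode("utf-8")


def unescape_octal(query: bytes) -> bytes:
    """Step 2 (toy): turn each \\ooo octal escape into the byte it names."""
    return re.sub(
        rb"\\([0-3][0-7]{2})",
        lambda m: bytes([int(m.group(1), 8)]),
        query,
    )


def server_accepts(data: bytes) -> bool:
    """Toy validity check: is the result well-formed UTF-8?"""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# As typed, the escape is still four ASCII characters: \ 3 7 5.
typed = rb"insert into bryce1 values(1,2,'test\375');"

converted = convert_client_to_server(typed, "latin-1")
final = unescape_octal(converted)

print(converted == typed)     # True: ASCII is untouched by the conversion
print(b"test\xfd" in final)   # True: the escape became the raw byte 0xfd
print(server_accepts(final))  # False
```

Because the escape is expanded after the conversion, the byte 0xfd never
passes through the Latin-1 to UTF-8 step at all.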

In truth, that's how I'd expect it to happen. If I ask for the byte 0xfd
in a string, I don't want the server to decide that I must've meant
something else because I have a different client encoding. If I wanted
encoding conversion, I wouldn't have written it in escape form; I'd
have written 'ý', not '\375'.
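
For contrast, a small sketch of what happens to the literal character
when it is sent from a Latin-1 client. This is only the encoding
arithmetic in Python, under the assumption that the query text is
transcoded as a whole; the variable names are illustrative.

```python
# The query as a Latin-1 client would send it, with a literal ý in the string.
typed = "insert into bryce1 values(1,2,'testý');".encode("latin-1")

# Client-to-server conversion: Latin-1 bytes in, UTF-8 bytes out.
on_server = typed.decode("latin-1").encode("utf-8")

print(b"\xfd" in typed)          # True: ý is the single byte 0xfd in Latin-1
print(b"\xc3\xbd" in on_server)  # True: ý is the two bytes c3 bd in UTF-8
print(b"\xfd" in on_server)      # False

# The converted text is well-formed UTF-8 and round-trips to the same string.
print(on_server.decode("utf-8"))
```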

--
Craig Ringer

