Another encoding issue - Mailing list pgsql-hackers

From Gavin Sherry
Subject Another encoding issue
Date
Msg-id Pine.LNX.4.58.0512091006160.12341@linuxworld.com.au
List pgsql-hackers
Hi all,

Here's another interesting encoding issue. I cannot recall having seen it
on the lists.

---
[swm@laptop build7]$ bin/createdb -E LATIN1 test
CREATE DATABASE
[swm@laptop build7]$ cat break.sh
dat=`echo -en "\245\241"`

echo "create table test (d text);"
echo "insert into test values('$dat');"
[swm@laptop build7]$ sh break.sh | bin/psql test
CREATE TABLE
INSERT 0 1
[swm@laptop build7]$ bin/createdb -T test test2
CREATE DATABASE
[swm@laptop build7]$ bin/createdb -T test -E UTF-8 test2
CREATE DATABASE
[swm@laptop build7]$ bin/pg_dump -C test2 > test2.dmp
[swm@laptop build7]$ bin/dropdb test2
DROP DATABASE
[swm@laptop build7]$ bin/psql template1 -f test2.dmp
SET
SET
SET
CREATE DATABASE
ALTER DATABASE
You are now connected to database "test2".
[...]
CREATE TABLE
ALTER TABLE
psql:test2.dmp:345: ERROR:  invalid UTF-8 byte sequence detected near byte 0xa5
CONTEXT:  COPY test, line 1, column d: "  "
[...]
---

Until createdb() is a lot more sophisticated, it cannot translate
characters between encodings: as the dump above shows, the new database
gets the template's data as-is, labelled with whatever encoding was
requested. I don't think this is a huge issue though, as most people are
only going to be creating empty databases anyway. Still, it probably
requires documentation.
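
To illustrate why the restore fails, here is a minimal Python sketch. It
is not part of the session above and does not touch PostgreSQL at all; it
only looks at the two bytes that break.sh inserts. The helper name
is_valid_utf8 is made up for this example. The bytes are fine as LATIN1,
are not valid UTF-8 when merely relabelled, and become valid only if each
character is actually re-encoded.

```python
# The two bytes that break.sh inserts (octal \245\241 = 0xA5 0xA1).
raw = b"\xa5\xa1"

# In LATIN1 every byte maps to exactly one character, so this always works.
as_latin1 = raw.decode("latin-1")


def is_valid_utf8(data: bytes) -> bool:
    """Hypothetical helper: True if data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Relabelling without converting leaves the raw bytes untouched, and
# 0xA5 cannot start a UTF-8 sequence.
relabelled_ok = is_valid_utf8(raw)

# A real conversion would re-encode each character instead.
converted = as_latin1.encode("utf-8")

print("relabelled bytes valid as UTF-8:", relabelled_ok)
print("converted bytes:", converted.hex(" "))
print("converted bytes valid as UTF-8:", is_valid_utf8(converted))
```

The second case is what a UTF-8 database cloned from a LATIN1 template
ends up holding, which is why the COPY in the dump is rejected on reload.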

Thoughts?

Thanks,

Gavin

