7.3.2 incorrectly counts characters for unicode varchar field - Mailing list pgsql-bugs

From Matthew Cooper
Subject 7.3.2 incorrectly counts characters for unicode varchar field
Date
Msg-id 002601c37966$2b9fe4b0$6600030a@gateway01
Responses Re: 7.3.2 incorrectly counts characters for unicode varchar field  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-bugs
============================================================================
POSTGRESQL BUG REPORT TEMPLATE
============================================================================

Your name : Matthew Cooper
Your email address : matty (at) cloverworxs (dot) com

System Configuration
---------------------
Architecture (example: Intel Pentium) : Intel Pentium
Operating System (example: Linux 2.0.26 ELF) : Red Hat Linux 8.0 / 9.0
PostgreSQL version (example: PostgreSQL-7.2.2): PostgreSQL-7.2.2 / 7.3.2
Compiler used (example: gcc 2.95.2) : none

Please enter a FULL description of your problem:
------------------------------------------------
I have a database created with UNICODE encoding. It contains a table with a
varchar(10) column. Inserting 10 Western characters into that column works.
Inserting 10 Chinese characters fails on 7.3.2 with:
postgresql value too long for type character varying 10
On 7.2.2 the same insert works fine.

Please describe a way to repeat the problem. Please try to provide a
concise reproducible example, if at all possible:
----------------------------------------------------------------------
createdb -E UNICODE mydb
Then in psql...
create table mgc (c1 varchar(10));
insert into mgc values('0123456789');
This all works fine.
Next I put the following command into a UTF-8 encoded file (say my.sql);
the literal is 10 Chinese characters.
(I don't know whether this command will still be readable once emailed, so
you may have to re-create it by pasting 10 Chinese characters into your
favourite UTF-8 compatible editor.)
insert into mgc values ('分钟练习分钟练习练习');
I then run psql -f my.sql and get the error for 7.3.2 but it works for
7.2.2.
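
Because the literal may not survive email intact, my.sql can also be
generated rather than pasted. The following is only a sketch: it assumes the
ten intended characters are U+5206 U+949F U+7EC3 U+4E60 (twice) followed by
U+7EC3 U+4E60, and the helper name write_repro is made up for illustration.
The characters are spelled as escapes so the script itself is plain ASCII.

```python
from pathlib import Path

# Assumption: these escapes are the ten characters intended in the report.
LITERAL = "\u5206\u949f\u7ec3\u4e60" * 2 + "\u7ec3\u4e60"


def write_repro(path="my.sql"):
    """Write the insert statement to `path` as UTF-8 and return its text."""
    statement = f"insert into mgc values ('{LITERAL}');\n"
    Path(path).write_text(statement, encoding="utf-8")
    return statement


if __name__ == "__main__":
    write_repro()
```

The resulting file can then be fed to psql -f my.sql exactly as described
above.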

If you know how this problem might be fixed, list the solution below:
---------------------------------------------------------------------
I am guessing it is incorrectly counting the bytes and not the characters.
Presumably a workaround is to double the length of the field.
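
To illustrate the guess about bytes versus characters, here is a small
sketch that compares the two counts for the Western and the Chinese test
values. It only shows how UTF-8 encodes these strings, not what the server
actually does internally; the function name char_and_byte_lengths and the
assumed Chinese literal (the same ten characters, written as escapes) are
illustrative.

```python
def char_and_byte_lengths(text):
    """Return (number of characters, number of UTF-8 bytes) for text."""
    return len(text), len(text.encode("utf-8"))


western = "0123456789"
# Assumption: the ten Chinese characters from the test case, as escapes.
chinese = "\u5206\u949f\u7ec3\u4e60" * 2 + "\u7ec3\u4e60"

print(char_and_byte_lengths(western))  # (10, 10)
print(char_and_byte_lengths(chinese))  # (10, 30)
```

If the length check really is applied to bytes, note that each of these
characters takes 3 bytes in UTF-8, so the field would need to be tripled
rather than doubled for this test case to fit.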
