Re: latin1 unicode conversion errors - Mailing list pgsql-hackers

From Bruce Momjian
Subject Re: latin1 unicode conversion errors
Date
Msg-id 200602122114.k1CLEl721577@candle.pha.pa.us
In response to latin1 unicode conversion errors  (Kris Jurka <books@ejurka.com>)
List pgsql-hackers
OK, yeah, it is inconsistent. I changed it to throw a warning instead.
Only patched to 8.2 because it is a behavior change.

---------------------------------------------------------------------------

Kris Jurka wrote:
>
> Why is latin1 special in its conversion from unconvertible unicode data?
> Other latin character sets add a warning, but latin1 errors out.
>
> jurka=# create database utf8 with encoding ='utf8';
> CREATE DATABASE
> jurka=# \c utf8
> You are now connected to database "utf8".
> utf8=# create table t(a text);
> CREATE TABLE
> utf8=# insert into t values ('\346\231\243');
> INSERT 0 1
> utf8=# set client_encoding = 'latin2';
> SET
> utf8=# select * from t;
> WARNING:  ignoring unconvertible UTF-8 character 0xe699a3
>   a
> ---
>
> (1 row)
>
> utf8=# set client_encoding = 'latin1';
> SET
> utf8=# select * from t;
> ERROR:  could not convert UTF8 character 0x00e6 to ISO8859-1
>
> Kris Jurka
>
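
For readers following the example above: the octal escapes '\346\231\243' are the three bytes 0xe6 0x99 0xa3, which form a single UTF-8 sequence. The latin2 warning prints the whole sequence (0xe699a3), while the latin1 error prints only the lead byte (0x00e6), because that message formats the single byte c with %04x. A minimal sketch of the decoding arithmetic follows. The names decode_utf8 and fits_latin1 are made up for illustration and are not PostgreSQL functions.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not part of PostgreSQL.
 * Decodes one UTF-8 sequence of 1 to 3 bytes from a NUL-terminated
 * buffer. Returns the code point and stores the sequence length in
 * *nbytes, or returns -1 (with *nbytes = 0) for anything else.
 */
static long decode_utf8(const unsigned char *s, int *nbytes)
{
    assert(s != NULL && nbytes != NULL);

    if (s[0] < 0x80)
    {
        /* 0xxxxxxx: plain ASCII */
        *nbytes = 1;
        return s[0];
    }
    if ((s[0] & 0xe0) == 0xc0 && (s[1] & 0xc0) == 0x80)
    {
        /* 110xxxxx 10xxxxxx */
        *nbytes = 2;
        return ((long) (s[0] & 0x1f) << 6) | (s[1] & 0x3f);
    }
    if ((s[0] & 0xf0) == 0xe0 &&
        (s[1] & 0xc0) == 0x80 && (s[2] & 0xc0) == 0x80)
    {
        /* 1110xxxx 10xxxxxx 10xxxxxx */
        *nbytes = 3;
        return ((long) (s[0] & 0x0f) << 12) |
               ((long) (s[1] & 0x3f) << 6) |
               (s[2] & 0x3f);
    }
    *nbytes = 0;
    return -1;
}

/* ISO 8859-1 covers exactly the code points U+0000 through U+00FF. */
static int fits_latin1(long cp)
{
    return cp >= 0 && cp <= 0xff;
}
```

Calling decode_utf8 on the three bytes from the INSERT gives code point 0x6663 with a length of 3, and fits_latin1 rejects it, so neither client encoding can represent the character.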

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
Index: src/backend/utils/mb/conversion_procs/utf8_and_iso8859_1/utf8_and_iso8859_1.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/utils/mb/conversion_procs/utf8_and_iso8859_1/utf8_and_iso8859_1.c,v
retrieving revision 1.13
diff -c -c -r1.13 utf8_and_iso8859_1.c
*** src/backend/utils/mb/conversion_procs/utf8_and_iso8859_1/utf8_and_iso8859_1.c    25 Dec 2005 02:14:18 -0000    1.13
--- src/backend/utils/mb/conversion_procs/utf8_and_iso8859_1/utf8_and_iso8859_1.c    12 Feb 2006 20:59:36 -0000
***************
*** 84,91 ****
              len -= 2;
          }
          else if ((c & 0xe0) == 0xe0)
!             elog(ERROR, "could not convert UTF8 character 0x%04x to ISO8859-1",
!                  c);
          else
          {
              *dest++ = c;
--- 84,93 ----
              len -= 2;
          }
          else if ((c & 0xe0) == 0xe0)
!             ereport(WARNING,
!                     (errcode(ERRCODE_UNTRANSLATABLE_CHARACTER),
!                      errmsg("ignoring unconvertible UTF-8 character 0x%04x",
!                             c)));
          else
          {
              *dest++ = c;
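
The behavior the patch aims for (warn about an unconvertible character and carry on instead of aborting) can be sketched outside the backend roughly as below. This is a standalone illustration under stated assumptions, not the code in utf8_and_iso8859_1.c: utf8_seq_len and utf8_to_latin1_lossy are hypothetical names, fprintf(stderr, ...) stands in for ereport(WARNING, ...), and the sketch skips the entire multibyte sequence, which the diff context alone does not confirm for the real loop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Length implied by a UTF-8 lead byte (1 to 4). A stray continuation
 * or invalid byte is reported as length 1 so the caller always advances.
 */
static size_t utf8_seq_len(unsigned char c)
{
    if ((c & 0x80) == 0x00) return 1;
    if ((c & 0xe0) == 0xc0) return 2;
    if ((c & 0xf0) == 0xe0) return 3;
    if ((c & 0xf8) == 0xf0) return 4;
    return 1;
}

/*
 * Hypothetical helper: convert len bytes of UTF-8 to Latin-1.
 * dest must have room for at least len bytes. Characters that have
 * no Latin-1 equivalent are reported on stderr and dropped.
 * Returns the number of bytes written; *skipped counts dropped characters.
 */
static size_t utf8_to_latin1_lossy(const unsigned char *src, size_t len,
                                   unsigned char *dest, size_t *skipped)
{
    size_t out = 0;

    assert(src != NULL && dest != NULL && skipped != NULL);
    *skipped = 0;

    while (len > 0)
    {
        unsigned char c = *src;
        size_t l = utf8_seq_len(c);
        size_t i;

        if (l > len)
            l = len;            /* truncated sequence at end of input */

        if (l == 1 && c < 0x80)
            dest[out++] = c;    /* ASCII passes through unchanged */
        else if (l == 2 && (c == 0xc2 || c == 0xc3) &&
                 (src[1] & 0xc0) == 0x80)
            /* U+0080..U+00FF: the only two-byte sequences Latin-1 holds */
            dest[out++] = (unsigned char) (((c & 0x03) << 6) | (src[1] & 0x3f));
        else
        {
            /* stand-in for ereport(WARNING, ...): print the whole sequence */
            fprintf(stderr, "WARNING:  ignoring unconvertible UTF-8 character 0x");
            for (i = 0; i < l; i++)
                fprintf(stderr, "%02x", src[i]);
            fputc('\n', stderr);
            (*skipped)++;
        }
        src += l;
        len -= l;
    }
    return out;
}
```

Fed the seven bytes "a\303\251\346\231\243b", this writes the three Latin-1 bytes 'a', 0xe9, 'b' and reports one skipped character. Printing every byte of the sequence rather than just the lead byte is a choice made here so the message matches the latin2 output quoted earlier in the thread.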
