Re: Character with byte sequence 0xa2 0xa3 in encoding "EUC_CN" has no equivalent in encoding "UTF8" - Mailing list pgsql-bugs

From Zhongpu Chen
Subject Re: Character with byte sequence 0xa2 0xa3 in encoding "EUC_CN" has no equivalent in encoding "UTF8"
Msg-id CA+1gyqJMtuTofZDy+CeomGGhsFGXw6JrdyAhqvnLii44oKePGg@mail.gmail.com
In response to Character with byte sequence 0xa2 0xa3 in encoding "EUC_CN" has no equivalent in encoding "UTF8"  (Zhongpu Chen <chenloveit@gmail.com>)
Responses Re: Character with byte sequence 0xa2 0xa3 in encoding "EUC_CN" has no equivalent in encoding "UTF8"
List pgsql-bugs

```
demo_euc_cn_db=# SET client_encoding TO 'EUC_CN';
SET
demo_euc_cn_db=# SELECT * FROM t WHERE id = 1;
 id | s
----+----
  1 | ��
(1 row)
```

Since 0xA2A3 is not a valid EUC-CN code point, it cannot be mapped to any meaningful character. Currently, the EUC_CN check accepts every 2-byte sequence within A1-EF, but this coarse-grained approach is flawed.
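
To illustrate the gap between a range check and an actual mapping, here is a minimal Python sketch. It is not PostgreSQL's code: the helper names are hypothetical, the A1-EF bounds are simply taken from the sentence above, and Python's `gb2312` codec is used as a stand-in for the EUC_CN to UTF8 conversion table (the two tables are assumed, not verified, to agree on this byte pair).

```python
# Bounds as stated in this message; an assumption, not read from PostgreSQL source.
LOW, HIGH = 0xA1, 0xEF


def passes_range_check(seq: bytes) -> bool:
    """Coarse check: accept any 2-byte sequence whose bytes both lie in [LOW, HIGH]."""
    return len(seq) == 2 and all(LOW <= b <= HIGH for b in seq)


def has_unicode_mapping(seq: bytes) -> bool:
    """Stricter check: does Python's gb2312 codec map these bytes to a character?"""
    try:
        seq.decode("gb2312")
        return True
    except UnicodeDecodeError:
        return False


if __name__ == "__main__":
    # 0xA2A3 is the sequence from the report; 0xD6D0 is a regular GB2312 character.
    for seq in (b"\xa2\xa3", b"\xd6\xd0"):
        print(seq.hex(), passes_range_check(seq), has_unicode_mapping(seq))
```

Under these assumptions, 0xA2A3 passes the range check but has no mapping, which matches the behavior in the report: the INSERT is accepted and the failure only appears when the value is converted for a UTF8 client.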

On Fri, May 1, 2026 at 11:07 PM Junwang Zhao <zhjwpku@gmail.com> wrote:
On Fri, May 1, 2026 at 9:59 PM Zhongpu Chen <chenloveit@gmail.com> wrote:
>
> ## Description
>
> The legacy encodings accept some invalid byte sequences on input, which then cause errors during SELECT operations.
>
> ## How to reproduce
>
> ```shell
> createdb -E EUC_CN -T template0 --locale=C demo_euc_cn_db
> ```
>
> ```sql
> demo_euc_cn_db=# CREATE TABLE t(id int, s varchar(10));
>
> demo_euc_cn_db=# INSERT INTO t VALUES(1, E'\xA2\xA3');
> INSERT 0 1
> demo_euc_cn_db=# SELECT * FROM t WHERE id = 1;
> ERROR:  character with byte sequence 0xa2 0xa3 in encoding "EUC_CN" has no equivalent in encoding "UTF8"

Can you try the following statement before select?
SET client_encoding TO 'EUC_CN';

> ```
>
> --
> Zhongpu Chen



--
Regards
Junwang Zhao


--
Zhongpu Chen
