On Sat, 13 Nov 2021 at 21:52, Andrew Dunstan <andrew@dunslane.net> wrote:
> On 11/13/21 00:40, Thomas Munro wrote:
>> On Sat, Nov 13, 2021 at 4:32 PM Japin Li <japinli@hotmail.com> wrote:
>>> When I try to insert the Unicode escape "\u0000", I get the error in $subject.
>>>
>>> postgres=# CREATE TABLE tbl (s varchar(10));
>>> CREATE TABLE
>>> postgres=# INSERT INTO tbl VALUES (E'\u0000');
>>> ERROR: invalid Unicode escape value at or near "\u0000"
>>> LINE 1: INSERT INTO tbl VALUES (E'\u0000');
>>>                                   ^
>>>
>>> "\u0000" is valid Unicode [1], so why can't we insert it?
>> Yes, it is a valid codepoint, but unfortunately PostgreSQL can't
>> support it because it sometimes deals in null-terminated strings, even
>> though internally it does track string data and length separately. We
>> have to do that to use libc facilities like strcoll_r(), and probably
>> many other things.
>>
>>
>
> And it's documented at
> <https://www.postgresql.org/docs/current/datatype-character.html>:
>
>
> The characters that can be stored in any of these data types are
> determined by the database character set, which is selected when the
> database is created. Regardless of the specific character set, the
> character with code zero (sometimes called NUL) cannot be stored.
>
>
Thanks, Thomas and Andrew. Sorry for not reading the documentation carefully.
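
For the archives, here is a minimal C sketch of the problem Thomas
describes, as I understand it. It is not PostgreSQL code: counted_str,
libc_visible_len, libc_compare, bytewise_equal and embedded_nul_demo are
hypothetical names made up for illustration, and plain strcoll() stands in
for whatever collation call is actually used. It only shows that a
length-tracked string with an embedded zero byte is silently cut short as
soon as it is handed to a libc function that expects NUL termination.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical length-tracked string; illustrative only, not a PostgreSQL type. */
typedef struct {
    const char *data;
    size_t      len;
} counted_str;

/*
 * Number of bytes a NUL-terminated libc API would see: everything up to
 * the first zero byte, or all len bytes if there is none.
 */
size_t libc_visible_len(counted_str s)
{
    const char *nul = memchr(s.data, '\0', s.len);

    return nul ? (size_t) (nul - s.data) : s.len;
}

/*
 * Compare the way a libc collation call does: it ignores len and stops at
 * the first zero byte. Assumes data has a terminating NUL somewhere, as
 * string literals do.
 */
int libc_compare(counted_str a, counted_str b)
{
    return strcoll(a.data, b.data);
}

/* Compare every tracked byte, regardless of embedded zero bytes. */
int bytewise_equal(counted_str a, counted_str b)
{
    return a.len == b.len && memcmp(a.data, b.data, a.len) == 0;
}

/* Two 5-byte values that differ only after an embedded zero byte. */
void embedded_nul_demo(void)
{
    counted_str a = { "ab\0cd", 5 };
    counted_str b = { "ab\0xy", 5 };

    assert(libc_visible_len(a) == 2);   /* libc sees only "ab" */
    assert(libc_compare(a, b) == 0);    /* so the values look equal */
    assert(!bytewise_equal(a, b));      /* although the bytes differ */
}
```

Calling embedded_nul_demo() from a main() passes all three assertions: the
two values are different byte strings, yet they compare as equal through
the NUL-terminated interface.
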
--
Regards,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.