Re: ERROR: unsupported Unicode escape sequence - in JSON-type column - Mailing list pgsql-general

From Jan Bilek
Subject Re: ERROR: unsupported Unicode escape sequence - in JSON-type column
Date
Msg-id 63ae2d9c-fc24-1de8-63e2-10bb0b6c6e5a@eftlab.com.au
Whole thread Raw
In response to Re: ERROR: unsupported Unicode escape sequence - in JSON-type column  (Erik Wienhold <ewie@ewie.name>)
List pgsql-general
On 2/28/23 01:17, Erik Wienhold wrote:
>> On 27/02/2023 13:13 CET Laurenz Albe <laurenz.albe@cybertec.at> wrote:
>>
>> I'd be curious to know how the customer managed to do that.
>> Perhaps there is a loophole in PostgreSQL that needs to be fixed.
> Probably via some data access layer and not directly via Postgres.  It's easy
> to reproduce with psycopg:
>
>     import psycopg
>
>     with psycopg.connect() as con:
>         con.execute('create temp table jsontab (jsoncol json)')
>         con.execute(
>             'insert into jsontab (jsoncol) values (%s)',
>             [psycopg.types.json.Json('\0')],
>         )
>
>         with con.execute('select jsoncol from jsontab') as cur:
>             print(cur.fetchall())
>
>         try:
>             with con.execute('select jsoncol::jsonb from jsontab') as cur:
>                 pass
>             raise AssertionError("jsonb should fail")
>         except psycopg.errors.UntranslatableCharacter:
>             pass
>
> Another reason to prefer jsonb over json to reject such inputs right away.
> The documentation states that json does not validate inputs in constrast to
> jsonb.
>
> Of course the OP now has to deal with json.  The data can be sanitized by
> replacing all null character escape sequences:
>
>     update jsontab
>     set jsoncol = replace(jsoncol::text, '\u0000', '')::json
>     where strpos(jsoncol::text, '\u0000') > 0;
>
> But the data access layer (or whatever got the json into the database) must be
> fixed as well to reject or sanitize those inputs in the future.
>
> --
> Erik

Hi Erik,

No, it didn't go through any foreign data access layer - it went in 
straight through the Postgresql variable bind using pre-cached insert 
statement using PostgreSQL 14.5, connected over UNIX sockets.

Strange thing happened afterwards - that locating that record was on & 
off - I couldn't pin-point it in DB as it seemed to be failing on 
multiple places ... until using that trick from Laurenz. Felt like a 
PostgreSQL memory corruption, but system remained stable without any 
complaints.

Thanks & Cheers,
Jan

-- 
Jan Bilek - CTO at EFTlab Pty Ltd.


pgsql-general by date:

Previous
From: Jan Bilek
Date:
Subject: Re: ERROR: unsupported Unicode escape sequence - in JSON-type column
Next
From: Thorsten Glaser
Date:
Subject: Re: DISTINCT *and* ORDER BY in aggregate functions on expressions(!)