Re: jsonb, unicode escapes and escaped backslashes - Mailing list pgsql-hackers

From Tom Lane
Subject Re: jsonb, unicode escapes and escaped backslashes
Msg-id 3373.1422466618@sss.pgh.pa.us
In response to Re: jsonb, unicode escapes and escaped backslashes  (Andrew Dunstan <andrew@dunslane.net>)
List pgsql-hackers
Andrew Dunstan <andrew@dunslane.net> writes:
> It's not clear to me how we should represent a unicode null. i.e. given 
> a json of '["foo\u0000bar"]', I get that we'd store the element as 
> 'foo\x00bar', but what is the result of

>     (jsonb '["foo\u0000bar"]')->>0

> It's defined to be text so we can't just shove a binary null in the 
> middle of it. Do we throw an error?

Yes, that is what I was proposing upthread.  Obviously, this needs some
thought to ensure that there's *something* useful you can do with a field
containing a nul, but we'd have little choice but to throw an error if
the user asks us to convert such a field to unescaped text.

I'd be a bit inclined to reject nuls in object field names even if we
allow them in field values, since just about everything you can usefully
do with a field name involves regarding it as text.
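
To make the two policies above concrete, here is a rough Python sketch. It is not PostgreSQL code, and the names check_field_names, element_as_text and NulInTextError are invented for illustration. It assumes nuls are tolerated in stored values, rejected in object field names, and raise an error only when a value has to be handed back as unescaped text.

```python
import json


class NulInTextError(ValueError):
    """Hypothetical error: a value containing U+0000 was requested as text."""


def check_field_names(value):
    """Reject U+0000 in object field names; string values may still contain it."""
    if isinstance(value, dict):
        for key, child in value.items():
            if "\x00" in key:
                raise ValueError("field name contains \\u0000")
            check_field_names(child)
    elif isinstance(value, list):
        for child in value:
            check_field_names(child)


def element_as_text(array, index):
    """Loose analogue of ->> on an array: return the element as unescaped text."""
    element = array[index]
    if element is None:
        return None                      # a JSON null has no text form here
    if isinstance(element, str):
        if "\x00" in element:
            raise NulInTextError("cannot convert \\u0000 to text")
        return element
    return json.dumps(element)           # numbers, booleans, containers


doc = json.loads('["foo\\u0000bar", "plain"]')
check_field_names(doc)                   # passes: the nul is in a value, not a name
plain = element_as_text(doc, 1)          # fine, no nul involved
```

Asking for element 0 of doc raises NulInTextError, while the value can still sit in the stored document untouched.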

Another interesting implementation problem is what indexing does with
such values --- ISTR there's an implicit conversion to C strings in there
too, at least in GIN indexes.
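
The C-string hazard can be shown in a few lines of Python using ctypes. This is only an illustration of NUL-terminated string semantics, not a claim about what the GIN code actually does; the variable names are made up.

```python
import ctypes

stored = b"foo\x00bar"                      # 7 bytes, as the element might be stored
buf = ctypes.create_string_buffer(stored)   # C char array; a trailing NUL is appended
c_view = buf.value                          # what NUL-terminated C string handling sees
full = buf.raw[:len(stored)]                # the bytes that are really in the buffer
```

Anything that treats the value as a C string stops at the first NUL, so "foo\x00bar" and "foo" become indistinguishable on that path.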

Anyway, there is a significant amount of work involved here, and there's
no way we're getting it done for 9.4.1, or probably 9.4.anything.  I think
our only realistic choice right now is to throw an error for \u0000 so that
we can preserve our options for doing something useful with it later.
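
As a sketch of what rejecting \u0000 at input time could look like, assuming the check is done on decoded strings rather than on the raw text (parse_rejecting_nul and contains_nul are hypothetical names, and the error wording is illustrative):

```python
import json


def contains_nul(value):
    """Return True if any string or field name in the decoded value holds U+0000."""
    if isinstance(value, str):
        return "\x00" in value
    if isinstance(value, list):
        return any(contains_nul(item) for item in value)
    if isinstance(value, dict):
        return any("\x00" in key or contains_nul(item)
                   for key, item in value.items())
    return False


def parse_rejecting_nul(text):
    """Parse JSON text, refusing any document that decodes to a nul character."""
    value = json.loads(text)
    if contains_nul(value):
        raise ValueError("unsupported Unicode escape sequence: \\u0000")
    return value


# An escaped backslash followed by "u0000" is six ordinary characters, not a nul.
literal = parse_rejecting_nul(r'["\\u0000"]')
```

Checking after decoding keeps the escaped-backslash case apart from a real nul escape: r'["\\u0000"]' is accepted, while r'["foo\u0000bar"]' raises ValueError.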
        regards, tom lane


