Re: jsonb, unicode escapes and escaped backslashes - Mailing list pgsql-hackers

From Merlin Moncure
Subject Re: jsonb, unicode escapes and escaped backslashes
Date
Msg-id CAHyXU0wmpKrP=uH_FTcky=JvPFHuYmiq+38JqnNRZPSUfQGGMg@mail.gmail.com
In response to Re: jsonb, unicode escapes and escaped backslashes  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: jsonb, unicode escapes and escaped backslashes
List pgsql-hackers
On Tue, Jan 27, 2015 at 12:40 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>> On 01/27/2015 12:23 PM, Tom Lane wrote:
>>> I think coding anything is premature until we decide how we're going to
>>> deal with the fundamental ambiguity.
>
>> The input \\uabcd will be stored correctly as \uabcd, but this will in
>> turn be rendered as \uabcd, whereas it should be rendered as \\uabcd.
>> That's what the patch fixes.
>> There are two problems here and this addresses one of them. The other
>> problem is the ambiguity regarding \\u0000 and \u0000.
>
> It's the same problem really, and until we have an answer about
> what to do with \u0000, I think any patch is half-baked and possibly
> counterproductive.
>
> In particular, I would like to suggest that the current representation of
> \u0000 is fundamentally broken and that we have to change it, not try to
> band-aid around it.  This will mean an on-disk incompatibility for jsonb
> data containing U+0000, but hopefully there is very little of that out
> there yet.  If we can get a fix into 9.4.1, I think it's reasonable to
> consider such solutions.
>
> The most obvious way to store such data unambiguously is to just go ahead
> and store U+0000 as a NUL byte (\000).  The only problem with that is that
> then such a string cannot be considered to be a valid value of type TEXT,
> which would mean that we'd need to throw an error if we were asked to
> convert a JSON field containing such a character to text.
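
For anyone following along, here is a minimal sketch of the ambiguity being discussed. It is plain Python, not PostgreSQL code: store() is a hypothetical stand-in for the jsonb input path, and it assumes the current behaviour is to keep U+0000 as the six characters \u0000 because a NUL cannot live in a text value.

```python
import json

# Six characters: backslash, 'u', '0', '0', '0', '0'.
NUL_PLACEHOLDER = "\\u0000"


def store(json_literal: str) -> str:
    """Hypothetical model of the storage step (an assumption, not jsonb source).

    Decodes a JSON string literal, then replaces a real U+0000 with the
    six-character placeholder text.
    """
    decoded = json.loads(json_literal)
    return decoded.replace("\x00", NUL_PLACEHOLDER)


# JSON escape for the code point U+0000.
stored_nul = store(r'"\u0000"')

# JSON escaped backslash followed by the ordinary characters u0000.
stored_backslash = store(r'"\\u0000"')

# Both inputs end up as the same stored text, so the output side
# cannot tell which one it was given.
print(stored_nul == stored_backslash)
```

Under that assumption the two inputs collapse to one stored value, which is why no rendering rule alone can round-trip both of them.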

Hm, does this include text output operations for display purposes? If
so, then any query selecting jsonb values that contain NUL bytes would
fail. Why do we have to error out? How about a warning indicating that
the string was truncated?
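
To make the two behaviours concrete, here is a rough sketch in Python rather than backend C. The function names and the exception class are made up for illustration; nothing here is an existing PostgreSQL API.

```python
import warnings


class NulInTextError(ValueError):
    """Hypothetical error for a value that cannot be represented as text."""


def to_text_strict(value: str) -> str:
    """Error-out behaviour: refuse any string that contains U+0000."""
    if "\x00" in value:
        raise NulInTextError("U+0000 cannot be converted to text")
    return value


def to_text_truncating(value: str) -> str:
    """Warning behaviour: cut the string at the first U+0000 and warn."""
    head, sep, _tail = value.partition("\x00")
    if sep:
        # sep is non-empty only when a NUL was actually found.
        warnings.warn("string truncated at U+0000", stacklevel=2)
    return head
```

With the strict variant a single stored NUL makes the whole conversion fail; with the truncating variant the caller gets the prefix before the NUL plus a warning, at the cost of silently losing the rest of the string.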

merlin


