Re: JSON for PG 9.2 - Mailing list pgsql-hackers

From Joey Adams
Subject Re: JSON for PG 9.2
Date
Msg-id CAARyMpAW9N+6_kcob-R=DMP3HyvXXduA3kAecsocLPNZq04CyQ@mail.gmail.com
In response to Re: JSON for PG 9.2  (Abhijit Menon-Sen <ams@toroid.org>)
Responses Re: JSON for PG 9.2  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Tue, Jan 31, 2012 at 1:29 PM, Abhijit Menon-Sen <ams@toroid.org> wrote:
> At 2012-01-31 12:04:31 -0500, robertmhaas@gmail.com wrote:
>>
>> That fails to answer the question of what we ought to do if we get an
>> invalid sequence there.
>
> I think it's best to categorically reject invalid surrogates as early as
> possible, considering the number of bugs that are related to them (not
> in Postgres, just in general). I can't see anything good coming from
> letting them in and leaving them to surprise someone in future.
>
> -- ams

+1
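
To make "reject invalid surrogates early" concrete, here is a rough sketch
in Python of the kind of check a parser could run over the escapes in a
JSON string. It is only an illustration: the function name and the regex
are mine, not anything from the PostgreSQL tree, and it assumes the input
is the text between the quotes of a string literal that is otherwise well
formed (malformed escapes such as \u12 are out of scope).

```python
import re

# Hypothetical helper, not PostgreSQL code. Matches one backslash escape at
# a time; group 1 is set only for a well-formed \uXXXX escape.
_ESCAPE = re.compile(r'\\(?:u([0-9a-fA-F]{4})|.)', re.DOTALL)

def surrogates_are_valid(body: str) -> bool:
    """Return True if every \\uD800..\\uDBFF escape is immediately followed
    by a \\uDC00..\\uDFFF escape, and no low surrogate appears on its own."""
    pending_high = False   # True right after a high surrogate escape
    pos = 0                # end offset of the previous escape
    for m in _ESCAPE.finditer(body):
        # Ordinary characters between the two halves break the pair.
        if pending_high and m.start() != pos:
            return False
        pos = m.end()
        if m.group(1) is None:             # an escape like \n or \\
            if pending_high:
                return False
            continue
        code = int(m.group(1), 16)
        if 0xD800 <= code <= 0xDBFF:       # high (leading) surrogate
            if pending_high:
                return False
            pending_high = True
        elif 0xDC00 <= code <= 0xDFFF:     # low (trailing) surrogate
            if not pending_high:
                return False
            pending_high = False
        elif pending_high:                 # non-surrogate after a high half
            return False
    return not pending_high                # a dangling high half is invalid

print(surrogates_are_valid(r'\ud83d\ude00'))   # True
print(surrogates_are_valid(r'\ud83d'))         # False
print(surrogates_are_valid(r'\ude00'))         # False
```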

Another sequence to beware of is \u0000.  While escaped NUL characters
are perfectly valid in JSON, NUL characters aren't allowed in TEXT
values.  This means not all JSON strings can be converted to TEXT,
even in UTF-8.  This may also complicate collation, if comparison
functions demand null-terminated strings.
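
A small Python illustration of the problem (not PostgreSQL code; the
"C string" view is simulated by cutting the UTF-8 bytes at the first zero
byte, which is what a routine expecting a null-terminated string would
effectively see):

```python
import json

doc = '"abc\\u0000def"'          # valid JSON text: a string with an escaped NUL
value = json.loads(doc)          # Python strings may contain U+0000
utf8 = value.encode("utf-8")     # U+0000 encodes as a single zero byte

# Simulated null-terminated view: everything from the first zero byte
# onward is silently lost.
as_c_string = utf8.split(b"\x00", 1)[0]

print(len(value))        # 7
print(utf8)              # b'abc\x00def'
print(as_c_string)       # b'abc'
```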

I'm mostly in favor of allowing \u0000.  Banning \u0000 means users
can't use JSON strings to marshal binary blobs, e.g. by escaping
non-printable characters and only using U+0000..U+00FF.  Instead, they
have to use base64 or similar.
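
As a sketch of what I mean by marshalling blobs this way, here is a
Python version. The helper names are made up for illustration. It relies
on Latin-1 mapping byte value N to code point U+00NN one to one, and on
Python's json module escaping control and non-ASCII characters by default.

```python
import base64
import json

def blob_to_json_string(blob: bytes) -> str:
    # Each byte becomes the code point with the same value (U+0000..U+00FF).
    return json.dumps(blob.decode("latin-1"))

def json_string_to_blob(text: str) -> bytes:
    # Raises UnicodeEncodeError if the string holds code points above U+00FF.
    return json.loads(text).encode("latin-1")

blob = bytes([0x00, 0x01, 0x41, 0xFF])

encoded = blob_to_json_string(blob)
print(encoded)                                  # "\u0000\u0001A\u00ff"
print(json_string_to_blob(encoded) == blob)     # True

# The fallback if \u0000 is banned: base64 inside an ordinary JSON string.
b64 = json.dumps(base64.b64encode(blob).decode("ascii"))
print(base64.b64decode(json.loads(b64)) == blob)   # True
```

Note that the first byte of the blob only survives the first scheme if
\u0000 is accepted.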

Banning \u0000 doesn't quite violate the RFC:
   An implementation may set limits on the length and character
   contents of strings.

-Joey