Re: Initial Review: JSON contrib modul was: Re: Another swing at JSON - Mailing list pgsql-hackers

From Joey Adams
Subject Re: Initial Review: JSON contrib modul was: Re: Another swing at JSON
Date
Msg-id CAARyMpDVTkCH2m7mhjXwrrwpEY1EVX=80FT-+NKp8Yvf_Zzayg@mail.gmail.com
In response to Re: Initial Review: JSON contrib modul was: Re: Another swing at JSON  (Joey Adams <joeyadams3.14159@gmail.com>)
List pgsql-hackers
Also, should I forbid the escape \u0000 (in all database encodings)?

Pros:
* If \u0000 is forbidden, and the server encoding is UTF-8, then
every JSON-wrapped string will be convertible to TEXT.
* It will be consistent with the way PostgreSQL already handles text,
and with the decision to use database-encoded JSON strings.
* Some applications choke on strings with null characters.  For
example, in some web browsers but not others, if you pass
"Hello\u0000world" to document.write() or assign it to a DOM object's
innerHTML, it will be truncated to "Hello".  By banning \u0000, users
can catch such rogue strings early.
* It's a little easier to represent internally.
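
To make the proposed check concrete, here is a minimal sketch in
Python. It is not the contrib module's C code. The names contains_nul
and validate_json_no_nul are hypothetical. It assumes the rule is
"reject any JSON string, key or value, that decodes to contain U+0000".
Checking the decoded value rather than the raw text keeps an escaped
backslash followed by "u0000" from being mistaken for a null character.

```python
import json


def contains_nul(value):
    """Return True if any string (object key or value) in a parsed
    JSON value contains the character U+0000."""
    if isinstance(value, str):
        return "\x00" in value
    if isinstance(value, list):
        return any(contains_nul(item) for item in value)
    if isinstance(value, dict):
        return any(contains_nul(k) or contains_nul(v)
                   for k, v in value.items())
    # Numbers, booleans, and null cannot carry a null character.
    return False


def validate_json_no_nul(text):
    """Parse JSON text and raise ValueError if any string in it
    decodes to contain U+0000 (hypothetical policy, for illustration)."""
    value = json.loads(text)
    if contains_nul(value):
        raise ValueError("JSON string contains \\u0000")
    return value
```

With this sketch, the text "Hello\u0000world" is rejected, while
"Hello\\u0000world" (a literal backslash followed by u0000) is accepted.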

Cons:
* It means the JSON type will accept only a subset of the JSON
described in RFC 4627.  However, the RFC does say "An implementation
may set limits on the length and character contents of strings", so we
can arguably get away with banning \u0000 while being law-abiding
citizens.
* Being able to store U+0000–U+00FF lets users hold binary data in
JSON strings by treating each byte as a Latin-1 character.  Banning
\u0000 would rule that out for any data containing a zero byte.
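
The Latin-1 trick can be sketched as follows, again in Python and
purely as an illustration. The function names bytes_to_json_string and
json_string_to_bytes are hypothetical. Each byte 0x00 to 0xFF maps to
the code point with the same number, so a zero byte becomes the escape
\u0000 in the JSON text.

```python
import json


def bytes_to_json_string(data: bytes) -> str:
    """Encode arbitrary bytes as a JSON string literal by treating
    each byte as the Latin-1 character with the same code point."""
    return json.dumps(data.decode("latin-1"))


def json_string_to_bytes(text: str) -> bytes:
    """Decode a JSON string literal back to bytes.  Raises
    UnicodeEncodeError if the string holds a code point above U+00FF."""
    return json.loads(text).encode("latin-1")
```

A round trip through these two functions preserves all 256 byte
values, but only if \u0000 is allowed in JSON strings.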

- Joey

