Thread: JSON data type status?

JSON data type status?

From
Bruce Momjian
Date:
What happened to our work to add a JSON data type for PG 9.1?

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +


Re: JSON data type status?

From
Itagaki Takahiro
Date:
On Fri, Jan 21, 2011 at 09:11, Bruce Momjian <bruce@momjian.us> wrote:
> What happened to our work to add a JSON data type for PG 9.1?

Nothing will happen in 9.1.
I assume we are in "competition" status:

http://archives.postgresql.org/pgsql-hackers/2010-11/msg00481.php

Also, if PGXN will work well, we might not have to include JSON
in the core. We can download any JSON implementations from the
site after installing the core server.  Of course, if we will
use JSON types in the core (EXPLAIN JSON output?), we have to
include one of them.

-- 
Itagaki Takahiro


Re: JSON data type status?

From
Bruce Momjian
Date:
Itagaki Takahiro wrote:
> On Fri, Jan 21, 2011 at 09:11, Bruce Momjian <bruce@momjian.us> wrote:
> > What happened to our work to add a JSON data type for PG 9.1?
> 
> Nothing will happen in 9.1.
> I assume we are in "competition" status:
> 
> http://archives.postgresql.org/pgsql-hackers/2010-11/msg00481.php
> 
> Also, if PGXN will work well, we might not have to include JSON
> in the core. We can download any JSON implementations from the
> site after installing the core server.  Of course, if we will
> use JSON types in the core (EXPLAIN JSON output?), we have to
> include one of them.

Thank you.  I have added that URL to our TODO list.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +


Re: JSON data type status?

From
Joseph Adams
Date:
On Thu, Jan 20, 2011 at 8:13 PM, Itagaki Takahiro
<itagaki.takahiro@gmail.com> wrote:
> On Fri, Jan 21, 2011 at 09:11, Bruce Momjian <bruce@momjian.us> wrote:
>> What happened to our work to add a JSON data type for PG 9.1?
>
> Nothing will happen in 9.1.
> I assume we are in "competition" status:
>
> http://archives.postgresql.org/pgsql-hackers/2010-11/msg00481.php
>
> Also, if PGXN will work well, we might not have to include JSON
> in the core. We can download any JSON implementations from the
> site after installing the core server.  Of course, if we will
> use JSON types in the core (EXPLAIN JSON output?), we have to
> include one of them.

Thanks for the update.  However, I should probably take some blame as
well for spending basically zero time on the JSON datatype for lengths
at a time.

After someone emailed me asking how to install the JSON datatype, I
fixed up my module version of it:

http://git.postgresql.org/gitweb?p=json-datatype.git

I also have a version where JSON is a core datatype (the patches I
submitted to CommitFest), but it is obsolete now.  The module version
has all the same features, but also fixes a Unicode bug* and adds
PostgreSQL 8.4 compatibility.  For those who want the JSON datatype
right now, I recommend using the module.

I plan to maintain and improve the JSON module (repository linked
above), albeit at a snail's pace.  Patches are certainly welcome.  See
roadmap.markdown in the repository for some minor and not-so-minor
design decisions.


Joey Adams

* P.S.

The "Unicode bug" involved UTF-16 surrogate pair calculation.  JSON
encodes Unicode characters that aren't in the Basic Multilingual Plane
the same way UTF-16 does: using two 4-digit hex codes.  The correct
formula for determining the Unicode codepoint of a surrogate pair is:
   unicode = 0x10000 + (((uc & 0x3FF) << 10) | (lc & 0x3FF));

where:
   0xD800 <= uc <= 0xDBFF   0xDC00 <= lc <= 0xDFFF

For example, consider the JSON string "\uD835\uDD0D":
   uc      = 0xD835   lc      = 0xDD0D   unicode = 0x1D50D

This pair of hex indices collapses into one Unicode codepoint, 0x1D50D
(or "𝔍").

I originally used this:
   /* WRONG */   unicode = 0x10000 | ((uc & 0x3FF) << 10) | (lc & 0x3FF);

The problem is, ((uc & 0x3FF) << 10) | (lc & 0x3FF) has a range of 0
to 0xFFFFF, and in some of those numbers, `0x10000 |` adds zero.

I am sharing all this because I have made this mistake twice, and I
believe it is an easy mistake to make.