Re: patch: Add JSON datatype to PostgreSQL (GSoC, WIP) - Mailing list pgsql-hackers

From Joseph Adams
Subject Re: patch: Add JSON datatype to PostgreSQL (GSoC, WIP)
Msg-id AANLkTikD2VTaZ1QvdOZQdD5iR-6fHTksvsi=g_6a8eS4@mail.gmail.com
In response to Re: patch: Add JSON datatype to PostgreSQL (GSoC, WIP)  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: patch: Add JSON datatype to PostgreSQL (GSoC, WIP)  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Sat, Sep 18, 2010 at 4:03 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> Hmm, yeah.  I'd be tempted to try to keep the user's original
> whitespace as far as possible, but disregard it as far as equality
> comparison goes.  However, I'm not quite sure what the right thing to
> do about 0 vs 0.0 is.  Does the JSON spec say anything about that?

I didn't find anything in the JSON spec about comparison, but in
JavaScript, 0 == 0.0 and 0 === 0.0 are both true.  Also, JavaScript
considers two arrays or objects equal if and only if they are
references to the same object, meaning [1,2,3] == [1,2,3] is false,
but after var a = [1,2,3]; var b = a; both a == b and a === b are
true.  Hence, JavaScript can help us decide how to compare scalars,
but for arrays and objects we're on our own (although JSON arrays
could be compared lexicographically, element by element, like
PostgreSQL arrays already are; I don't think anyone would disagree
with that).
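
To make the JavaScript behavior concrete, here is a small sketch.  The
first half only restates the comparisons described above.  The
compareArrays helper is hypothetical (it is not part of the patch) and
only illustrates element-by-element ordering for flat arrays of
numbers, with a shorter prefix sorting first; nested arrays, objects
and mixed types are left out.

```javascript
// Scalars: the numeric value decides, so 0 and 0.0 compare equal.
const zeroLoose = (0 == 0.0);
const zeroStrict = (0 === 0.0);

// Arrays and objects: == and === compare references, not contents.
const distinctArrays = ([1, 2, 3] == [1, 2, 3]);
const a = [1, 2, 3];
const b = a;
const sameReference = (a == b) && (a === b);

// Hypothetical helper: lexicographic comparison of two flat arrays of
// numbers.  Returns -1, 0 or 1.
function compareArrays(x, y) {
  const n = Math.min(x.length, y.length);
  for (let i = 0; i < n; i++) {
    if (x[i] < y[i]) return -1;
    if (x[i] > y[i]) return 1;
  }
  // All shared positions are equal: the shorter array sorts first.
  return Math.sign(x.length - y.length);
}

console.log(zeroLoose, zeroStrict, distinctArrays, sameReference);
console.log(compareArrays([1, 2, 3], [1, 2, 3]));
```

With a comparison like this, [1,2,3] and [1,2,3] would be equal as JSON
values even though JavaScript itself reports them as different objects.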

I cast my vote for 0 == 0.0 being true.

As for whitespace preservation, I don't think we should go out of our
way to keep it intact.  Sure, preserving formatting for input and
output makes some sense because we'd have to go out of our way to
normalize it, but preserving whitespace in JSONPath tree selections
(json_get) and updates (json_set) is a lot of work (that I shouldn't
have done), and it doesn't really help anybody.  Consider json_get on
a node under 5 levels of indentation: the result would carry
indentation that only made sense inside the original document.
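
A rough illustration of the difference, in JavaScript rather than in
the patch's C code.  Here normalize and getB are hypothetical stand-ins
(for a normalizing input/output step and for json_get with a fixed path
a.b), and rawSlice imitates what a whitespace-preserving selection
would hand back.  Note that round-tripping through JSON.parse also
normalizes numbers, which the actual type need not do.

```javascript
// Two spellings of the same JSON value; only the whitespace differs.
const pretty = '{\n  "a": {\n    "b": [1, 2,\n          3]\n  }\n}';
const compact = '{"a":{"b":[1,2,3]}}';

// Hypothetical stand-in for normalizing on input/output.
function normalize(text) {
  return JSON.stringify(JSON.parse(text));
}

// Hypothetical stand-in for json_get with the path a.b: select the
// node and serialize it fresh, ignoring the original formatting.
function getB(text) {
  return JSON.stringify(JSON.parse(text).a.b);
}

// What whitespace preservation would return instead: the raw source
// text of the node, including a line break and indentation that only
// made sense inside the enclosing document.
const rawSlice = pretty.slice(pretty.indexOf('['), pretty.indexOf(']') + 1);

console.log(normalize(pretty) === compact);
console.log(getB(pretty));
console.log(JSON.stringify(rawSlice));
```

Both results denote the same array; the raw slice just drags along
formatting from a context that no longer exists.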

Another thing to think about is the possibility of using a non-text
format in the future (such as a binary format or even a format that is
internally indexed).  A binary format would likely be faster to
compare (and hence faster to index).  If the JSON data type preserves
whitespace today, code that comes to rely on that may break when a
future format stops preserving it.  This should at least be documented.

