Re: Duplicate JSON Object Keys - Mailing list pgsql-hackers

From Andrew Dunstan
Subject Re: Duplicate JSON Object Keys
Date
Msg-id 513A566A.5090909@dunslane.net
In response to Re: Duplicate JSON Object Keys  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Responses Re: Duplicate JSON Object Keys  ("David E. Wheeler" <david@justatheory.com>)
List pgsql-hackers
On 03/08/2013 04:01 PM, Alvaro Herrera wrote:
> Hannu Krosing wrote:
>> On 03/08/2013 09:39 PM, Robert Haas wrote:
>>> On Thu, Mar 7, 2013 at 2:48 PM, David E. Wheeler <david@justatheory.com> wrote:
>>>> In the spirit of being liberal about what we accept but strict about
>>>> what we store, it seems to me that JSON object key uniqueness should
>>>> be enforced either by throwing an error on duplicate keys, or by
>>>> flattening so that the latest key wins (as happens in JavaScript). I
>>>> realize that tracking keys will slow parsing down, and potentially
>>>> make it more memory-intensive, but such is the price for correctness.
>>> I'm with Andrew.  That's a rathole I emphatically don't want to go
>>> down.  I wrote this code originally, and I had the thought clearly in
>>> mind that I wanted to accept JSON that was syntactically well-formed,
>>> not JSON that met certain semantic constraints.
>> If it does not meet these "semantic" constraints, then it is not
>> really JSON - it is merely JSON-like.
>>
>> This sounds very much like MySQL's decision to support timestamp
>> "0000-00-00 00:00" - syntactically correct, but semantically wrong.
> Is it wrong?  The standard cited says SHOULD, not MUST.


Here's what RFC 2119 says about that wording:

   4. SHOULD NOT   This phrase, or the phrase "NOT RECOMMENDED" mean that
   there may exist valid reasons in particular circumstances when the
   particular behavior is acceptable or even useful, but the full
   implications should be understood and the case carefully weighed
   before implementing any behavior described with this label.


So we're allowed to do as Robert chose, and I think there are good
reasons for doing so (apart from anything else, checking it would slow
down the parser enormously).

Now you could argue that in that case the extractor functions should
allow it too, and it's probably fairly easy to change them to allow it.
In that case we need to decide who wins. We could treat a later field
lexically as overriding an earlier field of the same name, which I think
is what David expected. That's what plv8 does (i.e. it's how v8
interprets JSON):
   andrew=# create or replace function jget(t json, fld text) returns
   text language plv8 as ' return t[fld]; ';
   CREATE FUNCTION
   andrew=# select jget('{"f1":"x","f1":"y"}','f1');
    jget
   ------
    y
   (1 row)
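
The "later field wins" rule can also be seen outside the database. The following is only an illustrative sketch using Python's standard json module, which is an assumption of this example and has nothing to do with the PostgreSQL parser or the extractor functions; the variable names are made up for the illustration.

```python
import json

# Same document as in the plv8 session above.
doc = '{"f1":"x","f1":"y"}'

# The default decoder builds a dict, so a lexically later duplicate
# key overwrites the earlier one.
last_wins = json.loads(doc)

# object_pairs_hook receives the (key, value) pairs in lexical order,
# duplicates included, which shows that both fields were parsed.
pairs = json.loads(doc, object_pairs_hook=lambda p: p)

print(last_wins["f1"])
print(pairs)
```

Both fields are accepted as syntactically valid; the overriding happens only when the pairs are collapsed into a mapping.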


Or you could take the view I originally took that in view of the RFC
wording we should raise an error if this was found.
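
For comparison, the "raise an error" policy could look roughly like the sketch below. It again assumes Python's standard json module purely for illustration; reject_duplicates and strict_loads are hypothetical names, not PostgreSQL functions.

```python
import json


def reject_duplicates(pairs):
    """Hypothetical hook: build one object, raising on a repeated key."""
    obj = {}
    for key, value in pairs:
        if key in obj:
            raise ValueError(f"duplicate JSON object key: {key!r}")
        obj[key] = value
    return obj


def strict_loads(text):
    """Parse JSON text, rejecting duplicate keys within any single object."""
    # The hook is called once per object, including nested ones.
    return json.loads(text, object_pairs_hook=reject_duplicates)


try:
    strict_loads('{"f1":"x","f1":"y"}')
except ValueError as exc:
    print(exc)
```

The per-object lookup in the hook is the key tracking mentioned earlier in the thread: every object needs its own record of the names seen so far, which is extra work and memory during parsing.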

I can live with either view.

cheers

andrew


