Re: Duplicate JSON Object Keys - Mailing list pgsql-hackers

From David E. Wheeler
Subject Re: Duplicate JSON Object Keys
Msg-id 137A752C-5EB9-47CB-B989-4A41FE40CE44@justatheory.com
In response to Re: Duplicate JSON Object Keys  (Andrew Dunstan <andrew@dunslane.net>)
Responses Re: Duplicate JSON Object Keys  (Andrew Dunstan <andrew@dunslane.net>)
List pgsql-hackers
On Mar 8, 2013, at 1:21 PM, Andrew Dunstan <andrew@dunslane.net> wrote:

> Here's what rfc2119 says about that wording:
>
>   4. SHOULD NOT This phrase, or the phrase "NOT RECOMMENDED" mean that
>   there may exist valid reasons in particular circumstances when the
>   particular behavior is acceptable or even useful, but the full
>   implications should be understood and the case carefully weighed
>   before implementing any behavior described with this label.

I suspect this was allowed for the JavaScript behavior where multiple keys are allowed, but the last key in the list
wins.

> So we're allowed to do as Robert chose, and I think there are good reasons for doing so (apart from anything else, checking it would slow down the parser enormously).

Yes, but the implications are going to start biting us on the ass now.

> Now you could argue that in that case the extractor functions should allow it too, and it's probably fairly easy to change them to allow it. In that case we need to decide who wins. We could treat a later field lexically as overriding an earlier field of the same name, which I think is what David expected. That's what plv8 does (i.e. it's how v8 interprets JSON):
>
>   andrew=# create or replace function jget(t json, fld text) returns
>   text language plv8 as ' return t[fld]; ';
>   CREATE FUNCTION
>   andrew=# select jget('{"f1":"x","f1":"y"}','f1');
>     jget
>   ------
>     y
>   (1 row)
>
>
> Or you could take the view I originally took that in view of the RFC wording we should raise an error if this was found.
>
> I can live with either view.

I’m on the fence. On the one hand, I like the plv8 behavior, which is nice for a dynamic language. On the other hand, I
don't much care for it in my database, where I want data storage requirements to be quite strict. I hate the idea of
"0000-00-00" being allowed as a date, and am uncomfortable with allowing duplicate keys to be stored in the JSON data
type.

So my order of preference for the options would be:

1. Have the JSON type collapse objects so the last instance of a key wins and is actually stored

2. Throw an error when a JSON type has duplicate keys

3. Have the accessors find the last instance of a key and return that value

4. Let things remain as they are now

On second thought, I don't like 4 at all. It means that the JSON type thinks a value is valid while the accessor does
not. They contradict one another.
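
To make options 1 through 3 concrete, here is a rough JavaScript sketch of how each might behave. It is only an illustration under stated assumptions: the function names (checkNoDuplicateKeys, storeCollapsed, storeStrict, getLast) are hypothetical, none of this is PostgreSQL's parser or API, and it relies solely on the standard JSON.parse and JSON.stringify, where the last instance of a repeated key wins.

```javascript
// Hypothetical helper for option 2: throw if any single object repeats a key.
function checkNoDuplicateKeys(text) {
  JSON.parse(text); // validate first, so the scanner can assume well-formed input
  let i = 0;
  const ws = () => { while (/\s/.test(text[i] || '')) i++; };

  function readString() {
    const start = i++;                       // position of the opening quote
    while (text[i] !== '"') i += text[i] === '\\' ? 2 : 1;
    i++;                                     // step past the closing quote
    return JSON.parse(text.slice(start, i)); // decode escapes in the key
  }

  function skipObject() {
    const seen = new Set();                  // keys seen in this object only
    i++;                                     // skip '{'
    ws();
    if (text[i] === '}') { i++; return; }
    for (;;) {
      ws();
      const key = readString();
      if (seen.has(key)) throw new Error('duplicate key: ' + key);
      seen.add(key);
      ws();
      i++;                                   // skip ':'
      skipValue();
      ws();
      if (text[i++] === '}') return;         // otherwise it was ','
    }
  }

  function skipArray() {
    i++;                                     // skip '['
    ws();
    if (text[i] === ']') { i++; return; }
    for (;;) {
      skipValue();
      ws();
      if (text[i++] === ']') return;         // otherwise it was ','
    }
  }

  function skipValue() {
    ws();
    if (text[i] === '{') return skipObject();
    if (text[i] === '[') return skipArray();
    if (text[i] === '"') { readString(); return; }
    while (i < text.length && !/[,\]}\s]/.test(text[i])) i++; // number, true, false, null
  }

  skipValue();
}

// Option 1: collapse on input, so only the last instance of a key is stored.
function storeCollapsed(text) {
  return JSON.stringify(JSON.parse(text));
}

// Option 2: refuse to store a value that contains duplicate keys.
function storeStrict(text) {
  checkNoDuplicateKeys(text);
  return text;
}

// Option 3: store the text untouched; the accessor returns the last instance.
function getLast(text, key) {
  return JSON.parse(text)[key];
}

const doc = '{"f1":"x","f1":"y"}';
console.log(storeCollapsed(doc)); // {"f1":"y"}
console.log(getLast(doc, 'f1'));  // y
```

With the same input, storeStrict(doc) throws, which is the option 2 behavior. Note that the round trip in storeCollapsed is only a stand-in for "collapse on input": it also normalizes whitespace and can reorder integer-like keys, which a real implementation might not want.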

Best,

David

