Re: Duplicate JSON Object Keys - Mailing list pgsql-hackers

From Hannu Krosing
Subject Re: Duplicate JSON Object Keys
Msg-id 513A7636.8060608@krosing.net
In response to Re: Duplicate JSON Object Keys  (Andrew Dunstan <andrew@dunslane.net>)
Responses Re: Duplicate JSON Object Keys  (Andrew Dunstan <andrew@dunslane.net>)
List pgsql-hackers
On 03/08/2013 11:03 PM, Andrew Dunstan wrote:
>
> On 03/08/2013 04:42 PM, Andrew Dunstan wrote:
>>
>>>
>>> So my order of preference for the options would be:
>>>
>>> 1. Have the JSON type collapse objects so the last instance of a key 
>>> wins and is actually stored
>>>
>>> 2. Throw an error when a JSON type has duplicate keys
>>>
>>> 3. Have the accessors find the last instance of a key and return 
>>> that value
>>>
>>> 4. Let things remain as they are now
>>>
>>> On second thought, I don't like 4 at all. It means that the JSON type 
>>> thinks a value is valid while the accessor does not. They contradict 
>>> one another.
>>>
>>>
>>
>>
>> You can forget 1. We are not going to have the parser collapse 
>> anything. Either the JSON it gets is valid or it's not. But the 
>> parser isn't going to try to MAKE it valid.
>
>
> Actually, now that I think more about it, 3 is the best answer.
> Here's why: even the JSON generators can produce JSON with non-unique 
> field names:
Yes, especially if you consider those popular JSON generators, vim and strcat() :)

It is not a "serialisation" of some existing object, but it is something
that JavaScript can interpret as a valid subset of JavaScript which
produces a JavaScript Object when evaluated.
In this sense it is still way better than MySQL timestamp 0000-00-00 00:00

So the loose meaning of the JSON spec (without implementing the SHOULD
part) is "anything that can be read into JavaScript producing a JS
Object", and not "a serialisation of a JavaScript Object" as I initially
wanted to read it.
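For illustration, this is exactly how a JavaScript engine behaves today:
JSON.parse() happily accepts duplicate keys, and the last occurrence of a
key wins when the resulting Object is built (a small sketch, not anything
specific to our code):

```javascript
// JSON.parse accepts duplicate keys without complaint; when building
// the resulting object, the last occurrence of a key overwrites the
// earlier ones.
const parsed = JSON.parse('{"a":1,"a":3}');
console.log(parsed.a);            // 3 - the last value wins
console.log(Object.keys(parsed)); // only one "a" survives
```

So a duplicate-key document is readable "into JavaScript producing a JS
Object", even though it is not a faithful serialisation of any object.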

>
>    andrew=# select row_to_json(q) from (select x as a, y as a from
>    generate_series(1,2) x, generate_series(3,4) y) q;
>       row_to_json
>    ---------------
>      {"a":1,"a":3}
>      {"a":1,"a":4}
>      {"a":2,"a":3}
>      {"a":2,"a":4}
>
>
> So I think we have no option but to say, in terms of RFC 2119, that we 
> have carefully considered and decided not to comply with the RFC's 
> recommendation
The downside is that we have just shifted the burden of JS Object 
generation to the getter functions.

I suspect that 99.98% of the time we will get valid and unique JS Object 
serializations, or their equivalent, as input to json_in().

If we want the getter functions to handle the "loose JSON" to Object 
conversion side, assuming our stored JSON can contain non-unique keys, 
then they are bound to be slower, as they have to do these checks. They 
can't just grab the first matching one and return, or recurse on it.
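To sketch the cost difference (a hypothetical pair of helpers over the
raw key/value pairs, standing in for the token walk the real accessor
would do; not the actual json.c code): a first-match getter can stop at
the first hit, while a last-match getter must scan every pair in the
object before it can answer.

```javascript
// Raw key/value pairs of {"a":1,"a":3,"b":2}, as the parser sees them
// before any collapsing.
const pairs = [["a", 1], ["a", 3], ["b", 2]];

// First-match semantics: may return as soon as the key is found.
function getFirst(pairs, key) {
  for (const [k, v] of pairs)
    if (k === key) return v;
}

// Last-match semantics: must visit every pair, since a later duplicate
// can still override the value already seen.
function getLast(pairs, key) {
  let result;
  for (const [k, v] of pairs)
    if (k === key) result = v;
  return result;
}
```

Here getFirst(pairs, "a") yields 1 but getLast(pairs, "a") yields 3, 
and only the latter matches what JavaScript itself would produce.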

> (and we should note that in the docs).
definitely +1
>
> cheers
>
> andrew