Re: additional json functionality - Mailing list pgsql-hackers

From	Merlin Moncure
Subject	Re: additional json functionality
Date	November 14, 2013 15:07:36
Msg-id	CAHyXU0yxSVoHtuBOessZY8aCSuafXrAieoPiY9wd6YKkFDCWNA@mail.gmail.com
In response to	Re: additional json functionality (Hannu Krosing <hannu@2ndQuadrant.com>)
Responses	Re: additional json functionality Re: additional json functionality
List	pgsql-hackers

On Wed, Nov 13, 2013 at 6:01 PM, Hannu Krosing <hannu@2ndquadrant.com> wrote:
> On 11/14/2013 12:09 AM, Merlin Moncure wrote:
>> On Wed, Nov 13, 2013 at 4:16 PM, Josh Berkus <josh@agliodbs.com> wrote:
>>> On 11/13/2013 06:45 AM, Merlin Moncure wrote:
>>>> I'm not so sure we should require hstore to do things like build
>>>> Also, json_object is pretty weird to me, I'm not sure I see the
>>>> advantage of a new serialization format, and I don't agree with the
>>>> statement "but it is the caller's responsibility to ensure that keys
>>>> are not repeated.".
>>> This is pretty standard in the programming languages I know of which use
>>> JSON.
>>>
>>>> I think the caller should have no such
>>>> responsibility.  Keys should be able to be repeated.
>>> Apparently your experience with using JSON in practice has been fairly
>>> different from mine; the projects I work on, the JSON is being
>>> constantly converted back and forth to hashes and dictionaries, which
>>> means that ordering is not preserved and keys have to be unique (or
>>> become unique within one conversion cycle).  I think, based on the
>>> language of the RFC and common practice, that it's completely valid for
>>> us to require unique keys within JSON-manipulation routines.
>> Common practice?  The internet is littered with complaints about
>> documents being spontaneously re-ordered and/or de-duplicated in
>> various stacks.  Other stacks provide mechanisms for explicit key
>> order handling (see here: http://docs.python.org/2/library/json.html).
>>   Why do you think they did that?
>>
>> I use pg/JSON all over the place.  In several cases I have to create
>> documents with ordered keys because the parser on the other side wants
>> them that way -- this is not a hypothetical argument.  The current
>> json serialization API handles that just fine and the hstore stuff
>> coming down the pike will not.
> I guess we should not replace current JSON type with hstore based
> one, but add something json-like based on nested hstore instead.
>
> Maybe call it jsdoc or jdoc or jsobj or somesuch.

This is exactly what needs to be done, full stop (how about: hstore).
It really comes down to this: changing the serialization behaviors
that have been in production for 2 releases (three if you count the
extension) is bad enough, but making impossible some legal json
constructions which are currently possible is an unacceptable
compatibility break.  It's going to break applications I've currently
put into production with no clear workaround.  This is quite frankly
not ok and I'm calling foul.  The RFC may claim that these
constructions are dubious but that's irrelevant.  It's up to the
parser to decide that and when serializing you are not in control of
the parser.
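
To make the point concrete, here is a minimal sketch using Python's
standard json module (the stack linked earlier in the thread).  The
sample document and variable names are made up for illustration; this
is not PostgreSQL code.  It shows a legal document with a repeated key
being collapsed by a hash-based decoder, while a decoder that keeps
the pairs sees everything the serializer emitted.

```python
import json

# A legal (if unusual) JSON text: ordered keys, one of them repeated.
doc = '{"b": 1, "a": 2, "b": 3}'

# Default decoding builds a dict, so the repeated key collapses
# to its last value.
as_dict = json.loads(doc)

# object_pairs_hook receives every (key, value) pair in document order,
# so a consumer that cares about order or repeats can keep them.
as_pairs = json.loads(doc, object_pairs_hook=list)

# Re-serializing the dict cannot reproduce the original text.
round_tripped = json.dumps(as_dict)

print(as_dict)        # {'b': 3, 'a': 2}
print(as_pairs)       # [('b', 1), ('a', 2), ('b', 3)]
print(round_tripped)  # {"b": 3, "a": 2}
```

Which of the two readings is "right" is decided on the receiving
side, which is why the serializer has to be able to emit both.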

Had the json type been stuffed into an extension, there would be a
clearer path to get to where you want to go since we could have walled
off the old functionality and introduced side by side API calls.  As
things stand now, I don't see a clean path to do that.

> I use pg/JSON all over the place.  In several cases I have to create
> documents with ordered keys because the parser on the other side wants
> them that way -- this is not a hypothetical argument.  The current
> json serialization API handles that just fine and the hstore stuff
> coming down the pike will not.  I guess that's a done deal based on
> 'performance'.  I'm clearly not the only one to have complained about
> this though.

On Wed, Nov 13, 2013 at 5:20 PM, Josh Berkus <josh@agliodbs.com> wrote:
> It's not just a matter of "performance".  It's the basic conflict of
> JSON as document format vs. JSON as data storage.  For the latter,
> unique, unordered keys are required, or certain functionality isn't
> remotely possible: indexing, in-place key update, transformations, etc.

That's not very convincing.  What *exactly* is impossible and why do
you think it justifies breaking compatibility with current
applications?   The way forward seems pretty straightforward: given
that hstore is getting nesting power and is moving closer to the json
way of doing things it is essentially 'binary mode json'.  I'm ok with
de-duplication and key ordering when moving into that structure since
it's opt in and doesn't break the serialization behaviors we have
today.  If you want to go further and unify the types then you have to
go through the design work to maintain compatibility.

Furthermore, I bet the performance argument isn't so clear cut either.
The current json type is probably faster at bulk serialization
precisely because you *don't* need to deduplicate and reorder keys: the
serialization operates without context.  The hstore-based structure will
certainly be much better for in place manipulations but it's not nearly
as simple as you are making it out to be.
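
A rough sketch of the difference, in Python and purely illustrative:
the function names are hypothetical, and the "last value wins, keys
sorted" policy is an assumption chosen for the example, not a
description of what hstore or the json type actually does internally.
The first serializer emits each pair as it arrives and needs no
knowledge of the other pairs; the second has to collect and normalize
the whole object before it can emit anything.

```python
import json


def serialize_streaming(pairs):
    """Emit pairs as they arrive: no lookups, order and repeats preserved."""
    body = ", ".join(f"{json.dumps(k)}: {json.dumps(v)}" for k, v in pairs)
    return "{" + body + "}"


def serialize_normalized(pairs):
    """Deduplicate (assumed: last value wins) and sort keys before emitting."""
    unique = {}
    for key, value in pairs:
        unique[key] = value  # needs the whole object in memory first
    return serialize_streaming(sorted(unique.items()))


pairs = [("b", 1), ("a", 2), ("b", 3)]

print(serialize_streaming(pairs))   # {"b": 1, "a": 2, "b": 3}
print(serialize_normalized(pairs))  # {"a": 2, "b": 3}
```

The sketch says nothing about actual timings; it only shows where the
extra bookkeeping comes from.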

merlin
