Re: json accessors - Mailing list pgsql-hackers

From Merlin Moncure
Subject Re: json accessors
Msg-id CAHyXU0x3wNtcDZoX+Kb6UJ=fKtPO2=8DE4hFtZTtg2BJCjbkFQ@mail.gmail.com
In response to Re: json accessors  (Andrew Dunstan <andrew@dunslane.net>)
Responses Re: json accessors  (Hannu Krosing <hannu@2ndQuadrant.com>)
Re: json accessors  ("David E. Wheeler" <david@justatheory.com>)
List pgsql-hackers
On Wed, Nov 28, 2012 at 2:44 PM, Andrew Dunstan <andrew@dunslane.net> wrote:
> On 11/28/2012 02:08 PM, Merlin Moncure wrote:
>> *) ISTM your keytext operators are a reasonable replacement for a
>> hypothetical json_path.  That said, you're basically forcing json->sql
>> mapping through a highly iterative API, which I don't like. At the
>> very least, I think json_get should return setof json and return all
>> matching constructions.  I won't miss predicate tests: we can do all
>> that in SQL.
>
>
> Yes, it's iterative. And for deeply nested json it might be somewhat
> inefficient, although the parser is pretty fast AFAICT. But it's a start.

not completely buying that: see comments below.  not supporting xpath
style decompositions seems wrong to me.  IOW, json_get should be set
returning (perhaps via wild cards in the keytext) or we need
json_each.
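To make the wildcard idea concrete, here's a rough sketch of the semantics I have in mind, in Python purely for illustration (json_get_all and the dotted-path syntax are invented names, not a proposed API):

```python
import json

def json_get_all(doc, path):
    """Yield all matches for a dotted path with '*' wildcards.
    Hypothetical semantics; names invented for illustration."""
    def walk(node, parts):
        if not parts:
            yield node
            return
        head, rest = parts[0], parts[1:]
        if head == "*":
            # wildcard: descend into every member of an object or array
            if isinstance(node, dict):
                children = node.values()
            elif isinstance(node, list):
                children = node
            else:
                return
            for child in children:
                yield from walk(child, rest)
        elif isinstance(node, dict) and head in node:
            yield from walk(node[head], rest)

    yield from walk(json.loads(doc), path.split("."))

doc = '{"a": {"x": 1, "y": 2}, "b": {"x": 3}}'
print(list(json_get_all(doc, "*.x")))  # [1, 3]
```

The point being: a set-returning json_get with wildcards hands you the whole xpath-style decomposition in one call, instead of forcing the caller to iterate from SQL.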

>> Non-trivial json productions in postgres require the creation of
>> special composite types that structure the data that we (I?) rig up in
>> SQL before routing to json.  What about having functions that work in
>> the opposite direction:
>>
>> *) can you access both arrays and records with numeric positional
>> syntax (hopefully, yes?), for example:
>>
>> x->0->0
>
>
> You can't do that in JS, so I'm not clear why we should allow it.

agreed -- withdrawn.

>> *) json_object_keys(json) seems to special case to me. how about:
>>
>> json_each(json) which returns a set of key/value pairs and would work
>> on arrays or objects (for arrays the key could be invented from the
>> index).
>
> Again, I don't think we should conflate the processing for arrays and
> objects. But I could see doing each(json) => setof (text, json) (and maybe a
> similar function returning setof (text, text), which would dequote leaf
> nodes as json_get_as_text() does).
>
> And similarly a couple of functions to unnest arrays.

Yeah.  Although, I *do* think you need 'json_each' (or a set returning
json_get) and they should be conflated...exactly as jquery does:
http://api.jquery.com/jQuery.each/.  json objects are associative
arrays, right?

So if the *value* that gets returned by json_each is itself a
collection, we can cast back to json and recurse. at the very least,
we ought to decompose large documents into arbitrary smaller chunks
(as xpath does) without iterating.
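Roughly the semantics I mean, sketched in Python (json_each and decompose are invented names for illustration): each treats objects and arrays uniformly, and recursing whenever a value is itself a collection flattens the whole document in one pass:

```python
import json

def json_each(node):
    """(key, value) pairs from an object or array; array keys are
    invented from the index, as jQuery.each does. Hypothetical semantics."""
    if isinstance(node, dict):
        return list(node.items())
    if isinstance(node, list):
        return [(str(i), v) for i, v in enumerate(node)]
    return []  # scalar: nothing to iterate

def decompose(doc):
    """Flatten a document into (path, leaf) pairs by recursing
    whenever a value is itself a collection -- no caller-side iteration."""
    def walk(node, path):
        pairs = json_each(node)
        if not pairs:
            yield path, node
            return
        for key, value in pairs:
            yield from walk(value, path + [key])
    yield from walk(json.loads(doc), [])

doc = '{"a": {"b": [1, 2]}, "c": 3}'
print(list(decompose(doc)))
# [(['a', 'b', '0'], 1), (['a', 'b', '1'], 2), (['c'], 3)]
```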

In most of the code I'd write, I would decompose to a json object
using your stuff then route to something like:

insert into foo
select (r).*
from json_each('path->to->record_containing_array', json_document) x,
     populate_record(null::foo, json_to_hstore(x)) r;

assuming the json was deliberately constructed to marshal cleanly into
the database, which is perfectly reasonable.

>> *) json_get_as_text(json, keytext or indexint) => text
>>
>> prefer json_to_text() naming. also json_to_hstore(), etc.
>
> json_to_text seems rather misleading as a name here. Maybe we could remove
> the "_as" from the name if that's bothering you.

hm, I think you're right here -- I see the distinction.

> As for json_to_hstore, as I mentioned, the design is intended to enable the
> easy construction of such transformations, although for hstores anything
> except a trivial json structure (i.e. an unnested object) might have
> unappealing results. But in any case, the important thing to do first is to
> get the infrastructure in place. Time is very short and I don't want to
> extend this very much.

yeah, understood.

>> *) have you considered something like
>> anyelement from_json(anyelement, json)
>> or
>> select <json>::some_type;  (this may or may not be possible given our
>> casting mechanics; i don't know).
>
> I have no idea what the semantics of this would be.

Yeah, there's a lot of nuance there.  Don't have to tackle everything
at once I suppose, but spiritually I'm hoping it would serve as a
replacement for textual record_in, array_in, etc.  It's just wrong to
have to specify each and every field during parsing when the
receiving structure is well defined in the database.
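For illustration only, the rough idea in Python terms (Foo and from_json are invented here, not proposed API): the receiving type's definition drives the mapping, so the caller never enumerates fields, and extra keys in the json are simply ignored:

```python
import json
from dataclasses import dataclass, fields

@dataclass
class Foo:
    a: int
    b: str

def from_json(cls, doc):
    """Populate a well-defined record type from json by field name,
    analogous to a hypothetical from_json(anyelement, json): the
    structure of the target, not the caller, determines the mapping."""
    data = json.loads(doc)
    return cls(**{f.name: data[f.name] for f in fields(cls)})

print(from_json(Foo, '{"a": 1, "b": "x", "extra": true}'))
# Foo(a=1, b='x')
```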

merlin


