Re: json accessors - Mailing list pgsql-hackers

From Merlin Moncure
Subject Re: json accessors
Date
Msg-id CAHyXU0wuZx-6LYsfieHmn68fBf+xycA9buTsr-MzJZi_6da-NA@mail.gmail.com
Whole thread Raw
In response to json accessors  (Andrew Dunstan <andrew@dunslane.net>)
Responses Re: json accessors
List pgsql-hackers
On Wed, Nov 28, 2012 at 11:04 AM, Andrew Dunstan <andrew@dunslane.net> wrote:
>
> This is a proposal to create some basic functions to extract values from
> json. The simple functions I envision would be:
>
>  * json_object_keys(json) => setof text
>    returns the set of dequoted, unescaped keys of the object,
>    errors if it's not an object
>  * json_get(json, keytext) => json
>    returns the json value corresponding to the key text in the json object,
>    null if not found, error if it's not an object
>  * json_get(json, indexint) => json
>    returns the json value of the indexth element in the json array,
>    null of the index is outside the array bounds, errors if it's not an
>    array
>  * json_get_as_text(json, keytext or indexint) => text
>    same as json_get() except that it returns dequoted, unescaped text
>    for a quoted leaf field

Comments (this is awesome btw):

*) ISTM your keytext operators are a reasonable replacement for a
hypothetical json_path.  That said  you're basically forcing json->sql
mapping through a highly iterative API, which I don't like. At the
very least, I think json_get should return setof json and return all
matching constructions.  I won't miss predicate tests: we can do all
that in SQL.

Non-trivial json productions in postgres require the creation of
special composite types that structure the data that we (I?) rig up in
SQL before routing to json.  What about having functions that work in
the opposite direction:

*) can you access both arrays and records with numeric positional
syntax (hopefully, yes?), for example:

x->0->0

*) json_object_keys(json) seems to special case to me. how about:

json_each(json) which returns a set of key/value pairs and would on
arrays or objects (for arrays the key could be invented from the
index).

*) json_get_as_text(json, keytext or indexint) => text

prefer json_to_text() naming. also json_to_hstore(), etc.

*) have you considered something like
anyelement from_json(anyelement, json)
or
select <json>::some_type;  (this may or many not be possible given our
casting mechanics; i don't know).

My reasoning here is that for non-trivial json productions we (I?)
typically use composite types to rigidly control the structure of the
output document.  For 'restful' type protocols I might want to use the
same trick: there would be a set of nested composite type/arrays (or
even, in trivial cases, a table) that would cleanly map to the
document.  The parsing here can and should be automatic; this would
give nice symmetry with your xxx_to_json functions.  Obviously
conversion here would be best effort but when it works, it would be
wonderful:

WITH json_data AS
( SELECT from_json(null::foo[], <input_doc>)
)
i1 as (INSERT INTO bar SELECT ... FROM json_data)
i2 as (INSERT INTO baz SELECT ... FROM json_data)

where "..." would be some combination of unnest() and composite type
access syntax.

Now, some documents in json won't translate cleanly to composite types
because json allows for heterogeneous arrays.  But if we're in control
of both sides of the protocol that shouldn't matter.

merlin



pgsql-hackers by date:

Previous
From: Alvaro Herrera
Date:
Subject: Re: pg_dump --split patch
Next
From: Tom Lane
Date:
Subject: Re: Bugs in CREATE/DROP INDEX CONCURRENTLY