Thread: GSoC 2015: Extra Jsonb functionality

GSoC 2015: Extra Jsonb functionality

From
Dmitry Dolgov
Date:
Hi, everyone

I'm Dmitry Dolgov, a phd student at the KemSU, Russia. I would like to submit a proposal to the GSoC about additional jsonb functionality, and I want to get any feedback and thougths about this.

Re: GSoC 2015: Extra Jsonb functionality

From
Dmitry Dolgov
Date:
Synopsis: Althrough Jsonb was introduced in PostgreSQL 9.4, there are several functions, that still missing. Partially this missing functionality was implemented in this extension [1] and the corresponding patch [2]. The purpose of this work is to implement the rest of functions accordingly to importance.

Benefits: New functionality, than can made the usage of the jsonb more convenient.

Deliverables: Implementation of the following functions (in the form of an extension
    * jsonb_delete_jsonb - delete key/value pairs based on the other jsonb.
      Example of usage:

        =# jsonb_delete_jsonb('{"a": 1, "b": {"c": 2, "d": 3}, "f": [4, 5]}'::jsonb, '{"a": 4, "f": [4, 5], "c": 2}'::jsonb);

                 jsonb_delete_jsonb
        ---------------------------------------
             {"a": 1, "b": {"c": 2, "d": 3}}

    * jsonb_slice - extract a subset of an jsonb
      Example of usage:

        =# jsonb_slice('{"a": 1, "b": {"c": 2}, "d": {"f": 3}}'::jsonb, ARRAY['b', 'f', 'x']);

               jsonb_slice
        ---------------------------
          {"b": {"c": 2}, "f": 3}

    * jsonb_to_array - get jsonb keys and values as an array
      Example of usage:

        =# jsonb_to_array('{"a": 1, "b": {"c": 2}, "d": [3, 4]}'::jsonb);

                jsonb_to_array
        ------------------------------
           {a, 1, b, c, 2, d, 3, 4}

    * jsonb_keys - get jsonb keys as an array
      Example of usage:

        =# jsonb_keys('{"a": 1, "b": {"c": 2}}'::jsonb);

            jsonb_keys
        -----------------
            {a, b, c}

    * jsonb_vals - get jsonb values as an array
      Example of usage:

        =# jsonb_vals('{"a": 1, "b": {"c": 2}, "d": [3, 4]}'::jsonb);

            jsonb_vals
        ------------------
           {1, 2, 3, 4}

    * jsonb_add_to_path - append a new element to jsonb value at the specific path
      Example of usage:

       =# jsonb_add_to_path('{"a": 1, "b": {"c": ["d", "f"]}}'::jsonb, {b, c}::text[], '["g"]'::jsonb);

                   jsonb_add_to_path
        -------------------------------------------
           {"a": 1, "b": {"c": ["d", "f", "g"]}}

    * jsonb_intersection - extract intersecting key/value pairs
      Example of usage:

        =# jsonb_intersection('{"a": 1, "b": 2, "d": {"f": 3}, "g": [4, 5]}'::jsonb, '{"b": 2, "c": 3, "f": 3, "g": [4, 5]}'::jsonb);
 
             jsonb_intersection
        ----------------------------
            {"b": 2, "g": [4, 5]}

Schedule: I suppose, this can take 2-3 months for me. First of all I'll implement the jsonb_delete_jsonb, jsonb_slice, jsonb_to_array, jsonb_keys, jsonb_vals functions (just because it almost clear how to implement them). Each function will require tests, and certainly some time will be spent at the finish on the improvements for extension as a whole.

Unfortunately, this proposal isn't submitted to the GSoC system yet (I'm planning to do this in the next Tuesday).


On 19 March 2015 at 20:16, Dmitry Dolgov <9erthalion6@gmail.com> wrote:
Hi, everyone

I'm Dmitry Dolgov, a phd student at the KemSU, Russia. I would like to submit a proposal to the GSoC about additional jsonb functionality, and I want to get any feedback and thougths about this.


Re: GSoC 2015: Extra Jsonb functionality

From
Thom Brown
Date:
On 19 March 2015 at 13:23, Dmitry Dolgov <9erthalion6@gmail.com> wrote:
> Synopsis: Althrough Jsonb was introduced in PostgreSQL 9.4, there are
> several functions, that still missing. Partially this missing functionality
> was implemented in this extension [1] and the corresponding patch [2]. The
> purpose of this work is to implement the rest of functions accordingly to
> importance.
>
> Benefits: New functionality, than can made the usage of the jsonb more
> convenient.
>
> Deliverables: Implementation of the following functions (in the form of an
> extension
>     * jsonb_delete_jsonb - delete key/value pairs based on the other jsonb.
>       Example of usage:
>
>         =# jsonb_delete_jsonb('{"a": 1, "b": {"c": 2, "d": 3}, "f": [4,
> 5]}'::jsonb, '{"a": 4, "f": [4, 5], "c": 2}'::jsonb);
>
>                  jsonb_delete_jsonb
>         ---------------------------------------
>              {"a": 1, "b": {"c": 2, "d": 3}}

Perhaps it's my misunderstanding, but this would seem to be more of an
intersection operation on keys rather than a delete.

>     * jsonb_slice - extract a subset of an jsonb
>       Example of usage:
>
>         =# jsonb_slice('{"a": 1, "b": {"c": 2}, "d": {"f": 3}}'::jsonb,
> ARRAY['b', 'f', 'x']);
>
>                jsonb_slice
>         ---------------------------
>           {"b": {"c": 2}, "f": 3}
>
>     * jsonb_to_array - get jsonb keys and values as an array
>       Example of usage:
>
>         =# jsonb_to_array('{"a": 1, "b": {"c": 2}, "d": [3, 4]}'::jsonb);
>
>                 jsonb_to_array
>         ------------------------------
>            {a, 1, b, c, 2, d, 3, 4}

Is there a use-case for the example you've given above, where you take
JSON containing objects and arrays, and flatten them out into a
one-dimensional array?

>
>     * jsonb_keys - get jsonb keys as an array
>       Example of usage:
>
>         =# jsonb_keys('{"a": 1, "b": {"c": 2}}'::jsonb);
>
>             jsonb_keys
>         -----------------
>             {a, b, c}
>
>     * jsonb_vals - get jsonb values as an array
>       Example of usage:
>
>         =# jsonb_vals('{"a": 1, "b": {"c": 2}, "d": [3, 4]}'::jsonb);
>
>             jsonb_vals
>         ------------------
>            {1, 2, 3, 4}
>
>     * jsonb_add_to_path - append a new element to jsonb value at the
> specific path
>       Example of usage:
>
>        =# jsonb_add_to_path('{"a": 1, "b": {"c": ["d", "f"]}}'::jsonb, {b,
> c}::text[], '["g"]'::jsonb);
>
>                    jsonb_add_to_path
>         -------------------------------------------
>            {"a": 1, "b": {"c": ["d", "f", "g"]}}

What should happen if "g" or {"g"} were used instead?

>     * jsonb_intersection - extract intersecting key/value pairs
>       Example of usage:
>
>         =# jsonb_intersection('{"a": 1, "b": 2, "d": {"f": 3}, "g": [4,
> 5]}'::jsonb, '{"b": 2, "c": 3, "f": 3, "g": [4, 5]}'::jsonb);
>
>              jsonb_intersection
>         ----------------------------
>             {"b": 2, "g": [4, 5]}

Could there be a corresponding jsonb_except function which does the
opposite (i.e. returns everything on the left side except where it
matches with the right)?

Thanks.

-- 
Thom



Re: GSoC 2015: Extra Jsonb functionality

From
Alvaro Herrera
Date:
Dmitry Dolgov wrote:

>     * jsonb_slice - extract a subset of an jsonb
>       Example of usage:
> 
>         =# jsonb_slice('{"a": 1, "b": {"c": 2}, "d": {"f": 3}}'::jsonb,
> ARRAY['b', 'f', 'x']);
> 
>                jsonb_slice
>         ---------------------------
>           {"b": {"c": 2}, "f": 3}

This is a bit strange.  Why did "f" get flattened out of "d"?  Is the
resulting document still valid for the purposes of an application using
it?  I think I'd expect the result to be {"b": {"c": 2}, "d": {"f": 3}}

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: GSoC 2015: Extra Jsonb functionality

From
Thom Brown
Date:
On 19 March 2015 at 14:12, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> Dmitry Dolgov wrote:
>
>>     * jsonb_slice - extract a subset of an jsonb
>>       Example of usage:
>>
>>         =# jsonb_slice('{"a": 1, "b": {"c": 2}, "d": {"f": 3}}'::jsonb,
>> ARRAY['b', 'f', 'x']);
>>
>>                jsonb_slice
>>         ---------------------------
>>           {"b": {"c": 2}, "f": 3}
>
> This is a bit strange.  Why did "f" get flattened out of "d"?  Is the
> resulting document still valid for the purposes of an application using
> it?  I think I'd expect the result to be {"b": {"c": 2}, "d": {"f": 3}}

Why would "d" be output when it wasn't in the requested slice?
Although I'm still a bit confused about "f" being produced.

-- 
Thom



Re: GSoC 2015: Extra Jsonb functionality

From
Alvaro Herrera
Date:
Thom Brown wrote:
> On 19 March 2015 at 14:12, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> > Dmitry Dolgov wrote:
> >
> >>     * jsonb_slice - extract a subset of an jsonb
> >>       Example of usage:
> >>
> >>         =# jsonb_slice('{"a": 1, "b": {"c": 2}, "d": {"f": 3}}'::jsonb,
> >> ARRAY['b', 'f', 'x']);
> >>
> >>                jsonb_slice
> >>         ---------------------------
> >>           {"b": {"c": 2}, "f": 3}
> >
> > This is a bit strange.  Why did "f" get flattened out of "d"?  Is the
> > resulting document still valid for the purposes of an application using
> > it?  I think I'd expect the result to be {"b": {"c": 2}, "d": {"f": 3}}
> 
> Why would "d" be output when it wasn't in the requested slice?

Because it contains "f".

> Although I'm still a bit confused about "f" being produced.

I guess you could say that the second argument is an array of element
paths, not key names.  So to get the result I suggest, you would have to
use ARRAY['{b}', '{d,f}', '{x}'].  (Hm, this is a non-rectangular
array actually... I guess I'd go for ARRAY['b', 'd//f', 'x'] instead, or
whatever the convention is to specify a json path).

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: GSoC 2015: Extra Jsonb functionality

From
Thom Brown
Date:
On 19 March 2015 at 14:35, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> Thom Brown wrote:
>> On 19 March 2015 at 14:12, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
>> > Dmitry Dolgov wrote:
>> >
>> >>     * jsonb_slice - extract a subset of an jsonb
>> >>       Example of usage:
>> >>
>> >>         =# jsonb_slice('{"a": 1, "b": {"c": 2}, "d": {"f": 3}}'::jsonb,
>> >> ARRAY['b', 'f', 'x']);
>> >>
>> >>                jsonb_slice
>> >>         ---------------------------
>> >>           {"b": {"c": 2}, "f": 3}
>> >
>> > This is a bit strange.  Why did "f" get flattened out of "d"?  Is the
>> > resulting document still valid for the purposes of an application using
>> > it?  I think I'd expect the result to be {"b": {"c": 2}, "d": {"f": 3}}
>>
>> Why would "d" be output when it wasn't in the requested slice?
>
> Because it contains "f".

Okay, so it pulls it all parents?  So I guess you'd get this too:

SELECT jsonb_slice('{"a": 1, "b": {"c": 2}, "d": {"f": 3}, "f":
4}'::jsonb, ARRAY['b', 'f', 'x']);
                 jsonb_slice
------------------------------------------------{"a": 1, "b": {"c": 2}, "d": {"f": 3}, "f": 4}

>> Although I'm still a bit confused about "f" being produced.
>
> I guess you could say that the second argument is an array of element
> paths, not key names.  So to get the result I suggest, you would have to
> use ARRAY['{b}', '{d,f}', '{x}'].  (Hm, this is a non-rectangular
> array actually... I guess I'd go for ARRAY['b', 'd//f', 'x'] instead, or
> whatever the convention is to specify a json path).

I think that's where jsquery would come in handy.
-- 
Thom



Re: GSoC 2015: Extra Jsonb functionality

From
Alvaro Herrera
Date:
Thom Brown wrote:
> On 19 March 2015 at 14:35, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> > Thom Brown wrote:
> >> On 19 March 2015 at 14:12, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> >> > Dmitry Dolgov wrote:
> >> >
> >> >>     * jsonb_slice - extract a subset of an jsonb
> >> >>       Example of usage:

> Okay, so it pulls it all parents?  So I guess you'd get this too:
> 
> SELECT jsonb_slice('{"a": 1, "b": {"c": 2}, "d": {"f": 3}, "f":
> 4}'::jsonb, ARRAY['b', 'f', 'x']);
> 
>                   jsonb_slice
> ------------------------------------------------
>  {"a": 1, "b": {"c": 2}, "d": {"f": 3}, "f": 4}

Yeah, except "a" wouldn't be output, of course.  (The example gets more
interesting if "d" contains more members than just "f".  Those would not
get output.)


> >> Although I'm still a bit confused about "f" being produced.
> >
> > I guess you could say that the second argument is an array of element
> > paths, not key names.  So to get the result I suggest, you would have to
> > use ARRAY['{b}', '{d,f}', '{x}'].  (Hm, this is a non-rectangular
> > array actually... I guess I'd go for ARRAY['b', 'd//f', 'x'] instead, or
> > whatever the convention is to specify a json path).
> 
> I think that's where jsquery would come in handy.

If that's what we think, then perhaps we shouldn't accept jsonb_slice at
all because of ambiguous mode of operation.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: GSoC 2015: Extra Jsonb functionality

From
Dmitry Dolgov
Date:
> Perhaph it's my misunderstanding, but this would seem to be more of an intersection operation on keys rather than a delete.
Hm...why? We remove all elements, which are contains in the first and second jsonb ("f": [4, 5] in this case) from the first one.

> Could there be a corresponding jsonb_except function which does the opposite (i.e. returns everything on the left side except where it matches with the right)?
and if I understand your question correctly, this is exactly what the jsonb_delete_jsonb will do, isn't it?.

> Is there a use-case for the example you've given above, where you take JSON containing objects and arrays, and flatten them out into a one-dimensional array?
Hm...actually I don't know about such use-cases. This function is analog of the hstore_to_array (and the similar function hstore_to_matrix), which is used sometimes, judging by github. So I thought this function should be implemented (after this question I'm not so sure).

> What should happen if "g" or {"g"} were used instead?
Did you mean {"g": "key"}? Hmm...but in any case, I suppose this new object should be appended to the array as a regular element.
    =# jsonb_add_to_path('{"b": {"c": ["d", "f"]}}'::jsonb, {b, c}::text[], '"g"'::jsonb);

            jsonb_add_to_path
    ---------------------------------------
       {"b": {"c": ["d", "f", "g"]}}


> This is a bit strange.  Why did "f" get flattened out of "d"?
The main purpose if this function is to get values for required keys from all nesting levels (actually, I thougth it will be not so convenient otherwise and I didn't consider the implementation with path usage). If this so confusing, I can remove this function from the list =)

On 20 March 2015 at 00:08, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
Thom Brown wrote:
> On 19 March 2015 at 14:35, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> > Thom Brown wrote:
> >> On 19 March 2015 at 14:12, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> >> > Dmitry Dolgov wrote:
> >> >
> >> >>     * jsonb_slice - extract a subset of an jsonb
> >> >>       Example of usage:

> Okay, so it pulls it all parents?  So I guess you'd get this too:
>
> SELECT jsonb_slice('{"a": 1, "b": {"c": 2}, "d": {"f": 3}, "f":
> 4}'::jsonb, ARRAY['b', 'f', 'x']);
>
>                   jsonb_slice
> ------------------------------------------------
>  {"a": 1, "b": {"c": 2}, "d": {"f": 3}, "f": 4}

Yeah, except "a" wouldn't be output, of course.  (The example gets more
interesting if "d" contains more members than just "f".  Those would not
get output.)


> >> Although I'm still a bit confused about "f" being produced.
> >
> > I guess you could say that the second argument is an array of element
> > paths, not key names.  So to get the result I suggest, you would have to
> > use ARRAY['{b}', '{d,f}', '{x}'].  (Hm, this is a non-rectangular
> > array actually... I guess I'd go for ARRAY['b', 'd//f', 'x'] instead, or
> > whatever the convention is to specify a json path).
>
> I think that's where jsquery would come in handy.

If that's what we think, then perhaps we shouldn't accept jsonb_slice at
all because of ambiguous mode of operation.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: GSoC 2015: Extra Jsonb functionality

From
Thom Brown
Date:
On 20 March 2015 at 11:21, Dmitry Dolgov <9erthalion6@gmail.com> wrote:
>> Perhaph it's my misunderstanding, but this would seem to be more of an
>> intersection operation on keys rather than a delete.
> Hm...why? We remove all elements, which are contains in the first and second
> jsonb ("f": [4, 5] in this case) from the first one.

On further thought, yes, I agree.

>> Could there be a corresponding jsonb_except function which does the
>> opposite (i.e. returns everything on the left side except where it matches
>> with the right)?
> and if I understand your question correctly, this is exactly what the
> jsonb_delete_jsonb will do, isn't it?.

Ah, yes, that's true.

>> Is there a use-case for the example you've given above, where you take
>> JSON containing objects and arrays, and flatten them out into a
>> one-dimensional array?
> Hm...actually I don't know about such use-cases. This function is analog of
> the hstore_to_array (and the similar function hstore_to_matrix), which is
> used sometimes, judging by github. So I thought this function should be
> implemented (after this question I'm not so sure).

Yeah, hstore was just key=>value, so flattening it out resulted in a
simple {key,value,key,value} array.  I don't think that's useful with
json.

>> What should happen if "g" or {"g"} were used instead?
> Did you mean {"g": "key"}? Hmm...but in any case, I suppose this new object
> should be appended to the array as a regular element.
>     =# jsonb_add_to_path('{"b": {"c": ["d", "f"]}}'::jsonb, {b, c}::text[],
> '"g"'::jsonb);
>
>             jsonb_add_to_path
>     ---------------------------------------
>        {"b": {"c": ["d", "f", "g"]}}
>

Would this also be the case for this function?...

# jsonb_add_to_path('{"b": {"c": ["d", "f"]}}'::jsonb, {b, c}::text[],
'{"g":4}'::jsonb);        jsonb_add_to_path
------------------------------------{"b": {"c": ["d", "f", {"g": 4}]}}

-- 
Thom



Re: GSoC 2015: Extra Jsonb functionality

From
Dmitry Dolgov
Date:
> Would this also be the case for this function?...
> # jsonb_add_to_path('{"b": {"c": ["d", "f"]}}'::jsonb, {b, c}::text[],
> '{"g":4}'::jsonb);
>         jsonb_add_to_path
> ------------------------------------
> {"b": {"c": ["d", "f", {"g": 4}]}}

Yes, sure (the similar logic already implemented  for the jsonb_concat).

On 20 March 2015 at 18:39, Thom Brown <thom@linux.com> wrote:
On 20 March 2015 at 11:21, Dmitry Dolgov <9erthalion6@gmail.com> wrote:
>> Perhaph it's my misunderstanding, but this would seem to be more of an
>> intersection operation on keys rather than a delete.
> Hm...why? We remove all elements, which are contains in the first and second
> jsonb ("f": [4, 5] in this case) from the first one.

On further thought, yes, I agree.

>> Could there be a corresponding jsonb_except function which does the
>> opposite (i.e. returns everything on the left side except where it matches
>> with the right)?
> and if I understand your question correctly, this is exactly what the
> jsonb_delete_jsonb will do, isn't it?.

Ah, yes, that's true.

>> Is there a use-case for the example you've given above, where you take
>> JSON containing objects and arrays, and flatten them out into a
>> one-dimensional array?
> Hm...actually I don't know about such use-cases. This function is analog of
> the hstore_to_array (and the similar function hstore_to_matrix), which is
> used sometimes, judging by github. So I thought this function should be
> implemented (after this question I'm not so sure).

Yeah, hstore was just key=>value, so flattening it out resulted in a
simple {key,value,key,value} array.  I don't think that's useful with
json.

>> What should happen if "g" or {"g"} were used instead?
> Did you mean {"g": "key"}? Hmm...but in any case, I suppose this new object
> should be appended to the array as a regular element.
>     =# jsonb_add_to_path('{"b": {"c": ["d", "f"]}}'::jsonb, {b, c}::text[],
> '"g"'::jsonb);
>
>             jsonb_add_to_path
>     ---------------------------------------
>        {"b": {"c": ["d", "f", "g"]}}
>

Would this also be the case for this function?...

# jsonb_add_to_path('{"b": {"c": ["d", "f"]}}'::jsonb, {b, c}::text[],
'{"g":4}'::jsonb);
         jsonb_add_to_path
------------------------------------
 {"b": {"c": ["d", "f", {"g": 4}]}}

--
Thom

Re: GSoC 2015: Extra Jsonb functionality

From
Jim Nasby
Date:
On 3/19/15 9:07 AM, Thom Brown wrote:
>> >                 jsonb_to_array
>> >         ------------------------------
>> >            {a, 1, b, c, 2, d, 3, 4}
> Is there a use-case for the example you've given above, where you take
> JSON containing objects and arrays, and flatten them out into a
> one-dimensional array?

There are a lot of things proposed here that are completely ignoring the 
idea of nested elements, which I think is a big mistake.

Frankly, I think the whole proposal needs to be rethought with an eye 
towards supporting and preserving nested elements instead of trying to 
just flatten everything out. If a user wanted things flat they would 
have just started with that in the first place.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com



Re: GSoC 2015: Extra Jsonb functionality

From
Dmitry Dolgov
Date:
Frankly, I think the whole proposal needs to be rethought with an eye towards supporting and preserving nested elements instead of trying to just flatten everything out.

Can you pls show me few examples what do you mean exactly?

On 21 March 2015 at 06:51, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
On 3/19/15 9:07 AM, Thom Brown wrote:
>                 jsonb_to_array
>         ------------------------------
>            {a, 1, b, c, 2, d, 3, 4}
Is there a use-case for the example you've given above, where you take
JSON containing objects and arrays, and flatten them out into a
one-dimensional array?

There are a lot of things proposed here that are completely ignoring the idea of nested elements, which I think is a big mistake.

Frankly, I think the whole proposal needs to be rethought with an eye towards supporting and preserving nested elements instead of trying to just flatten everything out. If a user wanted things flat they would have just started with that in the first place.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com

Re: GSoC 2015: Extra Jsonb functionality

From
Jim Nasby
Date:
On 3/21/15 12:49 PM, Dmitry Dolgov wrote:
>  > Frankly, I think the whole proposal needs to be rethought with an eye
> towards supporting and preserving nested elements instead of trying to
> just flatten everything out.
>
> Can you pls show me few examples what do you mean exactly?

All the comments others have made in this thread, for starters.

Your proposal seems intent on completely flattening all JSON documents, 
or treating them as if they were flat. I see zero chance of that being 
accepted.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com