Thread: Further issues with jsonb semantics, documentation

Further issues with jsonb semantics, documentation

From

Peter Geoghegan

Date:

04 June 2015, 02:02:58

I've noticed some more issues with the jsonb documentation, and the
new jsonb stuff generally. I didn't set out to give Andrew feedback on
the semantics weeks after feature freeze, but unfortunately this feels
like another discussion that we need to have now rather than later.

"operator jsonb - integer"
===================

Summary: I think that this operator has a problem, but a problem that
can easily be fixed.


I think it was a bad idea to allow array-style removal of object
key/value pairs. ISTM that it implies a level of stability in the
ordering that doesn't make sense. Besides, is it really all that
useful?

Consider this case:

postgres=# select '{"c":5, "a":6, "b":7}'::jsonb - 1;
     ?column?
------------------
 {"a": 6, "c": 5}
(1 row)

Clearly anyone expecting the value "a" to be removed here would be in
for a surprise. Moreover, it is inconsistent with the established
behavior of the corresponding array-wise subscript operator:

postgres=# select '{"c":5, "a":6, "b":7}'::jsonb -> 1;
 ?column?
----------
 [null]
(1 row)

I suggest, given that this is conceptually a data-modifying operator,
that the minus operator/jsonb_delete() case raise an error rather than
matching "operator jsonb -> integer" and returning NULL. I say this as
the person who successfully argued that the -> operator case above
should return NULL during the 9.4 beta period; returning SQL NULL for
the delete/minus operator feels like going too far in the direction of
permissiveness, even for jsonb; my expression index argument does not
apply here as it did for the "operator jsonb -> integer" case.

"operator jsonb - text"
================

Summary: I think that this operator is fine.


Documentation needs work, though. The "operator jsonb - text" operator
ought to be documented as in the attached patch, which is closer to
the equivalent hstore operator, and emphasizes the "operator jsonb ?
text" definition of a key. It should emphasize its similarity to the
established "operator jsonb ? text" operator, and in particular that
array elements behave as keys *iff* they're strings.

"operator jsonb - text[]" (and *nested* deletion more generally)
===============================================

Summary: I think that this operator has many problems, and should be
scraped (although only as an operator). IMV nested deletion should
only be handled by functions, and the way that nested deletion works
in general should be slightly adjusted.


The new "operator jsonb - text[]" operator is confusingly inconsistent with:

A) "operator jsonb text"

and:

B) the established "operator hstore - text[]" operator, since that
operator deletes all key/value pairs that have keys that match any of
the right operand text array values. In contrast, this new operator is
passed as its right operand an array of text elements that constitute
a "path" (so the order in the rhs text[] operand matters). If the text
element in the rhs text[] operand happens to be what would pass for a
Postgres integer literal, it can be used to traverse lhs array values
through subscripting at that nesting level.

Regarding nested deletion behavior more generally, consider this
example of how this can work out badly:

postgres=# select jsonb_delete(jsonb_set('["a"]', '{5}', '"b"'), '{5}')  ;
 jsonb_delete
--------------
 ["a", "b"]
(1 row)

Here, we're adding and then deleting an array element at offset 5 (the
string "b"). But the element is never deleted by the outer
jsonb_delete(), because we can't rely on the element actually being
stored at offset 5. Seems a bit fragile.

More importantly, consider the inconsistency with "operator jsonb
text" ("point A" above):

postgres=# select '["a"]'::jsonb  ?| '{a}'::text[]; -- historic/9.4 behavior
 ?column?
----------
 t
(1 row)

postgres=# select '["a"]'::jsonb  - '{a}'::text[]; -- new to 9.5
operator, does not delete!
 ?column?
----------
 ["a"]
(1 row)

Perhaps most questionably of all, the non-array based minus/delete
operator (which I like) *does* have the same idea of matching a key as
the established "operator jsonb ?| text[]" operator (and "operator
jsonb ? text", etc):

postgres=# select '["a"]'::jsonb  - 'a'::text; -- new to 9.5 operator,
*does* delete!
 ?column?
----------
 []
(1 row)

This conceptual model for manipulating jsonb is entirely new and novel
to this new operator "operator text[]" (and jsonb_set()).

"operator jsonb - text[]" categorization/conceptual model
==========================================

Operators like the established "operator jsonb -> integer" operator (a
jsonb "array-wise" operator) always seemed okay to me because the rhs
operand really was a Postgres integer, and because it's explicitly an
array-wise operator (just like "operator -> text" is explicitly
object-wise). But now, with these new operators, you've added a
"shadow type" system to certain rhs text[] operands, consisting of
types not explicitly delineated by JSON-style double quotes (for
strings, say). So there is kind of a second shadow type system in
play, similar to that of jsonb except that text[] "shadow types"
cannot be distinguished -- '{0}'::text[] could be intended to affect
an array or an object with the key value "0" in one of its pairs.

Accordingly, I would vote for changing this, and making the nested
deletion stuff only care about object key values and array *string*
elements. This backs out of the idea of a new "shadow type" system for
text arrays. I think that this is conceptually a lot cleaner, while
actually not being significantly less useful for most use cases
(nesting tends to involve only objects anyway). As noted in my summary
of the previous section, I would also vote for scraping "operator
jsonb - text[]" as a jsonb_delete() wrapper (see closing summary below
for more on why).

While I appreciate that Andrew wanted to make deletion as flexible as
possible, these inconsistencies feel arbitrary to me.

Closing summary
=============

At the very least, some of these new jsonb operators need to decide if they're:

1) Specifically "array-wise" or "object-wise", like the existing
operators "operator -> integer", or  "operator -> text".

or:

2) "Key" operators.  Operators that share the "operator jsonb ? text"
idea of a key, and operate on jsonb datums accordingly. "Keys" here
include array elements that are strings.

As I've said, I prefer option 2 for deletion, which is why I like
"operator jsonb - text" but not "operator jsonb - text[]", but either
way it needs to be *a lot* clearer than it is at the moment.

You might also say that there is a category 3 jsonb operator, of which
the containment operator ("operator @>") is the best example. It takes
jsonb on the rhs, and so naturally cares about types including
container types on the lhs. Clearly these new operators are far enough
away from category 3 that we cannot really contemplate moving them
into category 3, particularly post feature freeze (even though, as I
said, I think that would be a superior approach).

Some of what I've said here is just my opinion, but I feel pretty
confident that for example we don't want to add a fourth "operator
category" to jsonb, a weird hybrid between category 2 and category 3.
Having 3 "operator categories" seems quite enough. And by making
*nested* deletion only possible through a function call to
jsonb_delete() (by scrapping "operator jsonb - text[]" as I suggest),
you also avoid having an *operator* that is still somewhat not like
other operators in category 2 (since the other operators in category 2
don't care about nesting). That feels cleaner -- IMV the oddness of
walking a path based on a text[] value ought to be confined to a well
named function. Plus you can then add a new "operator jsonb - text[]"
that matches "operator hstore - text[]" and comports with the existing
"operator jsonb - text".

If you don't like my taxonomy for jsonb operators, then feel free to
come up with your own. As things stand, it is hard to describe a
taxonomy that makes the operators easy to understand and
non-surprising -- there'd be more exceptions than rules. I feel we
need to be disciplined about this stuff, or jsonb will become much
harder to use than it needs to be.

Opinions?
--
Peter Geoghegan

Attachment

jsonb-delete-doc.patch

Re: Further issues with jsonb semantics, documentation

From

Peter Geoghegan

Date:

04 June 2015, 05:55:48

On Wed, Jun 3, 2015 at 7:02 PM, Peter Geoghegan <pg@heroku.com> wrote:
> Consider this case:
>
> postgres=# select '{"c":5, "a":6, "b":7}'::jsonb - 1;
>      ?column?
> ------------------
>  {"a": 6, "c": 5}
> (1 row)
>
> Clearly anyone expecting the value "a" to be removed here would be in
> for a surprise. Moreover, it is inconsistent with the established
> behavior of the corresponding array-wise subscript operator:
>
> postgres=# select '{"c":5, "a":6, "b":7}'::jsonb -> 1;
>  ?column?
> ----------
>  [null]
> (1 row)

For similar reasons, I think that this inconsistency is unacceptable:

postgres=# select '["a", "b", "c"]'::jsonb - -1; ?column?
------------["a", "b"]
(1 row)

postgres=# select '["a", "b", "c"]'::jsonb -> -1;?column?
----------[null]
(1 row)

jsonb now supports Python-style negative subscripting to index
backward. I think that this a fine idea. However, I also think it's a
big POLA violation that this was not done for the ordinary array
subscripting operator ("operator jsonb -> integer") at the same time
as "operator jsonb - integer" was added. Although doing this will
require a compatibility note in the 9.5 release notes, it's extremely
unlikely to destabilize anybody's app, and makes a lot of sense.
-- 
Peter Geoghegan

Re: Further issues with jsonb semantics, documentation

From

Andrew Dunstan

Date:

04 June 2015, 13:44:11

On 06/03/2015 10:02 PM, Peter Geoghegan wrote:
> I've noticed some more issues with the jsonb documentation, and the
> new jsonb stuff generally. I didn't set out to give Andrew feedback on
> the semantics weeks after feature freeze, but unfortunately this feels
> like another discussion that we need to have now rather than later.

Yes, I wish you had raised these issues months ago when this was 
published. That's the way the process is supposed to work.

>
> "operator jsonb - integer"
> ===================
>
> Summary: I think that this operator has a problem, but a problem that
> can easily be fixed.
>
>
> I think it was a bad idea to allow array-style removal of object
> key/value pairs. ISTM that it implies a level of stability in the
> ordering that doesn't make sense. Besides, is it really all that
> useful?

The origin of this is nested hstore. Looking at my last version of that 
patch, I see:
   SELECT 'a=>1, b=>2, c=>3'::hstore - 3;            ?column?   ------------------------     "a"=>1, "b"=>2, "c"=>3
(1row)

But I agree that it's not a great contribution to science, especially 
since the index will be applied to the list of elements in the somewhat 
counter-intuitive storage order we use, and we could just raise an error 
if we try to apply integer delete to an object instead of an array.

>
> "operator jsonb - text[]" (and *nested* deletion more generally)
> ===============================================
>
> Summary: I think that this operator has many problems, and should be
> scraped (although only as an operator). IMV nested deletion should
> only be handled by functions, and the way that nested deletion works
> in general should be slightly adjusted.
>
>
> The new "operator jsonb - text[]" operator is confusingly inconsistent with:
>
> A) "operator jsonb text"

What exactly is this? I have no idea what you're talking about.

>
> and:
>
> B) the established "operator hstore - text[]" operator, since that
> operator deletes all key/value pairs that have keys that match any of
> the right operand text array values. In contrast, this new operator is
> passed as its right operand an array of text elements that constitute
> a "path" (so the order in the rhs text[] operand matters). If the text
> element in the rhs text[] operand happens to be what would pass for a
> Postgres integer literal, it can be used to traverse lhs array values
> through subscripting at that nesting level.

The fact that hstore uses it that way doesn't really concern me. Since 
hstore isn't nested it doesn't make a whole lot of sense for it to mean 
anything else there. But json(b) is nested, and jsonb - path seems quite 
a reasonable treatment, something you're much more likely to want to do 
than removeing top level elements in bulk.

>
> Regarding nested deletion behavior more generally, consider this
> example of how this can work out badly:
>
> postgres=# select jsonb_delete(jsonb_set('["a"]', '{5}', '"b"'), '{5}')  ;
>   jsonb_delete
> --------------
>   ["a", "b"]
> (1 row)
>
> Here, we're adding and then deleting an array element at offset 5 (the
> string "b"). But the element is never deleted by the outer
> jsonb_delete(), because we can't rely on the element actually being
> stored at offset 5. Seems a bit fragile.

The behaviour of jsonb_set is pretty explicitly documented. If we wanted 
to do something else then we'd have to disable the special meaning given 
to negative indices, but that would mean in turn we wouldn't be able to 
prepend to an array.

>
> More importantly, consider the inconsistency with "operator jsonb
> text" ("point A" above):
>
> postgres=# select '["a"]'::jsonb  ?| '{a}'::text[]; -- historic/9.4 behavior
>   ?column?
> ----------
>   t
> (1 row)
>
> postgres=# select '["a"]'::jsonb  - '{a}'::text[]; -- new to 9.5
> operator, does not delete!
>   ?column?
> ----------
>   ["a"]
> (1 row)

You are conflating two different things here, quite pointlessly. The RH 
operand of ?| is not a path, whereas the RH operand of this - variant 
is. The fact that they are both text arrays doesn't mean that they 
should mean the same thing. And this is really the whole problem with 
the rest of your analysis.

cheers

andrew

Re: Further issues with jsonb semantics, documentation

From

Jim Nasby

Date:

04 June 2015, 15:33:36

On 6/4/15 8:43 AM, Andrew Dunstan wrote:
> You are conflating two different things here, quite pointlessly. The RH
> operand of ?| is not a path, whereas the RH operand of this - variant
> is. The fact that they are both text arrays doesn't mean that they
> should mean the same thing. And this is really the whole problem with
> the rest of your analysis.

Has the idea of a specific json_path datatype been discussed? I feel it 
would add a lot of clarity to the operators. It would also make it easy 
to have an array of paths, something that's difficult to do today 
because a path can be an arbitrary length and arrays don't support that.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Data in Trouble? Get it in Treble! http://BlueTreble.com

Re: Further issues with jsonb semantics, documentation

From

Andrew Dunstan

Date:

04 June 2015, 17:26:10

On 06/04/2015 11:33 AM, Jim Nasby wrote:
> On 6/4/15 8:43 AM, Andrew Dunstan wrote:
>> You are conflating two different things here, quite pointlessly. The RH
>> operand of ?| is not a path, whereas the RH operand of this - variant
>> is. The fact that they are both text arrays doesn't mean that they
>> should mean the same thing. And this is really the whole problem with
>> the rest of your analysis.
>
> Has the idea of a specific json_path datatype been discussed? I feel 
> it would add a lot of clarity to the operators. It would also make it 
> easy to have an array of paths, something that's difficult to do today 
> because a path can be an arbitrary length and arrays don't support that.

I actually thought of doing something like that earlier today, although 
I was thinking of making it an array under the hood - I'm not sure how 
much call there is for an array of paths. We could probably finesse 
that. I agree that there is some sense in having such a type, especially 
if we later want to implement json(b)_patch, see 
<http://tools.ietf.org/html/rfc6902>. And if we do we should call the 
type json_pointer to be consistent with 
<http://tools.ietf.org/html/rfc6901>.

However, this is certainly not 9.5 material.

cheers

andrew

Re: Further issues with jsonb semantics, documentation

From

Peter Geoghegan

Date:

04 June 2015, 19:10:38

On Thu, Jun 4, 2015 at 6:43 AM, Andrew Dunstan <andrew@dunslane.net> wrote:
>> I've noticed some more issues with the jsonb documentation, and the
>> new jsonb stuff generally. I didn't set out to give Andrew feedback on
>> the semantics weeks after feature freeze, but unfortunately this feels
>> like another discussion that we need to have now rather than later.
>
> Yes, I wish you had raised these issues months ago when this was published.
> That's the way the process is supposed to work.

I also wish that I managed to do that. As you know, I was working
overtime to get UPSERT into 9.5 during that period. Finding time to
review things is always difficult, and I which I could do more.

>> "operator jsonb - integer"
>> ===================

>> I think it was a bad idea to allow array-style removal of object
>> key/value pairs. ISTM that it implies a level of stability in the
>> ordering that doesn't make sense. Besides, is it really all that
>> useful?

> But I agree that it's not a great contribution to science, especially since
> the index will be applied to the list of elements in the somewhat
> counter-intuitive storage order we use, and we could just raise an error if
> we try to apply integer delete to an object instead of an array.

Cool. Do you want to write a patch, or should I?

Also, what about negative array subscripting (making the 9.4-era
"operator jsonb -> integer" operator support that for consistency with
the new "operator jsonb - integer" operator)? Should I write the
patch? Will you commit it if I do?

>> "operator jsonb - text[]" (and *nested* deletion more generally)
>> ===============================================
>>
>> Summary: I think that this operator has many problems, and should be
>> scraped (although only as an operator). IMV nested deletion should
>> only be handled by functions, and the way that nested deletion works
>> in general should be slightly adjusted.
>>
>>
>> The new "operator jsonb - text[]" operator is confusingly inconsistent
>> with:
>>
>> A) "operator jsonb text"
>
> What exactly is this? I have no idea what you're talking about.

It's a typo -- I meant "operator jsonb - text". The fact that
"operator jsonb - text" and "operator jsonb - text[]" diverge in the
way they do seems confusing.

> The fact that hstore uses it that way doesn't really concern me. Since
> hstore isn't nested it doesn't make a whole lot of sense for it to mean
> anything else there.

It seems pretty obvious to me that it makes just as much sense as in
hstore. In hstore, you might want to delete multiple key/value pairs
at once, for exactly the same reason as you might want to with jsonb.
Certainly, you'll also want to support nested deletion with jsonb, but
that's beside the point.

> But json(b) is nested, and jsonb - path seems quite a
> reasonable treatment, something you're much more likely to want to do than
> removeing top level elements in bulk.

Probably true. I think that this interface for nested deletion is
complicated enough and inconsistent enough that I'd rather not have an
operator at all, just a function (so somewhat like jsonb_set() --
jsonb_delete()).  That is my main point on "operator jsonb - text[]";
I think the interface is complicated and inconsistent with everything
else for no good reason.

>> Regarding nested deletion behavior more generally, consider this
>> example of how this can work out badly:
>>
>> postgres=# select jsonb_delete(jsonb_set('["a"]', '{5}', '"b"'), '{5}')  ;
>>   jsonb_delete
>> --------------
>>   ["a", "b"]
>> (1 row)
>>
>> Here, we're adding and then deleting an array element at offset 5 (the
>> string "b"). But the element is never deleted by the outer
>> jsonb_delete(), because we can't rely on the element actually being
>> stored at offset 5. Seems a bit fragile.
>
>
> The behaviour of jsonb_set is pretty explicitly documented. If we wanted to
> do something else then we'd have to disable the special meaning given to
> negative indices, but that would mean in turn we wouldn't be able to prepend
> to an array.

jsonb_delete() should certainly be able to traverse objects, but it's
much less clear that it should be able to *traverse* arrays (affecting
arrays is a different story, though). That's why I proposed not
supporting traversing arrays with it or with jsonb_set(). This would
also removes the questionable second "shadow type system" within the
text[] rhs operand too, which seems like a good thing.

I think that traversing arrays in nested documents is a rare
requirement, because the ordering within arrays is unstable. If you
already know the ordinal number of the thing you want to nuke, then
you probably have already locked the row, and you might as well
manipulate the JSON using Javascript or Python at that stage.

Making jsonb_delete() buy into the "operator jsonb ? text" idea of a
key (a thing that it must delete) would also allow jsonb_delete() to
reliably delete particular strings in arrays, which actually does make
a lot of sense (think of arrays of "tags"). But FWIW it's the
inconsistency that bothers me most.

>> More importantly, consider the inconsistency with "operator jsonb
>> text" ("point A" above):
>>
>> postgres=# select '["a"]'::jsonb  ?| '{a}'::text[]; -- historic/9.4
>> behavior
>>   ?column?
>> ----------
>>   t
>> (1 row)
>>
>> postgres=# select '["a"]'::jsonb  - '{a}'::text[]; -- new to 9.5
>> operator, does not delete!
>>   ?column?
>> ----------
>>   ["a"]
>> (1 row)
>
> You are conflating two different things here, quite pointlessly. The RH
> operand of ?| is not a path, whereas the RH operand of this - variant is.
> The fact that they are both text arrays doesn't mean that they should mean
> the same thing. And this is really the whole problem with the rest of your
> analysis.

I'm not conflating anything -- obviously I understand that "operator
jsonb - text[]" is not supposed to be an "any" variant of "operator
jsonb - text". My point is that a reasonable person could infer that
it *is* an "any" variant of "operator jsonb - text", based on hstore's
behavior for its two deletion operators, and based on the other jsonb
operators, and based on the fact that it will often behave that way
anyway.

-- 
Peter Geoghegan

Re: Further issues with jsonb semantics, documentation

From

Alvaro Herrera

Date:

04 June 2015, 19:16:26

I'm just skimming here, but if a jsonb_path type is being proposed,
perhaps it would be better not to have operators that take text or
text[] as second argument.  We can provide that functionality with just
functions.  For example, it will be confusing to have 

jsonb 'some json value' - '{foo,bar}'

operate too differently from

jsonb 'some json value' - json_path '{foo,bar}'

And it will be a nasty regression to have 9.5 allow
jsonb 'some json value' - '{foo,bar}'
and then have 9.6 error out with "ambiguous operator" when the json_path
thing is added.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Further issues with jsonb semantics, documentation

From

Peter Geoghegan

Date:

04 June 2015, 20:02:12

On Thu, Jun 4, 2015 at 12:16 PM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:
> I'm just skimming here, but if a jsonb_path type is being proposed,
> perhaps it would be better not to have operators that take text or
> text[] as second argument.  We can provide that functionality with just
> functions.  For example, it will be confusing to have
>
> jsonb 'some json value' - '{foo,bar}'
>
> operate too differently from
>
> jsonb 'some json value' - json_path '{foo,bar}'
>
> And it will be a nasty regression to have 9.5 allow
> jsonb 'some json value' - '{foo,bar}'
> and then have 9.6 error out with "ambiguous operator" when the json_path
> thing is added.

Fair point, but FWIW I don't think it'll end up being a new type like
json_path -- it'll just be jsonb, as with containment. I can see there
being an operator that performs deletion in a very similar way to how
"operator jsonb @> jsonb" performs containment (recall that jsonb
containment is a very JSON-ish flavor of containment).

I would like these new-to-9.5 deletion operators to work at the top
level only, like "operator jsonb ? text" and "operator jsonb ?| text",
sharing their idea of a key, __including that string array elements
are keys__. We haven't got a containment-style nested delete operator
for 9.5, but we can hope for it in the future. In the meantime, you
get much of the benefit of that with jsonb_delete(), which *can*
support nested deletion. It does so by buying into the "operator jsonb
? text" idea of a key (including that string array elements are keys),
although with a twist: the paths text[] right operand operates at
multiple nesting levels (not supporting traversing arrays, as Andrew
implemented it, but OTOH adding support for deleting String array
elements based on the string alone, useful for "tag" arrays).

If in 9.6 we have something like an "operator jsonb @- jsonb" operator
for containment style deletion, and a 9.5 era "operator jsonb - text"
and operator jsonb - text[]" pair of operators for existence style
deletion (matching "operator jsonb ? text", operating only on the top
level), that will be pretty good. The fact that jsonb_delete() will
have somewhat bridged the gap nesting-deletion-wise for 9.5 (without
being usable through an operator) won't really matter then. I want to
keep the "twist" I described out of any jsonb operators that are
shipped, and only use it within functions.
-- 
Peter Geoghegan

Re: Further issues with jsonb semantics, documentation

From

Peter Geoghegan

Date:

04 June 2015, 20:09:29

On Thu, Jun 4, 2015 at 1:02 PM, Peter Geoghegan <pg@heroku.com> wrote:
> I would like these new-to-9.5 deletion operators to work at the top
> level only, like "operator jsonb ? text" and "operator jsonb ?| text",
> sharing their idea of a key, __including that string array elements
> are keys__. We haven't got a containment-style nested delete operator
> for 9.5, but we can hope for it in the future. In the meantime, you
> get much of the benefit of that with jsonb_delete(), which *can*
> support nested deletion. It does so by buying into the "operator jsonb
> ? text" idea of a key (including that string array elements are keys),
> although with a twist: the paths text[] right operand operates at
> multiple nesting levels (not supporting traversing arrays, as Andrew
> implemented it, but OTOH adding support for deleting String array
> elements based on the string alone, useful for "tag" arrays).
>
> If in 9.6 we have something like an "operator jsonb @- jsonb" operator
> for containment style deletion, and a 9.5 era "operator jsonb - text"
> and operator jsonb - text[]" pair of operators for existence style
> deletion (matching "operator jsonb ? text", operating only on the top
> level), that will be pretty good. The fact that jsonb_delete() will
> have somewhat bridged the gap nesting-deletion-wise for 9.5 (without
> being usable through an operator) won't really matter then. I want to
> keep the "twist" I described out of any jsonb operators that are
> shipped, and only use it within functions.

To be clear: these two paragraphs are a proposal about how I'd like to
change things for 9.5 to make the jsonb operators more consistent than
the way things are in the master branch, while still offering nested
deletion through a function.

-- 
Peter Geoghegan

Re: Further issues with jsonb semantics, documentation

From

"David E. Wheeler"

Date:

04 June 2015, 20:13:24

On Jun 4, 2015, at 12:16 PM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:

> I'm just skimming here, but if a jsonb_path type is being proposed,

Is this not the purpose of JSQuery?
 https://code.google.com/p/gwtquery/wiki/JsQuery

David

Re: Further issues with jsonb semantics, documentation

From

Andrew Dunstan

Date:

05 June 2015, 00:14:18

On 06/04/2015 04:13 PM, David E. Wheeler wrote:
> On Jun 4, 2015, at 12:16 PM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
>
>> I'm just skimming here, but if a jsonb_path type is being proposed,
> Is this not the purpose of JSQuery?
>
>    https://code.google.com/p/gwtquery/wiki/JsQuery
>

No, it doesn't seem to have anything at all to do with it. What I 
suggested would be an implementation of json_pointer - see 
<http://tools.ietf.org/html/rfc6901>

cheers

andrew

Re: Further issues with jsonb semantics, documentation

From

Andrew Dunstan

Date:

05 June 2015, 00:32:12

On 06/04/2015 03:10 PM, Peter Geoghegan wrote:
> On Thu, Jun 4, 2015 at 6:43 AM, Andrew Dunstan <andrew@dunslane.net> wrote:
>>> I've noticed some more issues with the jsonb documentation, and the
>>> new jsonb stuff generally. I didn't set out to give Andrew feedback on
>>> the semantics weeks after feature freeze, but unfortunately this feels
>>> like another discussion that we need to have now rather than later.
>> Yes, I wish you had raised these issues months ago when this was published.
>> That's the way the process is supposed to work.
> I also wish that I managed to do that. As you know, I was working
> overtime to get UPSERT into 9.5 during that period. Finding time to
> review things is always difficult, and I which I could do more.
>
>

That's happened to me in the past. My view has generally been that in 
that case I have missed my chance, and I need to live with what others 
have done. That seems to me preferable to tearing up any pretense we 
might have to be following a defined development process.

I should point out that I have already gone out of my way to accommodate 
concerns you expressed extremely late about this set of features, and I 
have lately indicated another area where we can adjust it to meet your 
objections. Re-litigating this wholesale seems quite a different kettle 
of fish, however.

Just in case it's not clear: I am not at all happy.

cheers

andrew

Re: Further issues with jsonb semantics, documentation

From

Peter Geoghegan

Date:

05 June 2015, 00:43:34

On Thu, Jun 4, 2015 at 12:10 PM, Peter Geoghegan <pg@heroku.com> wrote:
> jsonb_delete() should certainly be able to traverse objects, but it's
> much less clear that it should be able to *traverse* arrays (affecting
> arrays is a different story, though). That's why I proposed not
> supporting traversing arrays with it or with jsonb_set(). This would
> also removes the questionable second "shadow type system" within the
> text[] rhs operand too, which seems like a good thing.

Here is a further example of why I find this new "shadow type" system
for rhs text[] operands to be pretty questionable:

postgres=# select jsonb_set('[1, 2, 3, 4, 5,6,7,8,9,10,11,12]',
'{"5e10"}'::text[], '"Input unsanitized"') ;                        jsonb_set
-----------------------------------------------------------[1, 2, 3, 4, 5, "Input unsanitized", 7, 8, 9, 10, 11, 12]
(1 row)


BTW, there is a bug here -- strtol() needs additional defenses [1]
(before casting to int):

postgres=# select jsonb_set('[1, 2, 3, 4,
5,6,7,8,9,10,11,12,13,14,15,16,17,18]',
'{"9223372036854775806"}'::text[], '"Input unsanitized"', false) ;                                   jsonb_set
----------------------------------------------------------------------------------[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,
12,13, 14, 15, 16, "Input
 
unsanitized", 18]
(1 row)

[1]
https://www.securecoding.cert.org/confluence/display/cplusplus/INT06-CPP.+Use+strtol()+or+a+related+function+to+convert+a+string+token+to+an+integer
-- 
Peter Geoghegan

Re: Further issues with jsonb semantics, documentation

From

Peter Geoghegan

Date:

05 June 2015, 00:50:43

On Thu, Jun 4, 2015 at 5:31 PM, Andrew Dunstan <andrew@dunslane.net> wrote:
> Just in case it's not clear: I am not at all happy.

I've offered to help you with several of the issue I raised; I had
intended to offer more help.

The issues I raise seem pretty substantive to me. I'm trying to make
sure that we don't end up with something bad that we need to live with
indefinitely. I have offered you something not far off an "everybody
wins" proposal (i.e. no real loss of functionality), and that was my
first proposal.

I don't know what more I could do for you.  I *am* trying to help.
-- 
Peter Geoghegan

Re: Further issues with jsonb semantics, documentation

From

Andrew Dunstan

Date:

05 June 2015, 17:23:54

On 06/04/2015 03:16 PM, Alvaro Herrera wrote:
> I'm just skimming here, but if a jsonb_path type is being proposed,
> perhaps it would be better not to have operators that take text or
> text[] as second argument.  We can provide that functionality with just
> functions.  For example, it will be confusing to have
>
> jsonb 'some json value' - '{foo,bar}'
>
> operate too differently from
>
> jsonb 'some json value' - json_path '{foo,bar}'
>
> And it will be a nasty regression to have 9.5 allow
> jsonb 'some json value' - '{foo,bar}'
> and then have 9.6 error out with "ambiguous operator" when the json_path
> thing is added.
>

The boat has sailed on this. We have had the #> and #>> operators since 
9.3, i.e. even before we got the operators that Peter wants us to adopt 
the usage from, and their right hand operands are text arrays with the 
same path semantics.

'some jsonb value' - '{foo,bar}' is already ambiguous  - the RH operand 
could be a single text datum or a text array.

cheers

andrew

Re: Further issues with jsonb semantics, documentation

From

Peter Geoghegan

Date:

05 June 2015, 17:39:57

On Thu, Jun 4, 2015 at 12:10 PM, Peter Geoghegan <pg@heroku.com> wrote:
>> But I agree that it's not a great contribution to science, especially since
>> the index will be applied to the list of elements in the somewhat
>> counter-intuitive storage order we use, and we could just raise an error if
>> we try to apply integer delete to an object instead of an array.
>
> Cool. Do you want to write a patch, or should I?
>
> Also, what about negative array subscripting (making the 9.4-era
> "operator jsonb -> integer" operator support that for consistency with
> the new "operator jsonb - integer" operator)? Should I write the
> patch? Will you commit it if I do?

Please let me know if you want me to write these two patches.

-- 
Peter Geoghegan

Re: Further issues with jsonb semantics, documentation

From

Andrew Dunstan

Date:

05 June 2015, 17:51:15

On 06/05/2015 01:39 PM, Peter Geoghegan wrote:
> On Thu, Jun 4, 2015 at 12:10 PM, Peter Geoghegan <pg@heroku.com> wrote:
>>> But I agree that it's not a great contribution to science, especially since
>>> the index will be applied to the list of elements in the somewhat
>>> counter-intuitive storage order we use, and we could just raise an error if
>>> we try to apply integer delete to an object instead of an array.
>> Cool. Do you want to write a patch, or should I?
>>
>> Also, what about negative array subscripting (making the 9.4-era
>> "operator jsonb -> integer" operator support that for consistency with
>> the new "operator jsonb - integer" operator)? Should I write the
>> patch? Will you commit it if I do?
> Please let me know if you want me to write these two patches.
>


Send the first one, I'm still thinking about the second one.

cheers

andrew

Re: Further issues with jsonb semantics, documentation

From

Alvaro Herrera

Date:

05 June 2015, 18:33:20

Andrew Dunstan wrote:
> 
> On 06/04/2015 03:16 PM, Alvaro Herrera wrote:
> >I'm just skimming here, but if a jsonb_path type is being proposed,
> >perhaps it would be better not to have operators that take text or
> >text[] as second argument.  We can provide that functionality with just
> >functions.  For example, it will be confusing to have
> >
> >jsonb 'some json value' - '{foo,bar}'
> >
> >operate too differently from
> >
> >jsonb 'some json value' - json_path '{foo,bar}'
> >
> >And it will be a nasty regression to have 9.5 allow
> >jsonb 'some json value' - '{foo,bar}'
> >and then have 9.6 error out with "ambiguous operator" when the json_path
> >thing is added.
> 
> The boat has sailed on this. We have had the #> and #>> operators since 9.3,
> i.e. even before we got the operators that Peter wants us to adopt the usage
> from, and their right hand operands are text arrays with the same path
> semantics.

Well, some boats sailed, but maybe those were different boats.  I don't
think we should shut discussion off only because we made some choice or
other in the past.  Since we haven't released yet, we can base decisions
on what's the most useful API for users, rather on what got committed in
the initial patch.

> 'some jsonb value' - '{foo,bar}' is already ambiguous  - the RH operand
> could be a single text datum or a text array.

Hmm, but that's not in 9.4, so we can still tweak it if necessary.

Consider this jsonb datum.  Nobody in their right mind would have a key
that looks like a path, I hear you say; yet I'm sure this is going to
happen.

alvherre=# select jsonb '{"a":"1", "b":"2", "c": {"a": "2"}, "{c,a}": "uh"}' ;                       jsonb
          
 
------------------------------------------------------{"a": "1", "b": "2", "c": {"a": "2"}, "{c,a}": "uh"}
(1 fila)

This seems pretty surprising to me:

-- here, the -(jsonb,text) operator is silently chosen, even though the
-- right operand looks like an array.  And we do the wrong thing.
alvherre=# select jsonb '{"a":"1", "b":"2", "c": {"a": "2"}}' -  '{c,a}';              ?column?                
---------------------------------------{"a": "1", "b": "2", "c": {"a": "2"}}
(1 fila)

-- here, the -(jsonb,text[]) operator is chosen
alvherre=# select jsonb '{"a":"1", "b":"2", "c": {"a": "2"}}' -  _text '{c,a}';          ?column?            
-------------------------------{"a": "1", "b": "2", "c": {}}
(1 fila)

But this seems worse to me, because we silently do nothing:

alvherre=# select jsonb '{"a":"1", "b":"2", "c": {"a": "2"}}' -  '{c,a}';              ?column?                
---------------------------------------{"a": "1", "b": "2", "c": {"a": "2"}}
(1 fila)


I think the first operator can be qualified as dangerous.  If you delete
that one, then it's fine because you can't do that query anymore because
of the conflict with -(jsonb, int).

alvherre=# select jsonb '{"a":"1", "b":"2", "c": {"a": "2"}}' -  '{c,a}';
ERROR:  operator is not unique: jsonb - unknown
LÍNEA 1: ...elect jsonb '{"a":"1", "b":"2", "c": {"a": "2"}}' -  '{c,a}'...
               ^
 
SUGERENCIA:  Could not choose a best candidate operator. You might need to add explicit type casts.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Further issues with jsonb semantics, documentation

From

Petr Jelinek

Date:

05 June 2015, 19:09:22

On Fri, Jun 5, 2015 at 8:32 , Alvaro Herrera <alvherre@2ndquadrant.com>
wrote:
> Andrew Dunstan wrote:
>
>>  'some jsonb value' - '{foo,bar}' is already ambiguous  - the RH
>> operand
>>  could be a single text datum or a text array.
>
> Hmm, but that's not in 9.4, so we can still tweak it if necessary.
>
> Consider this jsonb datum.  Nobody in their right mind would have a
> key
> that looks like a path, I hear you say; yet I'm sure this is going to
> happen.
>
> alvherre=# select jsonb '{"a":"1", "b":"2", "c": {"a": "2"}, "{c,a}":
> "uh"}' ;
>                         jsonb
> ------------------------------------------------------
>  {"a": "1", "b": "2", "c": {"a": "2"}, "{c,a}": "uh"}
> (1 fila)
>
> This seems pretty surprising to me:
>
> -- here, the -(jsonb,text) operator is silently chosen, even though
> the
> -- right operand looks like an array.  And we do the wrong thing.
> alvherre=# select jsonb '{"a":"1", "b":"2", "c": {"a": "2"}}' -
> '{c,a}';
>                ?column?
> ---------------------------------------
>  {"a": "1", "b": "2", "c": {"a": "2"}}
> (1 fila)
>
> -- here, the -(jsonb,text[]) operator is chosen
> alvherre=# select jsonb '{"a":"1", "b":"2", "c": {"a": "2"}}' -
> _text '{c,a}';
>            ?column?
> -------------------------------
>  {"a": "1", "b": "2", "c": {}}
> (1 fila)
>
> But this seems worse to me, because we silently do nothing:
>
> alvherre=# select jsonb '{"a":"1", "b":"2", "c": {"a": "2"}}' -
> '{c,a}';
>                ?column?
> ---------------------------------------
>  {"a": "1", "b": "2", "c": {"a": "2"}}
> (1 fila)
>
>
> I think the first operator can be qualified as dangerous.  If you
> delete
> that one, then it's fine because you can't do that query anymore
> because
> of the conflict with -(jsonb, int).
>
> alvherre=# select jsonb '{"a":"1", "b":"2", "c": {"a": "2"}}' -
> '{c,a}';
> ERROR:  operator is not unique: jsonb - unknown
> LÍNEA 1: ...elect jsonb '{"a":"1", "b":"2", "c": {"a": "2"}}' -
> '{c,a}'...
>                                                               ^
> SUGERENCIA:  Could not choose a best candidate operator. You might
> need to add explicit type casts.


That's a good point, and it won't get any better if/when we add the
json point support in 9.6 since the syntax would be something like
select jsonb '{"a":"1", "b":"2", "c": {"a": "2"}}' - '/c/a'; and we
will again silently do nothing. That's going to cause bugs in
applications using this.

-- Petr Jelinek                  http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services

Re: Further issues with jsonb semantics, documentation

From

Jim Nasby

Date:

05 June 2015, 19:31:06

On 6/5/15 2:08 PM, Petr Jelinek wrote:
> That's a good point, and it won't get any better if/when we add the json
> point support in 9.6 since the syntax would be something like select
> jsonb '{"a":"1", "b":"2", "c": {"a": "2"}}' - '/c/a'; and we will again
> silently do nothing. That's going to cause bugs in applications using this.

Yeah, this is a miniature version of the pain I've felt with variant: 
trying to get sane casting for a data type that encompasses other types 
in current Postgres is essentially impossible. Your only option is to 
put implicit or assignment casts in and cross your fingers, or to do 
only explicit casts and force the user to cast everything (which is a 
PITA). Even a json_pointer type may not help this much unless we have 
some way to reliable transform an unknown into a json_pointer.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Data in Trouble? Get it in Treble! http://BlueTreble.com

Re: Further issues with jsonb semantics, documentation

From

Alvaro Herrera

Date:

05 June 2015, 19:52:12

Jim Nasby wrote:
> On 6/5/15 2:08 PM, Petr Jelinek wrote:
> >That's a good point, and it won't get any better if/when we add the json
> >point support in 9.6 since the syntax would be something like select
> >jsonb '{"a":"1", "b":"2", "c": {"a": "2"}}' - '/c/a'; and we will again
> >silently do nothing. That's going to cause bugs in applications using this.
> 
> Yeah, this is a miniature version of the pain I've felt with variant: trying
> to get sane casting for a data type that encompasses other types in current
> Postgres is essentially impossible.

I'm not sure this is the same problem.  But anyway I think requiring
explicit casts in this stuff is a good thing -- relying on implicit
cast to text, when most useful behavior uses other types, seems bad.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Further issues with jsonb semantics, documentation

From

Andrew Dunstan

Date:

05 June 2015, 20:05:24

On 06/05/2015 02:32 PM, Alvaro Herrera wrote:
>> 'some jsonb value' - '{foo,bar}' is already ambiguous  - the RH operand
>> could be a single text datum or a text array.
> Hmm, but that's not in 9.4, so we can still tweak it if necessary.
>
> Consider this jsonb datum.  Nobody in their right mind would have a key
> that looks like a path, I hear you say; yet I'm sure this is going to
> happen.
>
> alvherre=# select jsonb '{"a":"1", "b":"2", "c": {"a": "2"}, "{c,a}": "uh"}' ;
>                          jsonb
> ------------------------------------------------------
>   {"a": "1", "b": "2", "c": {"a": "2"}, "{c,a}": "uh"}
> (1 fila)
>
> This seems pretty surprising to me:
>
> -- here, the -(jsonb,text) operator is silently chosen, even though the
> -- right operand looks like an array.  And we do the wrong thing.
> alvherre=# select jsonb '{"a":"1", "b":"2", "c": {"a": "2"}}' -  '{c,a}';
>                 ?column?
> ---------------------------------------
>   {"a": "1", "b": "2", "c": {"a": "2"}}
> (1 fila)
>
> -- here, the -(jsonb,text[]) operator is chosen
> alvherre=# select jsonb '{"a":"1", "b":"2", "c": {"a": "2"}}' -  _text '{c,a}';
>             ?column?
> -------------------------------
>   {"a": "1", "b": "2", "c": {}}
> (1 fila)
>
> But this seems worse to me, because we silently do nothing:
>
> alvherre=# select jsonb '{"a":"1", "b":"2", "c": {"a": "2"}}' -  '{c,a}';
>                 ?column?
> ---------------------------------------
>   {"a": "1", "b": "2", "c": {"a": "2"}}
> (1 fila)
>
>
> I think the first operator can be qualified as dangerous.  If you delete
> that one, then it's fine because you can't do that query anymore because
> of the conflict with -(jsonb, int).
>
> alvherre=# select jsonb '{"a":"1", "b":"2", "c": {"a": "2"}}' -  '{c,a}';
> ERROR:  operator is not unique: jsonb - unknown
> LÍNEA 1: ...elect jsonb '{"a":"1", "b":"2", "c": {"a": "2"}}' -  '{c,a}'...
>                                                                ^
> SUGERENCIA:  Could not choose a best candidate operator. You might need to add explicit type casts.



Yeah, Good point. Actually, if my memory serves me correctly (always a
dubious bet), the avoidance of that kind of ambiguity is why we
introduced the #> and #>> operators in the first place, after going
round and round for a while on what the API would look like. I should
have remembered that when this came around. Mea culpa.

So probably the least invasive change would be to rename the text[]
variant operator to something like "#-" and rename the corresponding
function to jsonb_delete_path.

We could also decide not to keep an operator at all, on the ground that
we think we'll implement a type that encapsulates json pointer in 9.6,
and just keep the renamed function.

cheers

andrew

Re: Further issues with jsonb semantics, documentation

From

Tom Lane

Date:

05 June 2015, 20:48:28

Andrew Dunstan <andrew@dunslane.net> writes:
> Yeah, Good point. Actually, if my memory serves me correctly (always a 
> dubious bet), the avoidance of that kind of ambiguity is why we 
> introduced the #> and #>> operators in the first place, after going 
> round and round for a while on what the API would look like. I should 
> have remembered that when this came around. Mea culpa.

> So probably the least invasive change would be to rename the text[] 
> variant operator to something like "#-" and rename the corresponding 
> function to jsonb_delete_path.

Not sure that's a great choice of operator name; consider for exampleselect 4#-1;
It's not immediately obvious whether the "-" is meant as a separate
unary minus.  There are heuristics in the lexer that try to deal with
cases like this, but it doesn't seem like a good plan to double down
on such heuristics always doing the right thing.
        regards, tom lane

Re: Further issues with jsonb semantics, documentation

From

Peter Geoghegan

Date:

05 June 2015, 22:13:59

On Fri, Jun 5, 2015 at 10:51 AM, Andrew Dunstan <andrew@dunslane.net> wrote:
>>> Also, what about negative array subscripting (making the 9.4-era
>>> "operator jsonb -> integer" operator support that for consistency with
>>> the new "operator jsonb - integer" operator)? Should I write the
>>> patch? Will you commit it if I do?

> Send the first one, I'm still thinking about the second one.

The first patch is attached.

Regardless of anything else, I see no reason to delay applying my
documentation patch for "operator jsonb - text" [1].

Thanks

[1] http://www.postgresql.org/message-id/CAM3SWZQFSWMi2aVi-Lun_JBYh-RfHQ3-0fm8TXpW8OLc+v8ZnQ@mail.gmail.com
--
Peter Geoghegan

Attachment

0001-Desupport-jsonb-subscript-deletion-on-objects.patch

Re: Further issues with jsonb semantics, documentation

From

Peter Geoghegan

Date:

05 June 2015, 22:21:54

On Fri, Jun 5, 2015 at 1:05 PM, Andrew Dunstan <andrew@dunslane.net> wrote:
> So probably the least invasive change would be to rename the text[] variant
> operator to something like "#-" and rename the corresponding function to
> jsonb_delete_path.
>
> We could also decide not to keep an operator at all, on the ground that we
> think we'll implement a type that encapsulates json pointer in 9.6, and just
> keep the renamed function.

Obviously I prefer the latter option, but the former is still an
improvement. To repeat myself, ambiguities around operators are not
the only problem: It seems no good to me that there is no way to
accomplish an equivalent outcome to that shown below with the
similarly-spelled operator you talk about (that is, the operator
currently spelled "operator jsonb - text[]"):

postgres=# select '["a", "c", "a"]'::jsonb - 'a';?column?
----------["c"]
(1 row)

With the operator currently spelled "operator jsonb - text[]", at the
very least you have to do this instead:

postgres=# select '["a", "c", "a"]'::jsonb - '{0}'::text[] - '{1}'::text[];?column?
----------["c"]
(1 row)

If nothing else, these operators are too dissimilar for overloading to
be helpful.
-- 
Peter Geoghegan

Re: Further issues with jsonb semantics, documentation

From

Andrew Dunstan

Date:

06 June 2015, 14:13:45

On 06/05/2015 04:48 PM, Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>> Yeah, Good point. Actually, if my memory serves me correctly (always a
>> dubious bet), the avoidance of that kind of ambiguity is why we
>> introduced the #> and #>> operators in the first place, after going
>> round and round for a while on what the API would look like. I should
>> have remembered that when this came around. Mea culpa.
>> So probably the least invasive change would be to rename the text[]
>> variant operator to something like "#-" and rename the corresponding
>> function to jsonb_delete_path.
> Not sure that's a great choice of operator name; consider for example
>     select 4#-1;
> It's not immediately obvious whether the "-" is meant as a separate
> unary minus.  There are heuristics in the lexer that try to deal with
> cases like this, but it doesn't seem like a good plan to double down
> on such heuristics always doing the right thing.
>
>             


Perhaps we should deprectae operator names ending in "-"?

cheers

andrew

Re: Further issues with jsonb semantics, documentation

From

Andrew Dunstan

Date:

10 June 2015, 18:48:17

On 06/05/2015 01:51 PM, Andrew Dunstan wrote:
>
> On 06/05/2015 01:39 PM, Peter Geoghegan wrote:
>> On Thu, Jun 4, 2015 at 12:10 PM, Peter Geoghegan <pg@heroku.com> wrote:
>>>> But I agree that it's not a great contribution to science, 
>>>> especially since
>>>> the index will be applied to the list of elements in the somewhat
>>>> counter-intuitive storage order we use, and we could just raise an 
>>>> error if
>>>> we try to apply integer delete to an object instead of an array.
>>> Cool. Do you want to write a patch, or should I?
>>>
>>> Also, what about negative array subscripting (making the 9.4-era
>>> "operator jsonb -> integer" operator support that for consistency with
>>> the new "operator jsonb - integer" operator)? Should I write the
>>> patch? Will you commit it if I do?
>> Please let me know if you want me to write these two patches.
>>
>
>
> Send the first one, I'm still thinking about the second one.
>

Sorry for the delay on this. I've been mostly off the grid, having an 
all too rare visit from Tom "Mr Enum" Dunstan, and I misunderstood what 
you were suggesting,

Please submit a patch to adjust the treatment of negative integers in 
the old functions to be consistent with their treatment in the new 
functions. i.e. in the range [-n,-1] they should refer to the 
corresponding element counting from the right.

cheers

andrew

Re: Further issues with jsonb semantics, documentation

From

Peter Geoghegan

Date:

10 June 2015, 20:02:30

On Wed, Jun 10, 2015 at 11:48 AM, Andrew Dunstan <andrew@dunslane.net> wrote:
> Sorry for the delay on this. I've been mostly off the grid, having an all
> too rare visit from Tom "Mr Enum" Dunstan, and I misunderstood what you were
> suggesting,

Thank you for working with me to address this. I've been busy with
other things the past few days too.

> Please submit a patch to adjust the treatment of negative integers in the
> old functions to be consistent with their treatment in the new functions.
> i.e. in the range [-n,-1] they should refer to the corresponding element
> counting from the right.

I've almost finished that patch. I'm currently blocked on deciding
what to do about the old path-orientated operators (#> and #>> for
json and jsonb types). It's rather painful to support pushing down
negative subscripting there -- maybe we should just not do so for
those variants, especially given that they're already notationally
inconsistent with the other operators that I'll be updating. What do
you think?

Maybe I'll come up with a simpler way of making that work by taking a
fresh look at it, but haven't done that yet.

My current, draft approach to making subscripting work with the json
variants (not the jsonb variants) is to use a second get_worker() call
in the event of a negative subscript, while making the first such call
(the existing get_worker() call) establish the number of top-level
array elements. That isn't beautiful, and involves some amount of
redundant work, but it's probably better than messing with
get_worker() in a more invasive way. Besides, this second get_worker()
call only needs to happen in the event of a negative subscript, and
I'm only really updating this (that is, updating routines like
json_array_element()) to preserve consistency with jsonb. What do you
think of that idea?

-- 
Peter Geoghegan

Re: Further issues with jsonb semantics, documentation

From

Andrew Dunstan

Date:

12 June 2015, 16:29:12

On 06/10/2015 04:02 PM, Peter Geoghegan wrote:
> On Wed, Jun 10, 2015 at 11:48 AM, Andrew Dunstan <andrew@dunslane.net> wrote:
>> Sorry for the delay on this. I've been mostly off the grid, having an all
>> too rare visit from Tom "Mr Enum" Dunstan, and I misunderstood what you were
>> suggesting,
> Thank you for working with me to address this. I've been busy with
> other things the past few days too.
>
>> Please submit a patch to adjust the treatment of negative integers in the
>> old functions to be consistent with their treatment in the new functions.
>> i.e. in the range [-n,-1] they should refer to the corresponding element
>> counting from the right.
> I've almost finished that patch. I'm currently blocked on deciding
> what to do about the old path-orientated operators (#> and #>> for
> json and jsonb types). It's rather painful to support pushing down
> negative subscripting there -- maybe we should just not do so for
> those variants, especially given that they're already notationally
> inconsistent with the other operators that I'll be updating. What do
> you think?
>
> Maybe I'll come up with a simpler way of making that work by taking a
> fresh look at it, but haven't done that yet.
>
> My current, draft approach to making subscripting work with the json
> variants (not the jsonb variants) is to use a second get_worker() call
> in the event of a negative subscript, while making the first such call
> (the existing get_worker() call) establish the number of top-level
> array elements. That isn't beautiful, and involves some amount of
> redundant work, but it's probably better than messing with
> get_worker() in a more invasive way. Besides, this second get_worker()
> call only needs to happen in the event of a negative subscript, and
> I'm only really updating this (that is, updating routines like
> json_array_element()) to preserve consistency with jsonb. What do you
> think of that idea?
>


Just took a quick look. My impression is that the jsonb case should be 
fairly easy. If the index is negative, add JB_ROOT_COUNT(container) to 
it and use that as the argument to getIthJsonbValueFromContainer().

I agree that the json case looks a bit nasty. Maybe a better approach 
would be to provide a function that, given a JsonLexContext, returns the 
number of array elements of the current array. In get_array_start we 
could call that if the relevant path element is negative and adjust it 
accordingly.

cheers

andrew

Re: Further issues with jsonb semantics, documentation

From

Andrew Dunstan

Date:

12 June 2015, 19:26:52

On 06/12/2015 12:29 PM, I wrote:
>
>
>
> I agree that the json case looks a bit nasty. Maybe a better approach
> would be to provide a function that, given a JsonLexContext, returns
> the number of array elements of the current array. In get_array_start
> we could call that if the relevant path element is negative and adjust
> it accordingly.
>
>



Here's some code for the count piece of that.

cheers

andrew

Attachment

json_count_array.patch

Re: Further issues with jsonb semantics, documentation

From

Peter Geoghegan

Date:

12 June 2015, 20:28:11

On Fri, Jun 12, 2015 at 12:26 PM, Andrew Dunstan <andrew@dunslane.net> wrote:
> Here's some code for the count piece of that.

Thanks. I'll look into integrating this with what I have.

BTW, on reflection I'm not so sure about my decision to not touch the
logic within jsonb_delete_idx() (commit b81c7b409). I probably should
have changed it in line with the attached patch as part of that
commit. What do you think?
--
Peter Geoghegan

Attachment

0001-Remove-dead-code-for-jsonb-subscript-deletion.patch

Re: Further issues with jsonb semantics, documentation

From

Peter Geoghegan

Date:

12 June 2015, 22:16:54

On Thu, Jun 4, 2015 at 5:43 PM, Peter Geoghegan <pg@heroku.com> wrote:
>
> BTW, there is a bug here -- strtol() needs additional defenses [1]
> (before casting to int):
>
> postgres=# select jsonb_set('[1, 2, 3, 4,
> 5,6,7,8,9,10,11,12,13,14,15,16,17,18]',
> '{"9223372036854775806"}'::text[], '"Input unsanitized"', false) ;
>                                     jsonb_set
> ----------------------------------------------------------------------------------
>  [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, "Input
> unsanitized", 18]
> (1 row)
>
> [1]
https://www.securecoding.cert.org/confluence/display/cplusplus/INT06-CPP.+Use+strtol()+or+a+related+function+to+convert+a+string+token+to+an+integer

I attach a fix for this bug. The commit message explains everything.


--
Peter Geoghegan

Attachment

0001-Fix-path-infrastructure-bug-affecting-jsonb_set.patch

Re: Further issues with jsonb semantics, documentation

From

Andrew Dunstan

Date:

12 June 2015, 23:31:18

On 06/12/2015 06:16 PM, Peter Geoghegan wrote:
> On Thu, Jun 4, 2015 at 5:43 PM, Peter Geoghegan <pg@heroku.com> wrote:
>> BTW, there is a bug here -- strtol() needs additional defenses [1]
>> (before casting to int):
>>
>> postgres=# select jsonb_set('[1, 2, 3, 4,
>> 5,6,7,8,9,10,11,12,13,14,15,16,17,18]',
>> '{"9223372036854775806"}'::text[], '"Input unsanitized"', false) ;
>>                                      jsonb_set
>> ----------------------------------------------------------------------------------
>>   [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, "Input
>> unsanitized", 18]
>> (1 row)
>>
>> [1]
https://www.securecoding.cert.org/confluence/display/cplusplus/INT06-CPP.+Use+strtol()+or+a+related+function+to+convert+a+string+token+to+an+integer
> I attach a fix for this bug. The commit message explains everything.



OK, pushed, although you'd have to be trying really hard to break this. 
Still, it's reasonable to defend against.

cheers

andrew

Re: Further issues with jsonb semantics, documentation

From

Peter Geoghegan

Date:

12 June 2015, 23:32:29

On Fri, Jun 12, 2015 at 4:31 PM, Andrew Dunstan <andrew@dunslane.net> wrote:
> OK, pushed, although you'd have to be trying really hard to break this.
> Still, it's reasonable to defend against.

I was trying really hard.   :-)

Thanks
-- 
Peter Geoghegan

Re: Further issues with jsonb semantics, documentation

From

Jim Nasby

Date:

22 June 2015, 18:54:49

On 6/5/15 3:51 PM, Alvaro Herrera wrote:
> Jim Nasby wrote:
>> On 6/5/15 2:08 PM, Petr Jelinek wrote:
>>> That's a good point, and it won't get any better if/when we add the json
>>> point support in 9.6 since the syntax would be something like select
>>> jsonb '{"a":"1", "b":"2", "c": {"a": "2"}}' - '/c/a'; and we will again
>>> silently do nothing. That's going to cause bugs in applications using this.
>>
>> Yeah, this is a miniature version of the pain I've felt with variant: trying
>> to get sane casting for a data type that encompasses other types in current
>> Postgres is essentially impossible.
>
> I'm not sure this is the same problem.  But anyway I think requiring
> explicit casts in this stuff is a good thing -- relying on implicit
> cast to text, when most useful behavior uses other types, seems bad.

I'm not sure I'm following, at least for jsonb. If there's only jsonb - 
json_pointer operator, why shouldn't we be able to resolve it? I suspect 
part of the answer to that problem is that we need to make the 
resolution of unknown smarter, or perhaps somehow configurable.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Data in Trouble? Get it in Treble! http://BlueTreble.com

Re: Further issues with jsonb semantics, documentation

From

Peter Geoghegan

Date:

23 June 2015, 01:19:56

On Wed, Jun 10, 2015 at 11:48 AM, Andrew Dunstan <andrew@dunslane.net> wrote:
> Please submit a patch to adjust the treatment of negative integers in the
> old functions to be consistent with their treatment in the new functions.
> i.e. in the range [-n,-1] they should refer to the corresponding element
> counting from the right.

This patch is attached, along with a separate patch which adds a
release note compatibility item.

See commit message for full details.

Thanks
--
Peter Geoghegan

Attachment

Re: Further issues with jsonb semantics, documentation

From

Peter Geoghegan

Date:

09 July 2015, 08:10:31

On Mon, Jun 22, 2015 at 6:19 PM, Peter Geoghegan <pg@heroku.com> wrote:
> On Wed, Jun 10, 2015 at 11:48 AM, Andrew Dunstan <andrew@dunslane.net> wrote:
>> Please submit a patch to adjust the treatment of negative integers in the
>> old functions to be consistent with their treatment in the new functions.
>> i.e. in the range [-n,-1] they should refer to the corresponding element
>> counting from the right.
>
> This patch is attached, along with a separate patch which adds a
> release note compatibility item.

Where are we on this? This is currently a 9.5 release blocker.

-- 
Peter Geoghegan

Re: Further issues with jsonb semantics, documentation

From

Andrew Dunstan

Date:

09 July 2015, 14:49:40

On 07/09/2015 04:10 AM, Peter Geoghegan wrote:
> On Mon, Jun 22, 2015 at 6:19 PM, Peter Geoghegan <pg@heroku.com> wrote:
>> On Wed, Jun 10, 2015 at 11:48 AM, Andrew Dunstan <andrew@dunslane.net> wrote:
>>> Please submit a patch to adjust the treatment of negative integers in the
>>> old functions to be consistent with their treatment in the new functions.
>>> i.e. in the range [-n,-1] they should refer to the corresponding element
>>> counting from the right.
>> This patch is attached, along with a separate patch which adds a
>> release note compatibility item.
> Where are we on this? This is currently a 9.5 release blocker.
>

I am on vacation right now, but I might have some time tomorrow to deal 
with it. If not, it will be Sunday or Monday when I get to it.

cheers

andrew

Re: Further issues with jsonb semantics, documentation

From

Peter Geoghegan

Date:

09 July 2015, 21:32:52

On Thu, Jul 9, 2015 at 7:49 AM, Andrew Dunstan <andrew@dunslane.net> wrote:
> I am on vacation right now, but I might have some time tomorrow to deal with
> it. If not, it will be Sunday or Monday when I get to it.

Okay. Thanks.


-- 
Peter Geoghegan

Re: Further issues with jsonb semantics, documentation

From

Robert Haas

Date:

17 July 2015, 18:37:47

On Thu, Jul 9, 2015 at 10:49 AM, Andrew Dunstan <andrew@dunslane.net> wrote:
>> Where are we on this? This is currently a 9.5 release blocker.
>
> I am on vacation right now, but I might have some time tomorrow to deal with
> it. If not, it will be Sunday or Monday when I get to it.

Is this still pending?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Further issues with jsonb semantics, documentation

From

Peter Geoghegan

Date:

17 July 2015, 18:48:27

On Fri, Jul 17, 2015 at 11:37 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> I am on vacation right now, but I might have some time tomorrow to deal with
>> it. If not, it will be Sunday or Monday when I get to it.
>
> Is this still pending?

Yes.


-- 
Peter Geoghegan

Re: Further issues with jsonb semantics, documentation

From

Andrew Dunstan

Date:

17 July 2015, 18:49:57

On 07/17/2015 02:37 PM, Robert Haas wrote:
> On Thu, Jul 9, 2015 at 10:49 AM, Andrew Dunstan <andrew@dunslane.net> wrote:
>>> Where are we on this? This is currently a 9.5 release blocker.
>> I am on vacation right now, but I might have some time tomorrow to deal with
>> it. If not, it will be Sunday or Monday when I get to it.
> Is this still pending?
>

Yes, been tied up a bit unexpectedly this week, am just getting down to 
it right now.

cheers

andrew

Re: Further issues with jsonb semantics, documentation

From

Andrew Dunstan

Date:

18 July 2015, 01:20:09

On 07/17/2015 02:49 PM, Andrew Dunstan wrote:
>
> On 07/17/2015 02:37 PM, Robert Haas wrote:
>> On Thu, Jul 9, 2015 at 10:49 AM, Andrew Dunstan <andrew@dunslane.net> 
>> wrote:
>>>> Where are we on this? This is currently a 9.5 release blocker.
>>> I am on vacation right now, but I might have some time tomorrow to 
>>> deal with
>>> it. If not, it will be Sunday or Monday when I get to it.
>> Is this still pending?
>>
>
> Yes, been tied up a bit unexpectedly this week, am just getting down 
> to it right now.
>
>


OK, I have committed this and updated the open issues list on the wiki.

cheers

andrew

Re: Further issues with jsonb semantics, documentation

From

Peter Geoghegan

Date:

18 July 2015, 01:50:04

On Fri, Jul 17, 2015 at 6:20 PM, Andrew Dunstan <andrew@dunslane.net> wrote:
> OK, I have committed this and updated the open issues list on the wiki.

Thanks, Andrew.


-- 
Peter Geoghegan