Re: Further issues with jsonb semantics, documentation - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: Further issues with jsonb semantics, documentation
Date
Msg-id CAM3SWZTzO2+5trqyDM=ggAWgSx5iakkoYXE2fV10yxmzpi8vBQ@mail.gmail.com
Whole thread Raw
In response to Re: Further issues with jsonb semantics, documentation  (Andrew Dunstan <andrew@dunslane.net>)
Responses Re: Further issues with jsonb semantics, documentation  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Re: Further issues with jsonb semantics, documentation  (Andrew Dunstan <andrew@dunslane.net>)
Re: Further issues with jsonb semantics, documentation  (Peter Geoghegan <pg@heroku.com>)
Re: Further issues with jsonb semantics, documentation  (Peter Geoghegan <pg@heroku.com>)
List pgsql-hackers
On Thu, Jun 4, 2015 at 6:43 AM, Andrew Dunstan <andrew@dunslane.net> wrote:
>> I've noticed some more issues with the jsonb documentation, and the
>> new jsonb stuff generally. I didn't set out to give Andrew feedback on
>> the semantics weeks after feature freeze, but unfortunately this feels
>> like another discussion that we need to have now rather than later.
>
> Yes, I wish you had raised these issues months ago when this was published.
> That's the way the process is supposed to work.

I also wish that I managed to do that. As you know, I was working
overtime to get UPSERT into 9.5 during that period. Finding time to
review things is always difficult, and I which I could do more.

>> "operator jsonb - integer"
>> ===================

>> I think it was a bad idea to allow array-style removal of object
>> key/value pairs. ISTM that it implies a level of stability in the
>> ordering that doesn't make sense. Besides, is it really all that
>> useful?

> But I agree that it's not a great contribution to science, especially since
> the index will be applied to the list of elements in the somewhat
> counter-intuitive storage order we use, and we could just raise an error if
> we try to apply integer delete to an object instead of an array.

Cool. Do you want to write a patch, or should I?

Also, what about negative array subscripting (making the 9.4-era
"operator jsonb -> integer" operator support that for consistency with
the new "operator jsonb - integer" operator)? Should I write the
patch? Will you commit it if I do?

>> "operator jsonb - text[]" (and *nested* deletion more generally)
>> ===============================================
>>
>> Summary: I think that this operator has many problems, and should be
>> scraped (although only as an operator). IMV nested deletion should
>> only be handled by functions, and the way that nested deletion works
>> in general should be slightly adjusted.
>>
>>
>> The new "operator jsonb - text[]" operator is confusingly inconsistent
>> with:
>>
>> A) "operator jsonb text"
>
> What exactly is this? I have no idea what you're talking about.

It's a typo -- I meant "operator jsonb - text". The fact that
"operator jsonb - text" and "operator jsonb - text[]" diverge in the
way they do seems confusing.

> The fact that hstore uses it that way doesn't really concern me. Since
> hstore isn't nested it doesn't make a whole lot of sense for it to mean
> anything else there.

It seems pretty obvious to me that it makes just as much sense as in
hstore. In hstore, you might want to delete multiple key/value pairs
at once, for exactly the same reason as you might want to with jsonb.
Certainly, you'll also want to support nested deletion with jsonb, but
that's beside the point.

> But json(b) is nested, and jsonb - path seems quite a
> reasonable treatment, something you're much more likely to want to do than
> removeing top level elements in bulk.

Probably true. I think that this interface for nested deletion is
complicated enough and inconsistent enough that I'd rather not have an
operator at all, just a function (so somewhat like jsonb_set() --
jsonb_delete()).  That is my main point on "operator jsonb - text[]";
I think the interface is complicated and inconsistent with everything
else for no good reason.

>> Regarding nested deletion behavior more generally, consider this
>> example of how this can work out badly:
>>
>> postgres=# select jsonb_delete(jsonb_set('["a"]', '{5}', '"b"'), '{5}')  ;
>>   jsonb_delete
>> --------------
>>   ["a", "b"]
>> (1 row)
>>
>> Here, we're adding and then deleting an array element at offset 5 (the
>> string "b"). But the element is never deleted by the outer
>> jsonb_delete(), because we can't rely on the element actually being
>> stored at offset 5. Seems a bit fragile.
>
>
> The behaviour of jsonb_set is pretty explicitly documented. If we wanted to
> do something else then we'd have to disable the special meaning given to
> negative indices, but that would mean in turn we wouldn't be able to prepend
> to an array.

jsonb_delete() should certainly be able to traverse objects, but it's
much less clear that it should be able to *traverse* arrays (affecting
arrays is a different story, though). That's why I proposed not
supporting traversing arrays with it or with jsonb_set(). This would
also removes the questionable second "shadow type system" within the
text[] rhs operand too, which seems like a good thing.

I think that traversing arrays in nested documents is a rare
requirement, because the ordering within arrays is unstable. If you
already know the ordinal number of the thing you want to nuke, then
you probably have already locked the row, and you might as well
manipulate the JSON using Javascript or Python at that stage.

Making jsonb_delete() buy into the "operator jsonb ? text" idea of a
key (a thing that it must delete) would also allow jsonb_delete() to
reliably delete particular strings in arrays, which actually does make
a lot of sense (think of arrays of "tags"). But FWIW it's the
inconsistency that bothers me most.

>> More importantly, consider the inconsistency with "operator jsonb
>> text" ("point A" above):
>>
>> postgres=# select '["a"]'::jsonb  ?| '{a}'::text[]; -- historic/9.4
>> behavior
>>   ?column?
>> ----------
>>   t
>> (1 row)
>>
>> postgres=# select '["a"]'::jsonb  - '{a}'::text[]; -- new to 9.5
>> operator, does not delete!
>>   ?column?
>> ----------
>>   ["a"]
>> (1 row)
>
> You are conflating two different things here, quite pointlessly. The RH
> operand of ?| is not a path, whereas the RH operand of this - variant is.
> The fact that they are both text arrays doesn't mean that they should mean
> the same thing. And this is really the whole problem with the rest of your
> analysis.

I'm not conflating anything -- obviously I understand that "operator
jsonb - text[]" is not supposed to be an "any" variant of "operator
jsonb - text". My point is that a reasonable person could infer that
it *is* an "any" variant of "operator jsonb - text", based on hstore's
behavior for its two deletion operators, and based on the other jsonb
operators, and based on the fact that it will often behave that way
anyway.

-- 
Peter Geoghegan



pgsql-hackers by date:

Previous
From: Neil Tiffin
Date:
Subject: Re: RFC: Remove contrib entirely
Next
From: Alvaro Herrera
Date:
Subject: Re: Further issues with jsonb semantics, documentation