Thread: jsonb concatenate operator's semantics seem questionable

jsonb concatenate operator's semantics seem questionable

From
Peter Geoghegan
Date:
Another thing that I noticed about the new jsonb stuff is that the
concatenate operator is based on the hstore one. This works as
expected:

postgres=# select '{"a":1}'::jsonb || '{"a":2}';
?column?
----------
 {"a": 2}
(1 row)

However, the nesting doesn't "match up" -- containers are not merged
beyond the least-nested level:

postgres=# select '{"a":{"nested":1}}'::jsonb || '{"a":{"also nested":2}}';
         ?column?
---------------------------
 {"a": {"also nested": 2}}
(1 row)

This feels wrong to me. When jsonb was initially introduced, we took
inspiration for the *containment* ("operator @> jsonb") semantics from
hstore, but since jsonb is nested it worked in a nested fashion. At
the top level and with no nested containers there was no real
difference, but we had to consider the behavior of more nested levels
carefully (the containment operator is clearly the most important
jsonb operator). I had envisaged that with the concatenation of jsonb,
concatenation would similarly behave in a nested fashion. Under this
scheme, the above query would perform nested concatenation as follows:

postgres=# select '{"a":{"nested":1}}'::jsonb || '{"a":{"also
nested":2}}'; -- does not match actual current behavior
         ?column?
---------------------------
 {"a": {"nested":1, "also nested": 2}}
(1 row)

Now, I think it's good that the minus operator ("operator - text" and
friends) discussed on the nearby thread accepts a text (or int)
argument and remove string elements/pairs at the top level only. This
works exactly the same as existence (I happen to think that removing
elements/pairs at a nested level is likely to be more trouble than
it's worth, and so I don't really like the new "jsonb - text[]"
operator much, because it accepts a Postgres (not JSON) array of texts
that constitute a path, which feels odd). So I have no issue with at
least the plain minus operators' semantics. But I think that the
concatenate operator's current semantics are significantly less useful
than they could be, and are not consistent with the overall design of
jsonb.

I'm particularly concerned about a table containing many homogeneously
structured, deeply nested jsonb datums (think of the delicious URLs
dataset that jsonb was originally tested using for a good example of
that -- this is quite representative of how people use jsonb in the
real world). It would be almost impossible to perform insert-or-update
type operations to these deeply nested elements using hstore style
concatenation. You'd almost invariably end up removing a bunch of
irrelevant nested values of the documents, when you only intended to
update one deeply nested value.

Looking back at the discussion of the new jsonb stuff, a concern was
raised along these lines by Ilya Ashchepkov [1], but this was
dismissed. I feel pretty strongly that this should be revisited. I'm
willing to concede that we might not want to always merge containers
that are found in the same position during concatenation, but I think
it's more likely that we do. As with containment, my sense is that
there should be nothing special about the nesting level -- it should
not influence whether we merge rather than overwrite the operator's
lhs container (with or into the rhs container). Not everyone will
agree with this [2].

I'm sorry that I didn't get to this sooner, but I was rather busy when
it was being discussed.

[1] http://www.postgresql.org/message-id/55006879.2050601@dunslane.net
[2] http://www.postgresql.org/message-id/54EF61DD.7040208@agliodbs.com
--
Peter Geoghegan



Re: jsonb concatenate operator's semantics seem questionable

From
Robert Haas
Date:
> On May 16, 2015, at 10:56 PM, Peter Geoghegan <pg@heroku.com> wrote:
>
> Another thing that I noticed about the new jsonb stuff is that the
> concatenate operator is based on the hstore one. This works as
> expected:
>
> postgres=# select '{"a":1}'::jsonb || '{"a":2}';
> ?column?
> ----------
> {"a": 2}
> (1 row)
>
> However, the nesting doesn't "match up" -- containers are not merged
> beyond the least-nested level:
>
> postgres=# select '{"a":{"nested":1}}'::jsonb || '{"a":{"also nested":2}}';
>         ?column?
> ---------------------------
> {"a": {"also nested": 2}}
> (1 row)
>
> This feels wrong to me. When jsonb was initially introduced, we took
> inspiration for the *containment* ("operator @> jsonb") semantics from
> hstore, but since jsonb is nested it worked in a nested fashion. At
> the top level and with no nested containers there was no real
> difference, but we had to consider the behavior of more nested levels
> carefully (the containment operator is clearly the most important
> jsonb operator). I had envisaged that with the concatenation of jsonb,
> concatenation would similarly behave in a nested fashion. Under this
> scheme, the above query would perform nested concatenation as follows:
>
> postgres=# select '{"a":{"nested":1}}'::jsonb || '{"a":{"also
> nested":2}}'; -- does not match actual current behavior
>         ?column?
> ---------------------------
> {"a": {"nested":1, "also nested": 2}}
> (1 row)

I'm not as much of a JSON user as some here, for sure, but for what it's worth my intuition here matches yours.

...Robert


Re: jsonb concatenate operator's semantics seem questionable

From
Andrew Dunstan
Date:
On 05/16/2015 10:56 PM, Peter Geoghegan wrote:
> Another thing that I noticed about the new jsonb stuff is that the
> concatenate operator is based on the hstore one. This works as
> expected:
>
> postgres=# select '{"a":1}'::jsonb || '{"a":2}';
> ?column?
> ----------
>   {"a": 2}
> (1 row)
>
> However, the nesting doesn't "match up" -- containers are not merged
> beyond the least-nested level:
>
> postgres=# select '{"a":{"nested":1}}'::jsonb || '{"a":{"also nested":2}}';
>           ?column?
> ---------------------------
>   {"a": {"also nested": 2}}
> (1 row)
>
> This feels wrong to me. When jsonb was initially introduced, we took
> inspiration for the *containment* ("operator @> jsonb") semantics from
> hstore, but since jsonb is nested it worked in a nested fashion. At
> the top level and with no nested containers there was no real
> difference, but we had to consider the behavior of more nested levels
> carefully (the containment operator is clearly the most important
> jsonb operator). I had envisaged that with the concatenation of jsonb,
> concatenation would similarly behave in a nested fashion. Under this
> scheme, the above query would perform nested concatenation as follows:
>
> postgres=# select '{"a":{"nested":1}}'::jsonb || '{"a":{"also
> nested":2}}'; -- does not match actual current behavior
>           ?column?
> ---------------------------
>   {"a": {"nested":1, "also nested": 2}}
> (1 row)
>
> Now, I think it's good that the minus operator ("operator - text" and
> friends) discussed on the nearby thread accepts a text (or int)
> argument and remove string elements/pairs at the top level only. This
> works exactly the same as existence (I happen to think that removing
> elements/pairs at a nested level is likely to be more trouble than
> it's worth, and so I don't really like the new "jsonb - text[]"
> operator much, because it accepts a Postgres (not JSON) array of texts
> that constitute a path, which feels odd). So I have no issue with at
> least the plain minus operators' semantics. But I think that the
> concatenate operator's current semantics are significantly less useful
> than they could be, and are not consistent with the overall design of
> jsonb.
>
> I'm particularly concerned about a table containing many homogeneously
> structured, deeply nested jsonb datums (think of the delicious URLs
> dataset that jsonb was originally tested using for a good example of
> that -- this is quite representative of how people use jsonb in the
> real world). It would be almost impossible to perform insert-or-update
> type operations to these deeply nested elements using hstore style
> concatenation. You'd almost invariably end up removing a bunch of
> irrelevant nested values of the documents, when you only intended to
> update one deeply nested value.
>
> Looking back at the discussion of the new jsonb stuff, a concern was
> raised along these lines by Ilya Ashchepkov [1], but this was
> dismissed. I feel pretty strongly that this should be revisited. I'm
> willing to concede that we might not want to always merge containers
> that are found in the same position during concatenation, but I think
> it's more likely that we do. As with containment, my sense is that
> there should be nothing special about the nesting level -- it should
> not influence whether we merge rather than overwrite the operator's
> lhs container (with or into the rhs container). Not everyone will
> agree with this [2].
>
> I'm sorry that I didn't get to this sooner, but I was rather busy when
> it was being discussed.
>
> [1] http://www.postgresql.org/message-id/55006879.2050601@dunslane.net
> [2] http://www.postgresql.org/message-id/54EF61DD.7040208@agliodbs.com

Historical note: I think it's based on the nested hstore work, not on 
current hstore, but Dmitry can answer on that.

I didn't dismiss this because it was a bad idea, but because it was too 
late in the process. If there is a consensus that we need to address 
this now then I'm happy to reopen that, but given the recent amount of 
angst about process I'm certainly not going to make such a decision 
unilaterally.

Personally, I think there is plenty of room for both operations, and I 
can see use cases for both. If I were designing I'd leave || as it is 
now and add a + operation to do a recursive merge. I'm not sure how much 
work that would be. Not huge but not trivial either.

cheers

andrew






Re: jsonb concatenate operator's semantics seem questionable

From
Petr Jelinek
Date:
On 17/05/15 16:04, Andrew Dunstan wrote:
>
> On 05/16/2015 10:56 PM, Peter Geoghegan wrote:
>> Another thing that I noticed about the new jsonb stuff is that the
>> concatenate operator is based on the hstore one. This works as
>> expected:
>>
>> postgres=# select '{"a":1}'::jsonb || '{"a":2}';
>> ?column?
>> ----------
>>   {"a": 2}
>> (1 row)
>>
>> However, the nesting doesn't "match up" -- containers are not merged
>> beyond the least-nested level:
>>
>> postgres=# select '{"a":{"nested":1}}'::jsonb || '{"a":{"also
>> nested":2}}';
>>           ?column?
>> ---------------------------
>>   {"a": {"also nested": 2}}
>> (1 row)
>>
>> This feels wrong to me. When jsonb was initially introduced, we took
>> inspiration for the *containment* ("operator @> jsonb") semantics from
>> hstore, but since jsonb is nested it worked in a nested fashion. At
>> the top level and with no nested containers there was no real
>> difference, but we had to consider the behavior of more nested levels
>> carefully (the containment operator is clearly the most important
>> jsonb operator). I had envisaged that with the concatenation of jsonb,
>> concatenation would similarly behave in a nested fashion. Under this
>> scheme, the above query would perform nested concatenation as follows:
>>
>> postgres=# select '{"a":{"nested":1}}'::jsonb || '{"a":{"also
>> nested":2}}'; -- does not match actual current behavior
>>           ?column?
>> ---------------------------
>>   {"a": {"nested":1, "also nested": 2}}
>> (1 row)
>>
>> Now, I think it's good that the minus operator ("operator - text" and
>> friends) discussed on the nearby thread accepts a text (or int)
>> argument and remove string elements/pairs at the top level only. This
>> works exactly the same as existence (I happen to think that removing
>> elements/pairs at a nested level is likely to be more trouble than
>> it's worth, and so I don't really like the new "jsonb - text[]"
>> operator much, because it accepts a Postgres (not JSON) array of texts
>> that constitute a path, which feels odd). So I have no issue with at
>> least the plain minus operators' semantics. But I think that the
>> concatenate operator's current semantics are significantly less useful
>> than they could be, and are not consistent with the overall design of
>> jsonb.
>>
>
> Historical note: I think it's based on the nested hstore work, not on
> current hstore, but Dmitry can answer on that.
>
> I didn't dismiss this because it was a bad idea, but because it was too
> late in the process. If there is a consensus that we need to address
> this now then I'm happy to reopen that, but given the recent amount of
> angst about process I'm certainly not going to make such a decision
> unilaterally.
>
> Personally, I think there is plenty of room for both operations, and I
> can see use cases for both. If I were designing I'd leave || as it is
> now and add a + operation to do a recursive merge. I'm not sure how much
> work that would be. Not huge but not trivial either.
>

Agreed, if you look at jquery for example, the extend() method by 
default behaves like our current || and you have to specify that you 
want deep merge if you want the behavior described by Peter. So there is 
definitely point for both, at this time we just support only one of 
them, that's all.


--  Petr Jelinek                  http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training &
Services



Re: jsonb concatenate operator's semantics seem questionable

From
Dmitry Dolgov
Date:
Historical note: I think it's based on the nested hstore work, not on current hstore, but Dmitry can answer on that.

Yes, you're right.
And I agree with thoughts above, that both concatenation modes ("simple" and "deep") definitely can be useful. I can try to figure out how much work that would be to modify the IteratorConcat function (or adapt Ilya's solution)

On 17 May 2015 at 21:16, Petr Jelinek <petr@2ndquadrant.com> wrote:
On 17/05/15 16:04, Andrew Dunstan wrote:

On 05/16/2015 10:56 PM, Peter Geoghegan wrote:
Another thing that I noticed about the new jsonb stuff is that the
concatenate operator is based on the hstore one. This works as
expected:

postgres=# select '{"a":1}'::jsonb || '{"a":2}';
?column?
----------
  {"a": 2}
(1 row)

However, the nesting doesn't "match up" -- containers are not merged
beyond the least-nested level:

postgres=# select '{"a":{"nested":1}}'::jsonb || '{"a":{"also
nested":2}}';
          ?column?
---------------------------
  {"a": {"also nested": 2}}
(1 row)

This feels wrong to me. When jsonb was initially introduced, we took
inspiration for the *containment* ("operator @> jsonb") semantics from
hstore, but since jsonb is nested it worked in a nested fashion. At
the top level and with no nested containers there was no real
difference, but we had to consider the behavior of more nested levels
carefully (the containment operator is clearly the most important
jsonb operator). I had envisaged that with the concatenation of jsonb,
concatenation would similarly behave in a nested fashion. Under this
scheme, the above query would perform nested concatenation as follows:

postgres=# select '{"a":{"nested":1}}'::jsonb || '{"a":{"also
nested":2}}'; -- does not match actual current behavior
          ?column?
---------------------------
  {"a": {"nested":1, "also nested": 2}}
(1 row)

Now, I think it's good that the minus operator ("operator - text" and
friends) discussed on the nearby thread accepts a text (or int)
argument and remove string elements/pairs at the top level only. This
works exactly the same as existence (I happen to think that removing
elements/pairs at a nested level is likely to be more trouble than
it's worth, and so I don't really like the new "jsonb - text[]"
operator much, because it accepts a Postgres (not JSON) array of texts
that constitute a path, which feels odd). So I have no issue with at
least the plain minus operators' semantics. But I think that the
concatenate operator's current semantics are significantly less useful
than they could be, and are not consistent with the overall design of
jsonb.


Historical note: I think it's based on the nested hstore work, not on
current hstore, but Dmitry can answer on that.

I didn't dismiss this because it was a bad idea, but because it was too
late in the process. If there is a consensus that we need to address
this now then I'm happy to reopen that, but given the recent amount of
angst about process I'm certainly not going to make such a decision
unilaterally.

Personally, I think there is plenty of room for both operations, and I
can see use cases for both. If I were designing I'd leave || as it is
now and add a + operation to do a recursive merge. I'm not sure how much
work that would be. Not huge but not trivial either.


Agreed, if you look at jquery for example, the extend() method by default behaves like our current || and you have to specify that you want deep merge if you want the behavior described by Peter. So there is definitely point for both, at this time we just support only one of them, that's all.


--
 Petr Jelinek                  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: jsonb concatenate operator's semantics seem questionable

From
Peter Geoghegan
Date:
On Sun, May 17, 2015 at 7:16 AM, Petr Jelinek <petr@2ndquadrant.com> wrote:
> Agreed, if you look at jquery for example, the extend() method by default
> behaves like our current || and you have to specify that you want deep merge
> if you want the behavior described by Peter. So there is definitely point
> for both, at this time we just support only one of them, that's all.

The difference is that with jquery's extend(), you can easily
subscript the JSON document you're extending so that you do so for the
right nested object (without merging any containers the target
contains, but rather overwriting them with any of the source's
containers -- makes sense when you're explicit about the nesting level
like that). With jsonb, however, we're usually stuck with having to
write an SQL expression that evaluates to the final jsonb document
that we want (for UPDATE targetlist assignment, typically).

This is an enormous difference. We still don't have a way to update a
nested object's single field that I can see, and nested objects are
absolutely commonplace for the "document database" use case. So I
don't accept that this is a matter of individual preference or taste.
It's a big, practical distinction.

-- 
Peter Geoghegan



Re: jsonb concatenate operator's semantics seem questionable

From
Peter Geoghegan
Date:
On Sun, May 17, 2015 at 8:37 AM, Dmitry Dolgov <9erthalion6@gmail.com> wrote:
> And I agree with thoughts above, that both concatenation modes ("simple" and
> "deep") definitely can be useful. I can try to figure out how much work that
> would be to modify the IteratorConcat function (or adapt Ilya's solution)

I tend to think of it like this: jsonb more or less supports two use cases.

Firstly, it supports the hstore use case, with a heterogeneous object
stored in each row -- a hodge-podge of different attributes, which can
be used to do something EAV-like, where there are application end user
defined attributes, say. This is unlikely to involve any nesting,
because we're only storing attributes of one entity (the row). This
isn't the use of jsonb that people got excited about, and I think it's
less important, although it does matter. The existence operator (which
operates at the least nested level) is firmly about this use case. And
for that matter, so is the new remove capability/minus operator thing
(which also operates at the least nested level). Fine.

The second use case is the "document database" use case, which is
where jsonb is really compelling. Think of storing more or less fixed
structure documents from a third party web API. Containment works in a
nested fashion in support of that. And as I pointed out, not having
the concatenate operator work in a nested fashion hobbles this use
case. How are users supposed to write an SQL query that update's a
single field in a nested object? That's obviously what they expect
from this.

I think it's misguided to make the concatenate operator target the
hstore use case - if you have that use case, you're unlikely to have
any nesting by convention anyway, and so it doesn't matter to you.
Besides which, as I said, the "document database" use case is the one
most users actually care about these days.

-- 
Peter Geoghegan



Re: jsonb concatenate operator's semantics seem questionable

From
Andrew Dunstan
Date:
On 05/17/2015 05:56 PM, Peter Geoghegan wrote:
> On Sun, May 17, 2015 at 8:37 AM, Dmitry Dolgov <9erthalion6@gmail.com> wrote:
>> And I agree with thoughts above, that both concatenation modes ("simple" and
>> "deep") definitely can be useful. I can try to figure out how much work that
>> would be to modify the IteratorConcat function (or adapt Ilya's solution)
> I tend to think of it like this: jsonb more or less supports two use cases.
>
> Firstly, it supports the hstore use case, with a heterogeneous object
> stored in each row -- a hodge-podge of different attributes, which can
> be used to do something EAV-like, where there are application end user
> defined attributes, say. This is unlikely to involve any nesting,
> because we're only storing attributes of one entity (the row). This
> isn't the use of jsonb that people got excited about, and I think it's
> less important, although it does matter. The existence operator (which
> operates at the least nested level) is firmly about this use case. And
> for that matter, so is the new remove capability/minus operator thing
> (which also operates at the least nested level). Fine.
>
> The second use case is the "document database" use case, which is
> where jsonb is really compelling. Think of storing more or less fixed
> structure documents from a third party web API. Containment works in a
> nested fashion in support of that. And as I pointed out, not having
> the concatenate operator work in a nested fashion hobbles this use
> case. How are users supposed to write an SQL query that update's a
> single field in a nested object? That's obviously what they expect
> from this.
>
> I think it's misguided to make the concatenate operator target the
> hstore use case - if you have that use case, you're unlikely to have
> any nesting by convention anyway, and so it doesn't matter to you.
> Besides which, as I said, the "document database" use case is the one
> most users actually care about these days.
>

Peter,

Nobody is arguing that what you want isn't desirable. It just happens to 
be what we don't have. I get your disappointment, but you actually had a 
long time after the original patch was published to make your case. When 
I suggested 2 months ago to someone else that it was really too late to 
be adding this feature, nobody, including you, disagreed.

So what exactly do you want me or anybody else to do now, two days 
*after* we declared (not without pain) feature freeze?

If there is a consensus that what you want is so important that we need 
to implement the new behaviour at this late stage, I'm happy to spend 
time on it if there's a patch forthcoming. I might add that this should 
be an additional behaviour, since as Petr points out there is some 
validity to the current behviour.

If not, a function and operator can almost certainly be created with 
this behaviour as an extension for those who really need it in 9.5. I'm 
sure Dmitry will be happy to work on that.

cheers

andrew



Re: jsonb concatenate operator's semantics seem questionable

From
Peter Geoghegan
Date:
On Sun, May 17, 2015 at 3:22 PM, Andrew Dunstan <andrew@dunslane.net> wrote:
> So what exactly do you want me or anybody else to do now, two days *after*
> we declared (not without pain) feature freeze?

As much as I'd like to just fix the concatenate operator, I really
don't want to be the person that adds additional delay to stabilizing
9.5.

> If there is a consensus that what you want is so important that we need to
> implement the new behaviour at this late stage, I'm happy to spend time on
> it if there's a patch forthcoming. I might add that this should be an
> additional behaviour, since as Petr points out there is some validity to the
> current behviour.

The current behavior does not seem acceptable for the concatenate
operator ("operator || jsonb"). If we can't fix it as the concatenate
operator, I think we should change it to be "operator + jsonb" or
something. It should be explained and understood as an operator that
(like the original existence operator "operator ? text", but unlike
the original containment operator "operator @> jsonb") exists only to
support the less common hstore-style use case. It should also not be
referred to as offering concatenation, but something more specialized
than that. I'm sorry, but as things stand I don't think that the
concatenation behavior makes sense as the general purpose jsonb
concatenate operator.

I'll go with the consensus, but obviously I feel pretty strongly that
we have the behavior of "operator || jsonb" wrong. We should at the
very least work towards a future version where there is a "operator ||
jsonb" that does the right thing. I don't even like the idea of having
this as an "operator + jsonb" operator, though, because it introduces
an operator that is concerned with the hstore-style use case that I
described before, and yet accepts a jsonb datum on the rhs.

Again, I'm sorry that I brought this up late, and I hope that this
doesn't seem capricious. I just happen to sincerely feel that the
current state of "operator || jsonb" leads us in the wrong long-term
direction. Let's hear what other people think, though.

-- 
Peter Geoghegan



Re: jsonb concatenate operator's semantics seem questionable

From
Robert Haas
Date:
On May 17, 2015, at 8:38 PM, Peter Geoghegan <pg@heroku.com> wrote:
> The current behavior does not seem acceptable for the concatenate
> operator ("operator || jsonb").

I don't agree.  It seems pretty clear to me after reading the new posts that the behavior is not an oversight, and
that'senough for me to say that we should leave this alone.  


Re: jsonb concatenate operator's semantics seem questionable

From
Peter Geoghegan
Date:
On Sun, May 17, 2015 at 5:46 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On May 17, 2015, at 8:38 PM, Peter Geoghegan <pg@heroku.com> wrote:
>> The current behavior does not seem acceptable for the concatenate
>> operator ("operator || jsonb").
>
> I don't agree.  It seems pretty clear to me after reading the new posts that the behavior is not an oversight, and
that'senough for me to say that we should leave this alone.
 

I've said what I wanted to say. As long as the community is
comfortable with the reality that this concatenate operator really
isn't useful for assignment within UPDATEs for most jsonb users, then
I can leave it at that. I think that this concatenate operator may
have been informally promoted as the thing that made it possible to do
jsonb UPDATEs in a declarative fashion, but as things stand that
really isn't true.

If nothing changes, let's not make the mistake of going on to
*formally* promote this concatenate operator as offering the ability
to do jsonb UPDATEs in a declarative fashion, because that would be
quite misleading.

-- 
Peter Geoghegan



Re: jsonb concatenate operator's semantics seem questionable

From
Josh Berkus
Date:
On 05/17/2015 05:46 PM, Robert Haas wrote:
> On May 17, 2015, at 8:38 PM, Peter Geoghegan <pg@heroku.com> wrote:
>> The current behavior does not seem acceptable for the concatenate
>> operator ("operator || jsonb").
> 
> I don't agree.  It seems pretty clear to me after reading the new posts that the behavior is not an oversight, and
that'senough for me to say that we should leave this alone. 
 

Is there a particular reason why "+" makes more sense as "shallow
concatination" and "||" makes more sense as "deep concatination"?  Like,
something in JS or other client languages which would make that
preference make more sense to users?

While I hate last-minute changes in general, once we have this
functionality as || we won't be able to swap operators later if we
decide that deep concatination should have been ||.  So I want to be
clear on why users will prefer that to + .

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: jsonb concatenate operator's semantics seem questionable

From
Peter Geoghegan
Date:
On Sun, May 17, 2015 at 8:41 PM, Josh Berkus <josh@agliodbs.com> wrote:
> Is there a particular reason why "+" makes more sense as "shallow
> concatination" and "||" makes more sense as "deep concatination"?  Like,
> something in JS or other client languages which would make that
> preference make more sense to users?
>
> While I hate last-minute changes in general, once we have this
> functionality as || we won't be able to swap operators later if we
> decide that deep concatination should have been ||.  So I want to be
> clear on why users will prefer that to + .

This guy is talking about the concatenation operator in hstore:
https://twitter.com/toolmantim/status/589348855302344705

I don't want to present this operator as being the equivalent for
jsonb (the thing you use for assignment). I wish it was, but it just
isn't, as long as it treats the first nesting level as special. jsonb
is all about nesting in general, and its concatenate operator must
reflect this. It wouldn't be much use if "operator @> jsonb" didn't
care about nesting either, but it does (unlike hstore's equivalent,
because hstore doesn't nest).

I don't think that (say) an "operator + jsonb" ought to be called a
concatenation operator at all. The idea is to distance what we have
here from the idea of an hstore concatenate operator, and to encourage
the understanding that it has only a specialized use. I think that the
danger of someone making the wrong assumption about the new "operator
|| jsonb" is very real (I think that the reverse wouldn't be true,
though, if concatenation worked in a nested fashion -- that wouldn't
bother users that had non-nested jsonb documents).

As I said, I don't think that my preference for deep concatenation is
a matter of taste. I think that shallow concatenation is fundamentally
and objectively at odds with what jsonb is supposed to be (as long as
concatenation is the way "nested assignment" works, which is what
users have been taught to think).
-- 
Peter Geoghegan



Re: jsonb concatenate operator's semantics seem questionable

From
Oskari Saarenmaa
Date:
18.05.2015, 06:41, Josh Berkus kirjoitti:
> On 05/17/2015 05:46 PM, Robert Haas wrote:
>> On May 17, 2015, at 8:38 PM, Peter Geoghegan <pg@heroku.com> wrote:
>>> The current behavior does not seem acceptable for the concatenate
>>> operator ("operator || jsonb").
>>
>> I don't agree.  It seems pretty clear to me after reading the new posts that the behavior is not an oversight, and
that'senough for me to say that we should leave this alone. 
 
> 
> Is there a particular reason why "+" makes more sense as "shallow
> concatination" and "||" makes more sense as "deep concatination"?  Like,
> something in JS or other client languages which would make that
> preference make more sense to users?
> 
> While I hate last-minute changes in general, once we have this
> functionality as || we won't be able to swap operators later if we
> decide that deep concatination should have been ||.  So I want to be
> clear on why users will prefer that to + .

Both operations (shallow and deep merge) are useful and needed in many
applications but I've found the shallow merge to be more useful in the
"generic" use case; the deep merge implementations I've run across are
usually application specific as you need to decide what to do with
arrays, conflicting keys, etc.

I think concatenation is the right operator for shallow merge, it's
basically what would happen if you concatenated text representations of
two json objects and replaced the closing and opening braces between
them with a comma:

(substring(a::text, 1, length(a::text)-1) || ',' || substring(b::text,
2))::jsonb

Deep merge could be a function with flags to say what to do about
conflicts, etc.

/ Oskari



Re: jsonb concatenate operator's semantics seem questionable

From
Andrew Dunstan
Date:
On 05/18/2015 12:11 AM, Peter Geoghegan wrote:
> On Sun, May 17, 2015 at 8:41 PM, Josh Berkus <josh@agliodbs.com> wrote:
>> Is there a particular reason why "+" makes more sense as "shallow
>> concatination" and "||" makes more sense as "deep concatination"?  Like,
>> something in JS or other client languages which would make that
>> preference make more sense to users?
>>
>> While I hate last-minute changes in general, once we have this
>> functionality as || we won't be able to swap operators later if we
>> decide that deep concatination should have been ||.  So I want to be
>> clear on why users will prefer that to + .
> This guy is talking about the concatenation operator in hstore:
> https://twitter.com/toolmantim/status/589348855302344705
>
> I don't want to present this operator as being the equivalent for
> jsonb (the thing you use for assignment). I wish it was, but it just
> isn't, as long as it treats the first nesting level as special. jsonb
> is all about nesting in general, and its concatenate operator must
> reflect this. It wouldn't be much use if "operator @> jsonb" didn't
> care about nesting either, but it does (unlike hstore's equivalent,
> because hstore doesn't nest).
>
> I don't think that (say) an "operator + jsonb" ought to be called a
> concatenation operator at all. The idea is to distance what we have
> here from the idea of an hstore concatenate operator, and to encourage
> the understanding that it has only a specialized use. I think that the
> danger of someone making the wrong assumption about the new "operator
> || jsonb" is very real (I think that the reverse wouldn't be true,
> though, if concatenation worked in a nested fashion -- that wouldn't
> bother users that had non-nested jsonb documents).
>
> As I said, I don't think that my preference for deep concatenation is
> a matter of taste. I think that shallow concatenation is fundamentally
> and objectively at odds with what jsonb is supposed to be (as long as
> concatenation is the way "nested assignment" works, which is what
> users have been taught to think).


I'm not going to react to a Twitter conversation. I honestly think you 
are massively overstating your case here. And forget about hstore. It's 
really not relevant here. This code had its origin in NESTED hstore, 
which we don't have, but it's flown the nest.

Once again, nobody is arguing against deep of recursive concatenation, 
so you're really arguing against a straw man.  It just happens to be 
what we don't have.

As between || and + I'm personally moderately indifferent. I think 
you're representing some body of understanding about the effects of 
certain operators as being widespread when that's very far from clear.

You really still haven't said what you want us to do. Mostly you have 
expressed your unhappiness without suggesting a path forward. That's not 
helpful.

cheers

andrew





Re: jsonb concatenate operator's semantics seem questionable

From
Ryan Pedela
Date:
On Sun, May 17, 2015 at 9:41 PM, Josh Berkus <josh@agliodbs.com> wrote:
Is there a particular reason why "+" makes more sense as "shallow
concatination" and "||" makes more sense as "deep concatination"?  Like,
something in JS or other client languages which would make that
preference make more sense to users?

As someone who uses JSON day-to-day in Javascript and Python, I personally don't think || or + matters much. Python uses json.loads() for JSON concat and you have use a 3rd-party library in Javascript if you want that functionality such as JQuery.extends(). I agree with Peter that we need deep concatenation, but I don't think there is any standard for the operator. I think the word "shallow" should be added to the docs though.

What is far more important than shallow or deep concatenation for the document database use case is being able to delete or replace/update a specific, nested path in the JSON object. It looks like that is possible with the minus operator and jsonb_replace(). This is great, however it took me awhile to figure out the path syntax. I think adding a paragraph to the docs explaining the path syntax would help.

Ryan Pedela

Re: jsonb concatenate operator's semantics seem questionable

From
Ryan Pedela
Date:
On Mon, May 18, 2015 at 8:41 AM, Ryan Pedela <rpedela@datalanche.com> wrote:
On Sun, May 17, 2015 at 9:41 PM, Josh Berkus <josh@agliodbs.com> wrote:

Is there a particular reason why "+" makes more sense as "shallow
concatination" and "||" makes more sense as "deep concatination"?  Like,
something in JS or other client languages which would make that
preference make more sense to users?

As someone who uses JSON day-to-day in Javascript and Python, I personally don't think || or + matters much. Python uses json.loads() for JSON concat and you have use a 3rd-party library in Javascript if you want that functionality such as JQuery.extends(). I agree with Peter that we need deep concatenation, but I don't think there is any standard for the operator. I think the word "shallow" should be added to the docs though.

What is far more important than shallow or deep concatenation for the document database use case is being able to delete or replace/update a specific, nested path in the JSON object. It looks like that is possible with the minus operator and jsonb_replace(). This is great, however it took me awhile to figure out the path syntax. I think adding a paragraph to the docs explaining the path syntax would help.

Having looked at this more, I think I understand the problem Peter has identified and it is a significant usability problem in my opinion. I think the word "concatenate" has confused me because I think of it as a higher-level operation when I want to merge two, large JSON objects which isn't a very common operation, at least for me. What is absolutely required for the document database use case is the following:

1. Get element at any arbitrary path. ( #> operator )
2. Delete any arbitrary path. ( minus operator )
3. Replace/update element at any arbitrary path. ( jsonb_replace )
4. Add element to any arbitrary path. ( ? )

It is #4 that does not seem to exist unless jsonb_replace() creates the specified path if it does not exist. Does it? I am not currently at my desk to test it myself.

If not, deep concatenation would solve this problem, but I can also see another solution. Use + for shallow concatenation since it really means "add element to top-level path" as Peter suggests. Then add another function: jsonb_add( target jsonb, path text[], new jsonb ) to add element at any arbitrary path. Then leave || for deep concatenation in 9.6 or whenever.

If jsonb_replace() satisfies #4 then I think everything is fine. Without #4 however, jsonb would remain an incomplete document database solution in my opinion.

Thanks,
Ryan Pedela

Re: jsonb concatenate operator's semantics seem questionable

From
Dmitry Dolgov
Date:
creates the specified path if it does not exist. Does it?

No, jsonb_replace() doesn't create an element, if it doesn't exist. I think, otherwise it can be confusing, so probably jsonb_add() may be more appropriate (and, actually, this function was already mentioned in previous discussions).

On 18 May 2015 at 22:57, Ryan Pedela <rpedela@datalanche.com> wrote:
On Mon, May 18, 2015 at 8:41 AM, Ryan Pedela <rpedela@datalanche.com> wrote:
On Sun, May 17, 2015 at 9:41 PM, Josh Berkus <josh@agliodbs.com> wrote:

Is there a particular reason why "+" makes more sense as "shallow
concatination" and "||" makes more sense as "deep concatination"?  Like,
something in JS or other client languages which would make that
preference make more sense to users?

As someone who uses JSON day-to-day in Javascript and Python, I personally don't think || or + matters much. Python uses json.loads() for JSON concat and you have use a 3rd-party library in Javascript if you want that functionality such as JQuery.extends(). I agree with Peter that we need deep concatenation, but I don't think there is any standard for the operator. I think the word "shallow" should be added to the docs though.

What is far more important than shallow or deep concatenation for the document database use case is being able to delete or replace/update a specific, nested path in the JSON object. It looks like that is possible with the minus operator and jsonb_replace(). This is great, however it took me awhile to figure out the path syntax. I think adding a paragraph to the docs explaining the path syntax would help.

Having looked at this more, I think I understand the problem Peter has identified and it is a significant usability problem in my opinion. I think the word "concatenate" has confused me because I think of it as a higher-level operation when I want to merge two, large JSON objects which isn't a very common operation, at least for me. What is absolutely required for the document database use case is the following:

1. Get element at any arbitrary path. ( #> operator )
2. Delete any arbitrary path. ( minus operator )
3. Replace/update element at any arbitrary path. ( jsonb_replace )
4. Add element to any arbitrary path. ( ? )

It is #4 that does not seem to exist unless jsonb_replace() creates the specified path if it does not exist. Does it? I am not currently at my desk to test it myself.

If not, deep concatenation would solve this problem, but I can also see another solution. Use + for shallow concatenation since it really means "add element to top-level path" as Peter suggests. Then add another function: jsonb_add( target jsonb, path text[], new jsonb ) to add element at any arbitrary path. Then leave || for deep concatenation in 9.6 or whenever.

If jsonb_replace() satisfies #4 then I think everything is fine. Without #4 however, jsonb would remain an incomplete document database solution in my opinion.

Thanks,
Ryan Pedela


Re: jsonb concatenate operator's semantics seem questionable

From
Peter Geoghegan
Date:
On Mon, May 18, 2015 at 5:05 AM, Andrew Dunstan <andrew@dunslane.net> wrote:
> As between || and + I'm personally moderately indifferent. I think you're
> representing some body of understanding about the effects of certain
> operators as being widespread when that's very far from clear.
>
> You really still haven't said what you want us to do. Mostly you have
> expressed your unhappiness without suggesting a path forward. That's not
> helpful.

I think I was very clear on what I see as a path forward. The
documentation and general presentation of the operator (e.g. advocacy
stuff) needs to make clear that it isn't suitable for "nested
assignment" in the same way that the hstore concatenate operator is. I
think it has been understood in this way by people so far. This is my
major concern.

So I think we should use the + operator to distance this from the
hstore concatenate operator, which *is* widely understood to be mostly
useful for "nested assignment". And I think that the documentation
should be updated to make the actual uses of the operator very clear.
Granted, the documentation for this new operator does not say anything
about "nested assignment" for the new concatenate operator as things
stand, but it says very little.

When I saw people getting excited about this new feature, I assumed
that it addressed what some people saw as the major omission in jsonb.
Unfortunately, it doesn't, and it behooves us to make sure that that
is understood. My complaint is mostly about the *positioning* of the
operator.
-- 
Peter Geoghegan



Re: jsonb concatenate operator's semantics seem questionable

From
Andrew Dunstan
Date:
On 05/18/2015 01:18 PM, Peter Geoghegan wrote:

> So I think we should use the + operator to distance this from the
> hstore concatenate operator, which *is* widely understood to be mostly
> useful for "nested assignment".


Having trouble scanning this. Since hstore isn't nested what the heck 
does "nested assignment" mean w.r.t. hstore?

In any case, it seems clear from other posters that your perceptions and 
expectations are not universally held.


cheers

andrew



Re: jsonb concatenate operator's semantics seem questionable

From
Peter Geoghegan
Date:
On Mon, May 18, 2015 at 10:29 AM, Andrew Dunstan <andrew@dunslane.net> wrote:
> Having trouble scanning this. Since hstore isn't nested what the heck does
> "nested assignment" mean w.r.t. hstore?

It means assigning to one "subdatum" in the hstore datum, as opposed
to simply assigning an entirely new hstore (granted, that's what
you're actually doing with the nested assignment, but conceptually
you're assigning to a "subdatum" when using the "hstore ||" idiom that
the hstore docs prominently talk about).

> In any case, it seems clear from other posters that your perceptions and
> expectations are not universally held.

I think that my expectations are pretty widely held. People want to be
able to assign to nested elements of a jsonb. Unfortunately, this
doesn't get them any closer, because many layers of nesting are normal
with jsonb. Let's be clear on that.
-- 
Peter Geoghegan



Re: jsonb concatenate operator's semantics seem questionable

From
Andrew Dunstan
Date:
On 05/18/2015 01:43 PM, Peter Geoghegan wrote:
> On Mon, May 18, 2015 at 10:29 AM, Andrew Dunstan <andrew@dunslane.net> wrote:
>> Having trouble scanning this. Since hstore isn't nested what the heck does
>> "nested assignment" mean w.r.t. hstore?
> It means assigning to one "subdatum" in the hstore datum, as opposed
> to simply assigning an entirely new hstore (granted, that's what
> you're actually doing with the nested assignment, but conceptually
> you're assigning to a "subdatum" when using the "hstore ||" idiom that
> the hstore docs prominently talk about).


I could argue at least as convincingly that what the jsonb || operator 
does is exactly analogous to what hstore's || does.


cheers

andrew



Re: jsonb concatenate operator's semantics seem questionable

From
Josh Berkus
Date:
On 05/17/2015 09:11 PM, Peter Geoghegan wrote:> As I said, I don't think
that my preference for deep concatenation is
> a matter of taste. I think that shallow concatenation is fundamentally
> and objectively at odds with what jsonb is supposed to be (as long as
> concatenation is the way "nested assignment" works, which is what
> users have been taught to think).

That was a really wordy way to not answer my question.  Everyone wants
deep append.  We don't have it.  Unless you have a finished, tested
patch you're sitting on, that discussion isn't relevant with the sole
exception of whether we want to reserve "||" for it.

On 05/18/2015 08:57 AM, Ryan Pedela wrote:
> If not, deep concatenation would solve this problem, but I can also see
> another solution. Use + for shallow concatenation since it really means
> "add element to top-level path" as Peter suggests. Then add another
> function: jsonb_add( target jsonb, path text[], new jsonb ) to add
> element at any arbitrary path. Then leave || for deep concatenation in
> 9.6 or whenever.

Since swapping the operator seems still on the table, is there any
particular reason why you think "+" is more suited to shallow
concatination?  Both you and Peter have said this, but as a heavy user
of JSON/JSONB, to me it seems the other way around.  That is, "+" says
"add to arbitrary nested node" to me more than "||" does.

> If jsonb_replace() satisfies #4 then I think everything is fine. Without
> #4 however, jsonb would remain an incomplete document database solution
> in my opinion.

Oh, no question, we're still incomplete.  Aside from nested append, we
kinda lack easy sharded scale-out, which is a rather more major feature, no?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: jsonb concatenate operator's semantics seem questionable

From
Peter Geoghegan
Date:
On Mon, May 18, 2015 at 11:16 AM, Andrew Dunstan <andrew@dunslane.net> wrote:
> I could argue at least as convincingly that what the jsonb || operator does
> is exactly analogous to what hstore's || does.

Again, my concern is not primarily a theoretical one. It's primarily a
practical concern. We are no closer to supporting UPDATEs with jsonb
in an easy, declarative way, unfortunately. That's what I thought that
this was supposed to do, and despite your protestations, if I picked
up that idea other people easily could.

-- 
Peter Geoghegan



Re: jsonb concatenate operator's semantics seem questionable

From
Peter Geoghegan
Date:
On Mon, May 18, 2015 at 11:24 AM, Josh Berkus <josh@agliodbs.com> wrote:
> On 05/17/2015 09:11 PM, Peter Geoghegan wrote:> As I said, I don't think
> that my preference for deep concatenation is
>> a matter of taste. I think that shallow concatenation is fundamentally
>> and objectively at odds with what jsonb is supposed to be (as long as
>> concatenation is the way "nested assignment" works, which is what
>> users have been taught to think).
>
> That was a really wordy way to not answer my question.  Everyone wants
> deep append.  We don't have it.  Unless you have a finished, tested
> patch you're sitting on, that discussion isn't relevant with the sole
> exception of whether we want to reserve "||" for it.

The point of not using "||" is to distance this from the update hstore
idiom. It won't work here in the way I expected.

> On 05/18/2015 08:57 AM, Ryan Pedela wrote:
>> If not, deep concatenation would solve this problem, but I can also see
>> another solution. Use + for shallow concatenation since it really means
>> "add element to top-level path" as Peter suggests. Then add another
>> function: jsonb_add( target jsonb, path text[], new jsonb ) to add
>> element at any arbitrary path. Then leave || for deep concatenation in
>> 9.6 or whenever.
>
> Since swapping the operator seems still on the table, is there any
> particular reason why you think "+" is more suited to shallow
> concatination?  Both you and Peter have said this, but as a heavy user
> of JSON/JSONB, to me it seems the other way around.  That is, "+" says
> "add to arbitrary nested node" to me more than "||" does.

I'm not necessarily attached to "+". I just want to make this
different to hstore's "||" operator. There should be a similar idiom
with jsonb, but that can come later.

-- 
Peter Geoghegan



Re: jsonb concatenate operator's semantics seem questionable

From
Josh Berkus
Date:
On 05/18/2015 11:34 AM, Peter Geoghegan wrote:
> I'm not necessarily attached to "+". I just want to make this
> different to hstore's "||" operator. There should be a similar idiom
> with jsonb, but that can come later.

This argument still makes no sense to me.  Hstore is not nested. If
anything, the behavior of || in hstore suggests that we *should* use ||
for shallow append in JSONB.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: jsonb concatenate operator's semantics seem questionable

From
Peter Geoghegan
Date:
On Mon, May 18, 2015 at 11:45 AM, Josh Berkus <josh@agliodbs.com> wrote:
> On 05/18/2015 11:34 AM, Peter Geoghegan wrote:
>> I'm not necessarily attached to "+". I just want to make this
>> different to hstore's "||" operator. There should be a similar idiom
>> with jsonb, but that can come later.
>
> This argument still makes no sense to me.  Hstore is not nested. If
> anything, the behavior of || in hstore suggests that we *should* use ||
> for shallow append in JSONB.

Should the "@>" operator be non-nested, as in hstore, too?

As you say, hstore isn't nested, and so this simply doesn't come up
there. We have failed to adopt "||" to jsonb in a way that makes
sense. We should have adopted it to jsonb in exactly the same way as
the "@>" operator was.
-- 
Peter Geoghegan



Re: jsonb concatenate operator's semantics seem questionable

From
Andrew Dunstan
Date:
On 05/18/2015 02:45 PM, Josh Berkus wrote:
> On 05/18/2015 11:34 AM, Peter Geoghegan wrote:
>> I'm not necessarily attached to "+". I just want to make this
>> different to hstore's "||" operator. There should be a similar idiom
>> with jsonb, but that can come later.
> This argument still makes no sense to me.  Hstore is not nested. If
> anything, the behavior of || in hstore suggests that we *should* use ||
> for shallow append in JSONB.
>

Yeah, I agree. To me "concatenation" seems a wrong designation of what 
is effectively a recursive merge.

I have no objection to us making much clearer in the docs exactly what's 
happening with ||.

cheers

andrew



Re: jsonb concatenate operator's semantics seem questionable

From
Josh Berkus
Date:
On 05/18/2015 11:58 AM, Peter Geoghegan wrote:
> As you say, hstore isn't nested, and so this simply doesn't come up
> there. We have failed to adopt "||" to jsonb in a way that makes
> sense. We should have adopted it to jsonb in exactly the same way as
> the "@>" operator was.

OK, you've flagellated this deceased equine enough that I'm calling the
ASPCA.  I get that you're unhappy that we don't have deep append.
Everyone gets this.  I simply don't care; shallow append is better than
no append at all, and having shallow append does not block deep append
from happening in 9.6.

The only question worth discussing is whether we change the operator to
"+" (or, for that matter, something else).  I've seen your vote on this,
so, does anyone else have an opinion on "+" vs. "||"?  Preferably with a
justification with some kind of grounding?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: jsonb concatenate operator's semantics seem questionable

From
Peter Geoghegan
Date:
On Mon, May 18, 2015 at 12:12 PM, Josh Berkus <josh@agliodbs.com> wrote:
> OK, you've flagellated this deceased equine enough that I'm calling the
> ASPCA.  I get that you're unhappy that we don't have deep append.
> Everyone gets this.  I simply don't care; shallow append is better than
> no append at all, and having shallow append does not block deep append
> from happening in 9.6.

I never expressed disagreement with having shallow append.

> The only question worth discussing is whether we change the operator to
> "+" (or, for that matter, something else).  I've seen your vote on this,
> so, does anyone else have an opinion on "+" vs. "||"?  Preferably with a
> justification with some kind of grounding?

My argument has very good grounding. The "||" UPDATE idiom from hstore
does not and cannot work in a practical way with jsonb's current ||
operator (at least for the large majority of use cases). It could, it
just doesn't. I don't want users to make the same association that I
did, which, based on the total lack of documentation for the new
operator, they easily could.

What is hard to understand about that?

-- 
Peter Geoghegan



Re: jsonb concatenate operator's semantics seem questionable

From
Andrew Dunstan
Date:
On 05/18/2015 03:21 PM, Peter Geoghegan wrote:


>> The only question worth discussing is whether we change the operator to
>> "+" (or, for that matter, something else).  I've seen your vote on this,
>> so, does anyone else have an opinion on "+" vs. "||"?  Preferably with a
>> justification with some kind of grounding?
> My argument has very good grounding. The "||" UPDATE idiom from hstore
> does not and cannot work in a practical way with jsonb's current ||
> operator (at least for the large majority of use cases). It could, it
> just doesn't. I don't want users to make the same association that I
> did, which, based on the total lack of documentation for the new
> operator, they easily could.
>


So you're arguing that we shouldn't call the operation in question || 
because it's pretty much the same, mutatis mutandis, as the hstore 
operation of the same name. You've lost me.

Right now I'm more convinced that I was at the start of this discussion 
that we have the name right, and when we get a recursive merge operation 
calling it something else will be right too.

I do agree that we need to be clearer in the docs about what the 
operation is and is not.

cheers

andrew



Re: jsonb concatenate operator's semantics seem questionable

From
Robert Haas
Date:
On Mon, May 18, 2015 at 3:21 PM, Peter Geoghegan <pg@heroku.com> wrote:
> What is hard to understand about that?

What is hard to understand is why you're going on and on about what is
basically a matter of opinion after several people have said they
don't agree with your opinion.

Realistically, as much as we might try to fool ourselves into
believing otherwise, operators are not self-documenting, except for
the ones you knew by the fourth grade.  People will have to read the
documentation no matter what we do here.  If the documentation is good
enough, let's improve it.  But let's not fool ourselves into believing
that calling the operator || or + or (in the spirit of this
discussion) ?!?!?!? will make it clearer.  If we wanted to make it
clearer, we'd have made it a function instead of an operator.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: jsonb concatenate operator's semantics seem questionable

From
Ryan Pedela
Date:
On Mon, May 18, 2015 at 12:24 PM, Josh Berkus <josh@agliodbs.com> wrote:
On 05/18/2015 08:57 AM, Ryan Pedela wrote:
> If not, deep concatenation would solve this problem, but I can also see
> another solution. Use + for shallow concatenation since it really means
> "add element to top-level path" as Peter suggests. Then add another
> function: jsonb_add( target jsonb, path text[], new jsonb ) to add
> element at any arbitrary path. Then leave || for deep concatenation in
> 9.6 or whenever.

Since swapping the operator seems still on the table, is there any
particular reason why you think "+" is more suited to shallow
concatination?  Both you and Peter have said this, but as a heavy user
of JSON/JSONB, to me it seems the other way around.  That is, "+" says
"add to arbitrary nested node" to me more than "||" does.

Let me back up a little. I always like to think about what is the ideal interface first and then worry about implementation because implementation can always be changed but interface can't. I think the current concat/merge interface is the ideal. It should be || because that means concat/merge everywhere else in the PG interface that I am aware of. In the case of JSON which is a hierarchically data structure, it should be implemented as a deep merge which by definition satisfies a shallow merge. This is what I would expect as a user and I would think there was a bug if it didn't perform deep merge. I expect this because I can implement shallow merge easily myself using Javascript, Python, etc but deep merge is non-trivial. Therefore I would expect a special JSON concat/merge library function to do deep merge. I would rather the interface stay the same and it documented that the current implementation is a shallow merge and may become a deep merge in the future.

In the context of splitting shallow and deep merge into two operators, I think + is better for shallow and || better for deep. The reason for + is because many programming languages have this behavior. If I see the below code in language I have never used before:

objC = objA + objB

My default assumption is that + performs a shallow merge. Like I said, I would rather there just be one operator.
 
> If jsonb_replace() satisfies #4 then I think everything is fine. Without
> #4 however, jsonb would remain an incomplete document database solution
> in my opinion.

Oh, no question, we're still incomplete.  Aside from nested append, we
kinda lack easy sharded scale-out, which is a rather more major feature, no?

I think it depends on the point of view which is more important. If you have a massive dataset, then obviously sharding is more important. But my own take on why NoSQL became so popular has only a little to do with sharding. MongoDB pitched to tech entrepreneurs "use our database and implement your MVP 10x faster/easier and we have sharding when you become the next Google".  And it worked brilliantly. Many tech entrepreneurs are worried about time constraints and dream of becoming the next Google (myself included). But the reality is that most fail and the majority who don't fail achieve moderate success, only a handful reach Google-level success. Therefore the vast majority end up never needing sharding, but they all experience that advertised 10x development speed improvement. I doubt it really is 10x, but JSON maps very well to programming language data structures (no impedence mismatch) so it is usually faster to build prototypes with MongoDB.

If jsonb supported nested append, then I think that would be enough for people who care most about development speed which I think is a larger group than the group with massive datasets. In addition, sharding seems like a server-level or database-level issue rather than a data type issue.

Re: jsonb concatenate operator's semantics seem questionable

From
Josh Berkus
Date:
On 05/18/2015 01:04 PM, Ryan Pedela wrote:
> Let me back up a little. I always like to think about what is the ideal
> interface first and then worry about implementation because
> implementation can always be changed but interface can't. I think the
> current concat/merge interface is the ideal. It should be || because
> that means concat/merge everywhere else in the PG interface that I am
> aware of. In the case of JSON which is a hierarchically data structure,
> it should be implemented as a deep merge which by definition satisfies a
> shallow merge. This is what I would expect as a user and I would think
> there was a bug if it didn't perform deep merge. I expect this because I
> can implement shallow merge easily myself using Javascript, Python, etc
> but deep merge is non-trivial. Therefore I would expect a special JSON
> concat/merge library function to do deep merge. I would rather the
> interface stay the same and it documented that the current
> implementation is a shallow merge and may become a deep merge in the future.
> 
> In the context of splitting shallow and deep merge into two operators, I
> think + is better for shallow and || better for deep. The reason for +
> is because many programming languages have this behavior. If I see the
> below code in language I have never used before:
> 
> objC = objA + objB
> 
> My default assumption is that + performs a shallow merge. Like I said, I
> would rather there just be one operator.

Thank you, that helps.  Anyone else?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: jsonb concatenate operator's semantics seem questionable

From
Marko Tiikkaja
Date:
On 2015-05-18 22:10, Josh Berkus wrote:
> On 05/18/2015 01:04 PM, Ryan Pedela wrote:
>> In the context of splitting shallow and deep merge into two operators, I
>> think + is better for shallow and || better for deep. The reason for +
>> is because many programming languages have this behavior. If I see the
>> below code in language I have never used before:
>>
>> objC = objA + objB
>>
>> My default assumption is that + performs a shallow merge. Like I said, I
>> would rather there just be one operator.
>
> Thank you, that helps.  Anyone else?

If everyone thinks the operators mean different things, we could just 
not add any operators and only provide functions instead.


.m



Re: jsonb concatenate operator's semantics seem questionable

From
Peter Geoghegan
Date:
On Mon, May 18, 2015 at 1:03 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> Realistically, as much as we might try to fool ourselves into
> believing otherwise, operators are not self-documenting, except for
> the ones you knew by the fourth grade.  People will have to read the
> documentation no matter what we do here.  If the documentation is good
> enough, let's improve it.  But let's not fool ourselves into believing
> that calling the operator || or + or (in the spirit of this
> discussion) ?!?!?!? will make it clearer.  If we wanted to make it
> clearer, we'd have made it a function instead of an operator.

jsonb is based on hstore. I don't want the idiomatic way to do updates
with hstore not work in a similar fashion with jsonb. Granted, if
you're not reading the documentation you may well have other problems,
but I think we should do our best to avoid confusion. I was quite
surprised when I saw that the concatenate operator isn't really useful
for updates. If it wasn't for hstore, then I might agree with Andrew,
or at least not care too much one way or the other. Besides, my
opinion on which operator is at least closer to being self-documenting
is shared by Ryan.

-- 
Peter Geoghegan



Re: jsonb concatenate operator's semantics seem questionable

From
Catalin Iacob
Date:
On Mon, May 18, 2015 at 9:03 PM Andrew Dunstan <andrew@dunslane.net> wrote:
So you're arguing that we shouldn't call the operation in question ||
because it's pretty much the same, mutatis mutandis, as the hstore
operation of the same name. You've lost me.

Hopefully this helps. Peter's argument, as I understand it is:

In hstore @> means unnested containment, in jsonb it means nested containment. Therefore, when an hstore operator is applied to jsonb it gets "nestedness" as jsonb is nested and adds that nestedness is an important thing that sets it apart from hstore. Therefore, since || is unnested concatenation in hstore, it should become nested concatenation for jsonb.

I don't know if the argument is strong enough but it does make some sense.

Re: jsonb concatenate operator's semantics seem questionable

From
Peter Geoghegan
Date:
On Mon, May 18, 2015 at 1:19 PM, Catalin Iacob <iacobcatalin@gmail.com> wrote:
> In hstore @> means unnested containment, in jsonb it means nested
> containment. Therefore, when an hstore operator is applied to jsonb it gets
> "nestedness" as jsonb is nested and adds that nestedness is an important
> thing that sets it apart from hstore. Therefore, since || is unnested
> concatenation in hstore, it should become nested concatenation for jsonb.
>
> I don't know if the argument is strong enough but it does make some sense.

Right. And I'm not saying that there isn't a place for shallow
containment, or whatever you want to call it. But I prefer to do that
with an operator that people haven't been taught to think of as "the
update operator" from hstore.

-- 
Peter Geoghegan



Re: jsonb concatenate operator's semantics seem questionable

From
"David G. Johnston"
Date:
On Mon, May 18, 2015 at 12:12 PM, Josh Berkus <josh@agliodbs.com> wrote:
On 05/18/2015 11:58 AM, Peter Geoghegan wrote:
> As you say, hstore isn't nested, and so this simply doesn't come up
> there. We have failed to adopt "||" to jsonb in a way that makes
> sense. We should have adopted it to jsonb in exactly the same way as
> the "@>" operator was.


The only question worth discussing is whether we change the operator to
"+" (or, for that matter, something else).  I've seen your vote on this,
so, does anyone else have an opinion on "+" vs. "||"?  Preferably with a
justification with some kind of grounding?

How about a pair of operators? 

​jsonb |> jsonb​ "push these keys into that jsonb, replacing existing values for any keys already present"
​jsonb <| jsonb​ "pull those keys into this jsonb, [...]"

I do suspect, however, that any kind of deep concatenation/replacement algorithm is going to require end-user input and thus will want to be done strictly through functions as opposed to operators.

Given the complexity of json I'm not convinced that either + or || should end up being implemented.  Those are operators best left to simple types where the meaning of "add" and "concatenate" are well defined.  Even if we do end up with the deep "concatenation" algorithm, and decide to turn it into an operator, at this moment I would not choose || to be that operator.

This entire thread is the justification for the last paragraph...unfortunately the rest is just my opinion.

David J.

Re: jsonb concatenate operator's semantics seem questionable

From
"Ilya I. Ashchepkov"
Date:
Hello!

[3] First of all few words about concatenation of jsonb values in my mind.
Jsonb values concatenation result may follow this rules:
1) array of both values if both are scalars
2) concatenated array if both are arrays
3) prepended or appended array if only one is array
4) recursive concatenated jsonb-object if both is objects

[4] When we are merging two jsonb-objects and we have a key in both we can:
1) use value from last jsonb-object
2) concatenate this values using rules from [3]

[5] Also I want to be able to do the following:
# select JSONB_MERGE_FUN(SMTHNG, '{"a":{"b":1,"c":1}}'::jsonb,
'{"a":{"b":2,"c":2}}'::jsonb);?column?
-------------------------{"a":{"b":[1,2],"c":2}}

I'm thinking about SMTHNG as a function, that receive:
1) array of keys - where merging "cursor" is in jsonb objects
2) types of jsonb objects at this cursor (null/scalar/array/object)
Returned value determinates behaviour that will be used to apply rules from [4].

Current function implements the following algorithm (for objects):
1) use value from last object if keys array length is not 0

My implementation:
1) use value from last object if type is scalar or array
2) concatenate values if both are objects

At this moment I know how implement something like function from [5] on C,
but I think it will be not very useful for developers.
I think it will be useful if it may be implemented like aggregate.

Any thoughts?


On Sun, May 17, 2015 at 2:56 AM, Peter Geoghegan <pg@heroku.com> wrote:
> Another thing that I noticed about the new jsonb stuff is that the
> concatenate operator is based on the hstore one. This works as
> expected:
>
> postgres=# select '{"a":1}'::jsonb || '{"a":2}';
> ?column?
> ----------
>  {"a": 2}
> (1 row)
>
> However, the nesting doesn't "match up" -- containers are not merged
> beyond the least-nested level:
>
> postgres=# select '{"a":{"nested":1}}'::jsonb || '{"a":{"also nested":2}}';
>          ?column?
> ---------------------------
>  {"a": {"also nested": 2}}
> (1 row)
>
> This feels wrong to me. When jsonb was initially introduced, we took
> inspiration for the *containment* ("operator @> jsonb") semantics from
> hstore, but since jsonb is nested it worked in a nested fashion. At
> the top level and with no nested containers there was no real
> difference, but we had to consider the behavior of more nested levels
> carefully (the containment operator is clearly the most important
> jsonb operator). I had envisaged that with the concatenation of jsonb,
> concatenation would similarly behave in a nested fashion. Under this
> scheme, the above query would perform nested concatenation as follows:
>
> postgres=# select '{"a":{"nested":1}}'::jsonb || '{"a":{"also
> nested":2}}'; -- does not match actual current behavior
>          ?column?
> ---------------------------
>  {"a": {"nested":1, "also nested": 2}}
> (1 row)
>
> Now, I think it's good that the minus operator ("operator - text" and
> friends) discussed on the nearby thread accepts a text (or int)
> argument and remove string elements/pairs at the top level only. This
> works exactly the same as existence (I happen to think that removing
> elements/pairs at a nested level is likely to be more trouble than
> it's worth, and so I don't really like the new "jsonb - text[]"
> operator much, because it accepts a Postgres (not JSON) array of texts
> that constitute a path, which feels odd). So I have no issue with at
> least the plain minus operators' semantics. But I think that the
> concatenate operator's current semantics are significantly less useful
> than they could be, and are not consistent with the overall design of
> jsonb.
>
> I'm particularly concerned about a table containing many homogeneously
> structured, deeply nested jsonb datums (think of the delicious URLs
> dataset that jsonb was originally tested using for a good example of
> that -- this is quite representative of how people use jsonb in the
> real world). It would be almost impossible to perform insert-or-update
> type operations to these deeply nested elements using hstore style
> concatenation. You'd almost invariably end up removing a bunch of
> irrelevant nested values of the documents, when you only intended to
> update one deeply nested value.
>
> Looking back at the discussion of the new jsonb stuff, a concern was
> raised along these lines by Ilya Ashchepkov [1], but this was
> dismissed. I feel pretty strongly that this should be revisited. I'm
> willing to concede that we might not want to always merge containers
> that are found in the same position during concatenation, but I think
> it's more likely that we do. As with containment, my sense is that
> there should be nothing special about the nesting level -- it should
> not influence whether we merge rather than overwrite the operator's
> lhs container (with or into the rhs container). Not everyone will
> agree with this [2].
>
> I'm sorry that I didn't get to this sooner, but I was rather busy when
> it was being discussed.
>
> [1] http://www.postgresql.org/message-id/55006879.2050601@dunslane.net
> [2] http://www.postgresql.org/message-id/54EF61DD.7040208@agliodbs.com
> --
> Peter Geoghegan



--
С уважением,
Ащепков Илья koctep@gmail.com



Re: jsonb concatenate operator's semantics seem questionable

From
Andrew Dunstan
Date:
On 05/18/2015 04:54 PM, David G. Johnston wrote:
> On Mon, May 18, 2015 at 12:12 PM, Josh Berkus <josh@agliodbs.com
> <mailto:josh@agliodbs.com>>wrote:
>
>     On 05/18/2015 11:58 AM, Peter Geoghegan wrote:
>     > As you say, hstore isn't nested, and so this simply doesn't come up
>     > there. We have failed to adopt "||" to jsonb in a way that makes
>     > sense. We should have adopted it to jsonb in exactly the same way as
>     > the "@>" operator was.
>
>     The only question worth discussing is whether we change the
>     operator to
>     "+" (or, for that matter, something else).  I've seen your vote on
>     this,
>     so, does anyone else have an opinion on "+" vs. "||"? Preferably
>     with a
>     justification with some kind of grounding?
>
>
> How about a pair of operators?
>
> ​jsonb |> jsonb​ "push these keys into that jsonb, replacing existing
> values for any keys already present"
> ​jsonb <| jsonb​ "pull those keys into this jsonb, [...]"
>
> I do suspect, however, that any kind of deep concatenation/replacement
> algorithm is going to require end-user input and thus will want to be
> done strictly through functions as opposed to operators.
>
> Given the complexity of json I'm not convinced that either + or ||
> should end up being implemented.  Those are operators best left to
> simple types where the meaning of "add" and "concatenate" are well
> defined.  Even if we do end up with the deep "concatenation"
> algorithm, and decide to turn it into an operator, at this moment I
> would not choose || to be that operator.
>
> This entire thread is the justification for the last
> paragraph...unfortunately the rest is just my opinion.
>
>

I agree that we might need more than one merge operator, or possibly
none and a merge function with options. But I haven't yet seen an
argument that convinces me we need to rename the operation we do have.

cheers

andrew




Re: jsonb concatenate operator's semantics seem questionable

From
Jim Nasby
Date:
On 5/18/15 3:15 PM, Marko Tiikkaja wrote:
> On 2015-05-18 22:10, Josh Berkus wrote:
>> On 05/18/2015 01:04 PM, Ryan Pedela wrote:
>>> In the context of splitting shallow and deep merge into two operators, I
>>> think + is better for shallow and || better for deep. The reason for +
>>> is because many programming languages have this behavior. If I see the
>>> below code in language I have never used before:
>>>
>>> objC = objA + objB
>>>
>>> My default assumption is that + performs a shallow merge. Like I said, I
>>> would rather there just be one operator.
>>
>> Thank you, that helps.  Anyone else?
>
> If everyone thinks the operators mean different things, we could just
> not add any operators and only provide functions instead.

My $0.02: I would expect || to be what I want to use to add something to 
an existing JSON document, no matter what the path of what I'm adding 
is. In other words, deep merge. I certainly wouldn't expect it to be 
shallow.

If we get this wrong now, we'll be stuck with it forever. At a minimum I 
think we should use anything other than || until we can figure this out. 
That leaves || available for whichever case we decide on.

BTW, if people are set on shallow merge being || then I'd suggest ||| as 
the deep merge operator.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com



Re: jsonb concatenate operator's semantics seem questionable

From
Petr Jelinek
Date:
On 20/05/15 01:38, Jim Nasby wrote:
> On 5/18/15 3:15 PM, Marko Tiikkaja wrote:
>> On 2015-05-18 22:10, Josh Berkus wrote:
>>> On 05/18/2015 01:04 PM, Ryan Pedela wrote:
>>>> In the context of splitting shallow and deep merge into two
>>>> operators, I
>>>> think + is better for shallow and || better for deep. The reason for +
>>>> is because many programming languages have this behavior. If I see the
>>>> below code in language I have never used before:
>>>>
>>>> objC = objA + objB
>>>>
>>>> My default assumption is that + performs a shallow merge. Like I
>>>> said, I
>>>> would rather there just be one operator.
>>>
>>> Thank you, that helps.  Anyone else?
>>
>> If everyone thinks the operators mean different things, we could just
>> not add any operators and only provide functions instead.
>
> My $0.02: I would expect || to be what I want to use to add something to
> an existing JSON document, no matter what the path of what I'm adding
> is. In other words, deep merge. I certainly wouldn't expect it to be
> shallow.
>
> If we get this wrong now, we'll be stuck with it forever. At a minimum I
> think we should use anything other than || until we can figure this out.
> That leaves || available for whichever case we decide on.
>

I am of strong opinion that concat should be shallow by default. Again 
it's how jquery works by default, it's how python's dict.update works 
and you can find this behavior in other languages as well when dealing 
with nested hashes. It's also how json would behave if you'd just did 
string concatenation (removing the outermost curly brackets) and parse 
it to json afterwards.

I think this whole discussion shows primarily that it's by far not 
universally agreed if concatenation of json should be shallow or deep by 
default and AFAICS this is true even in javascript world so we don't 
really have where to look for precedents.

Given the above I would vote to just provide the function and leave out 
the || operator for now.

--  Petr Jelinek                  http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training &
Services



Re: jsonb concatenate operator's semantics seem questionable

From
Peter Geoghegan
Date:
On Tue, May 19, 2015 at 10:43 PM, Petr Jelinek <petr@2ndquadrant.com> wrote:
> I am of strong opinion that concat should be shallow by default. Again it's
> how jquery works by default, it's how python's dict.update works and you can
> find this behavior in other languages as well when dealing with nested
> hashes. It's also how json would behave if you'd just did string
> concatenation (removing the outermost curly brackets) and parse it to json
> afterwards.

As I said, that argument might be a good one if you were able to
subscript jsonb and have the update affect one particular subdocument.
You're not, though -- updating jsonb usually requires you to write an
SQL expression that evaluates to the final jsonb document that you'd
like to update a record to contain.

> I think this whole discussion shows primarily that it's by far not
> universally agreed if concatenation of json should be shallow or deep by
> default and AFAICS this is true even in javascript world so we don't really
> have where to look for precedents.
>
> Given the above I would vote to just provide the function and leave out the
> || operator for now.

I've said my piece; I think it's a mistake to use an operator that has
a certain association, the association that the concatenate operate
got from hstore. || is the operator broadly useful for updates in
people's minds. I think this *positioning* of the operator is a
mistake. I'll leave it at that.

-- 
Peter Geoghegan



Re: jsonb concatenate operator's semantics seem questionable

From
Andrew Dunstan
Date:
On 05/20/2015 02:11 AM, Peter Geoghegan wrote:
> On Tue, May 19, 2015 at 10:43 PM, Petr Jelinek <petr@2ndquadrant.com> wrote:
>> I am of strong opinion that concat should be shallow by default. Again it's
>> how jquery works by default, it's how python's dict.update works and you can
>> find this behavior in other languages as well when dealing with nested
>> hashes. It's also how json would behave if you'd just did string
>> concatenation (removing the outermost curly brackets) and parse it to json
>> afterwards.
> As I said, that argument might be a good one if you were able to
> subscript jsonb and have the update affect one particular subdocument.
> You're not, though -- updating jsonb usually requires you to write an
> SQL expression that evaluates to the final jsonb document that you'd
> like to update a record to contain.
>
>> I think this whole discussion shows primarily that it's by far not
>> universally agreed if concatenation of json should be shallow or deep by
>> default and AFAICS this is true even in javascript world so we don't really
>> have where to look for precedents.
>>
>> Given the above I would vote to just provide the function and leave out the
>> || operator for now.
> I've said my piece; I think it's a mistake to use an operator that has
> a certain association, the association that the concatenate operate
> got from hstore. || is the operator broadly useful for updates in
> people's minds. I think this *positioning* of the operator is a
> mistake. I'll leave it at that.
>


OK, I'm going to suggest a way out of this. ISTM the real trouble is 
that you're wanting to shoehorn a meaning onto || which many people 
don't think it should have. || doesn't mean "update" to me, it means 
"concatenate", which in the json context means
    '{ items1}' || '{items2}' = '{items1, items2}'

That's 100% consistent not only with hstore but with the use of this 
operator for strings and arrays. The fact that it's used as the way to 
update hstore is a byproduct of the way hstore works rather than a 
fundamental part of the meaning of ||. If hstore's rule were "first one 
wins" instead of "last one wins" we'd have to use something else.

But leaving that aside, your real gripe is that we don't currently have 
any way of adding a value somewhere nested inside json.

So Dmitry, at my suggestion, has come up with a way of doing that, by 
adding a parameter to jsonb_replace(). If this parameter is set to true 
(it defaults to false) and the key or array element pointed to by the 
last element of the path doesn't exist, it gets created.

Examples:
   andrew=# select   jsonb_replace('{"a":1,"b":[0,1,2],"c":{"d":4}}','{c,e}','{"foo":"bar"}',   true);
         jsonb_replace   --------------------------------------------------------------     {"a": 1, "b": [0, 1, 2],
"c":{"d": 4, "e": {"foo": "bar"}}}   (1 row)
 
   andrew=# select   jsonb_replace('{"a":1,"b":[0,1,2],"c":{"d":4}}','{c,e}','{"foo":"bar"}',   false);
jsonb_replace   -----------------------------------------     {"a": 1, "b": [0, 1, 2], "c": {"d": 4}}   (1 row)
 
   andrew=# select   jsonb_replace('{"a":1,"b":[0,1,2],"c":{"d":4}}','{c,e}','{"foo":"bar"}');
jsonb_replace  -----------------------------------------     {"a": 1, "b": [0, 1, 2], "c": {"d": 4}}   (1 row)
 


This seems to me a much more straightforward way of adding a value 
inside a jsonb than any operator can offer.

This is actually a tiny change - less than 200 lines - and given the 
evident angst over this issue, I'm prepared to incorporate it. I'm still 
working on the array piece, will have it done later today, but the 
object field piece just works.

If we do we might want to reconsider the name of jsonb_replace - maybe 
call it jsonb_set.

So, can we bend the rules just a tad to do this and (I hope) make a lot 
of people a lot happier?

cheers

andrew



Re: jsonb concatenate operator's semantics seem questionable

From
Alvaro Herrera
Date:
Petr Jelinek wrote:
> On 20/05/15 01:38, Jim Nasby wrote:

> >If we get this wrong now, we'll be stuck with it forever. At a minimum I
> >think we should use anything other than || until we can figure this out.
> >That leaves || available for whichever case we decide on.
> 
> I am of strong opinion that concat should be shallow by default. Again it's
> how jquery works by default, it's how python's dict.update works and you can
> find this behavior in other languages as well when dealing with nested
> hashes. It's also how json would behave if you'd just did string
> concatenation (removing the outermost curly brackets) and parse it to json
> afterwards.
> 
> I think this whole discussion shows primarily that it's by far not
> universally agreed if concatenation of json should be shallow or deep by
> default and AFAICS this is true even in javascript world so we don't really
> have where to look for precedents.
> 
> Given the above I would vote to just provide the function and leave out the
> || operator for now.

+1 for providing just functions, not operators, for this JSON so-called
"concatenation".

But it is not really concatenation, is it?  This is more like a "merge"
operation.  I mean, if you told somebody that this operation is
concatenation  'xyz foo yxz' || 'bar' --> 'xyz bar yxz'

they would laugh at you pretty hard.  But that's precisely what is
happening here:

alvherre=# select jsonb '{"a": 1, "b": 2, "c": 3}' || '{"b": 4}';        ?column?         
--------------------------{"a": 1, "b": 4, "c": 3}
(1 fila)

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: jsonb concatenate operator's semantics seem questionable

From
Ryan Pedela
Date:
On Wed, May 20, 2015 at 12:34 PM, Andrew Dunstan <andrew@dunslane.net> wrote:

So Dmitry, at my suggestion, has come up with a way of doing that, by adding a parameter to jsonb_replace(). If this parameter is set to true (it defaults to false) and the key or array element pointed to by the last element of the path doesn't exist, it gets created.

+1 

Re: jsonb concatenate operator's semantics seem questionable

From
Josh Berkus
Date:
On 05/20/2015 11:34 AM, Andrew Dunstan wrote:
> So Dmitry, at my suggestion, has come up with a way of doing that, by
> adding a parameter to jsonb_replace(). If this parameter is set to true
> (it defaults to false) and the key or array element pointed to by the
> last element of the path doesn't exist, it gets created.

That does cover all bases, and users would be able to create the
operator which suits their particular use case easily.  It's also fairly
similar to how jsquery works, although the syntax is completely different.

But ... it's after feature freeze.  So, thoughts?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: jsonb concatenate operator's semantics seem questionable

From
Pavel Stehule
Date:
<p dir="ltr">It is like bugfix than new feature<div class="gmail_quote">Dne 20.5.2015 21:08 napsal uživatel "Josh
Berkus"<<a href="mailto:josh@agliodbs.com">josh@agliodbs.com</a>>:<br type="attribution" /><blockquote
class="gmail_quote"style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">On 05/20/2015 11:34 AM, Andrew
Dunstanwrote:<br /> > So Dmitry, at my suggestion, has come up with a way of doing that, by<br /> > adding a
parameterto jsonb_replace(). If this parameter is set to true<br /> > (it defaults to false) and the key or array
elementpointed to by the<br /> > last element of the path doesn't exist, it gets created.<br /><br /> That does
coverall bases, and users would be able to create the<br /> operator which suits their particular use case easily. 
It'salso fairly<br /> similar to how jsquery works, although the syntax is completely different.<br /><br /> But ...
it'safter feature freeze.  So, thoughts?<br /><br /> --<br /> Josh Berkus<br /> PostgreSQL Experts Inc.<br /><a
href="http://pgexperts.com"target="_blank">http://pgexperts.com</a><br /><br /><br /> --<br /> Sent via pgsql-hackers
mailinglist (<a href="mailto:pgsql-hackers@postgresql.org">pgsql-hackers@postgresql.org</a>)<br /> To make changes to
yoursubscription:<br /><a href="http://www.postgresql.org/mailpref/pgsql-hackers"
target="_blank">http://www.postgresql.org/mailpref/pgsql-hackers</a><br/></blockquote></div> 

Re: jsonb concatenate operator's semantics seem questionable

From
Tom Lane
Date:
Josh Berkus <josh@agliodbs.com> writes:
> That does cover all bases, and users would be able to create the
> operator which suits their particular use case easily.  It's also fairly
> similar to how jsquery works, although the syntax is completely different.

> But ... it's after feature freeze.  So, thoughts?

I think this could be seen as a correction/bug fix for a pre-freeze
feature.  We should not be too resistant to filing down rough edges
on new features, even if that involves a spec change.
        regards, tom lane



Re: jsonb concatenate operator's semantics seem questionable

From
Andres Freund
Date:
On 2015-05-20 15:37:15 -0400, Tom Lane wrote:
> Josh Berkus <josh@agliodbs.com> writes:
> > That does cover all bases, and users would be able to create the
> > operator which suits their particular use case easily.  It's also fairly
> > similar to how jsquery works, although the syntax is completely different.
> 
> > But ... it's after feature freeze.  So, thoughts?
> 
> I think this could be seen as a correction/bug fix for a pre-freeze
> feature.  We should not be too resistant to filing down rough edges
> on new features, even if that involves a spec change.

+1



Re: jsonb concatenate operator's semantics seem questionable

From
Andrew Dunstan
Date:
On 05/20/2015 03:37 PM, Tom Lane wrote:
> Josh Berkus <josh@agliodbs.com> writes:
>> That does cover all bases, and users would be able to create the
>> operator which suits their particular use case easily.  It's also fairly
>> similar to how jsquery works, although the syntax is completely different.
>> But ... it's after feature freeze.  So, thoughts?
> I think this could be seen as a correction/bug fix for a pre-freeze
> feature.  We should not be too resistant to filing down rough edges
> on new features, even if that involves a spec change.
>
>         


OK, I'll run with that. The rest is largely bikeshedding, really.

cheers

andrew




Re: jsonb concatenate operator's semantics seem questionable

From
Robert Haas
Date:
On Wed, May 20, 2015 at 3:42 PM, Andres Freund <andres@anarazel.de> wrote:
> On 2015-05-20 15:37:15 -0400, Tom Lane wrote:
>> Josh Berkus <josh@agliodbs.com> writes:
>> > That does cover all bases, and users would be able to create the
>> > operator which suits their particular use case easily.  It's also fairly
>> > similar to how jsquery works, although the syntax is completely different.
>>
>> > But ... it's after feature freeze.  So, thoughts?
>>
>> I think this could be seen as a correction/bug fix for a pre-freeze
>> feature.  We should not be too resistant to filing down rough edges
>> on new features, even if that involves a spec change.
>
> +1

+1, emphatically.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: jsonb concatenate operator's semantics seem questionable

From
Andrew Dunstan
Date:
On 05/21/2015 10:15 AM, Robert Haas wrote:
> On Wed, May 20, 2015 at 3:42 PM, Andres Freund <andres@anarazel.de> wrote:
>> On 2015-05-20 15:37:15 -0400, Tom Lane wrote:
>>> Josh Berkus <josh@agliodbs.com> writes:
>>>> That does cover all bases, and users would be able to create the
>>>> operator which suits their particular use case easily.  It's also fairly
>>>> similar to how jsquery works, although the syntax is completely different.
>>>> But ... it's after feature freeze.  So, thoughts?
>>> I think this could be seen as a correction/bug fix for a pre-freeze
>>> feature.  We should not be too resistant to filing down rough edges
>>> on new features, even if that involves a spec change.
>> +1
> +1, emphatically.
>


I appreciate your enthusiasm :-)

Here is a patch that renames jsonb_replace to jsonb_set with a boolean
create_missing flag that defaults to false (should we default it to
true?). With the flag set it's more or less upsert for jsonb. Without,
it's just update.

While the patch is sizable, it's mostly a) docs, b) regression changes,
and c) code changes from changing jsonb_replace to jsonb_set (and
replacePath to setPath in jsonfuncs.c). The actual significant code
changes are still quite small, around 200 lines.

I added a note in the docs to make the semantics of jsonb || jsonb much
more explicit.

This change really makes this set of jsonb features quite a bit more
compelling. I'm glad I thought of it - wish I had done so earlier. So
notwithstanding the controversy upthread, I think this is a good result.

cheers

andrew

Attachment

Re: jsonb concatenate operator's semantics seem questionable

From
Jim Nasby
Date:
On 5/21/15 4:25 PM, Andrew Dunstan wrote:
> Here is a patch that renames jsonb_replace to jsonb_set with a boolean
> create_missing flag that defaults to false (should we default it to
> true?). With the flag set it's more or less upsert for jsonb. Without,
> it's just update.

I think upsert is probably the more expected behavior.

Though, I'm also wondering if we should allow for throwing an error if 
path doesn't already exist (it looks like if create_missing is false it 
silently does nothing right now?)
-- 
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com



Re: jsonb concatenate operator's semantics seem questionable

From
Andrew Dunstan
Date:
On 05/22/2015 02:38 PM, Jim Nasby wrote:
> On 5/21/15 4:25 PM, Andrew Dunstan wrote:
>> Here is a patch that renames jsonb_replace to jsonb_set with a boolean
>> create_missing flag that defaults to false (should we default it to
>> true?). With the flag set it's more or less upsert for jsonb. Without,
>> it's just update.
>
> I think upsert is probably the more expected behavior.
>
> Though, I'm also wondering if we should allow for throwing an error if 
> path doesn't already exist (it looks like if create_missing is false 
> it silently does nothing right now?)

Yes, that's actually documented in the patch.

As for raising an error, in principle it's doable, but the code to 
detect it might get messy. Also, I don't want a huge number of knobs. So 
I'm excited about the idea.

cheers

andrew




Re: jsonb concatenate operator's semantics seem questionable

From
Peter Geoghegan
Date:
On Fri, May 22, 2015 at 11:59 AM, Andrew Dunstan <andrew@dunslane.net> wrote:
> As for raising an error, in principle it's doable, but the code to detect it
> might get messy. Also, I don't want a huge number of knobs. So I'm excited
> about the idea.

I think that that's a bad default behavior, although I don't think
that's what Jim means. Consider our experience with having subscript
operators throw errors -- it complicates certain cases (my complaint
at the time was about expression indexes, but there are others).

-- 
Peter Geoghegan



Re: jsonb concatenate operator's semantics seem questionable

From
Andrew Dunstan
Date:
On 05/22/2015 03:27 PM, Peter Geoghegan wrote:
> On Fri, May 22, 2015 at 11:59 AM, Andrew Dunstan <andrew@dunslane.net> wrote:
>> As for raising an error, in principle it's doable, but the code to detect it
>> might get messy. Also, I don't want a huge number of knobs. So I'm excited
>> about the idea.
> I think that that's a bad default behavior, although I don't think
> that's what Jim means. Consider our experience with having subscript
> operators throw errors -- it complicates certain cases (my complaint
> at the time was about expression indexes, but there are others).
>


I certainly agree about indexable operations. However this seems 
unlikely to be indexed, although I'm prepared to be educated on that point.

Still I'd rather not add yet another parameter to the function, and I 
certainly don't want to make throwing an error the only behaviour.


cheers

andrew



Re: jsonb concatenate operator's semantics seem questionable

From
Jim Nasby
Date:
On 5/22/15 2:44 PM, Andrew Dunstan wrote:
>
> On 05/22/2015 03:27 PM, Peter Geoghegan wrote:
>> On Fri, May 22, 2015 at 11:59 AM, Andrew Dunstan <andrew@dunslane.net>
>> wrote:
>>> As for raising an error, in principle it's doable, but the code to
>>> detect it
>>> might get messy. Also, I don't want a huge number of knobs. So I'm
>>> excited
>>> about the idea.
>> I think that that's a bad default behavior, although I don't think
>> that's what Jim means. Consider our experience with having subscript
>> operators throw errors -- it complicates certain cases (my complaint
>> at the time was about expression indexes, but there are others).
>>
>
>
> I certainly agree about indexable operations. However this seems
> unlikely to be indexed, although I'm prepared to be educated on that point.
>
> Still I'd rather not add yet another parameter to the function, and I
> certainly don't want to make throwing an error the only behaviour.

If instead of a create_missing boolean it accepted an enum we could 
handle both (since they're related). But I'd also like to avoid Yet More 
Knobs if possible.

I think there's essentially two scenarios for JSON usage; one where you 
want to be pretty paranoid about things like keys aren't missing, you're 
not trying to access a path that doesn't exist, etc. The other mode 
(what we have today) is when you really don't care much about that stuff 
and want the database to JustStoreIt. I don't know how many people would 
want the stricter mode, but it certainly seems painful to try and 
enforce that stuff today if you care about it.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com



Re: jsonb concatenate operator's semantics seem questionable

From
Tom Lane
Date:
Jim Nasby <Jim.Nasby@BlueTreble.com> writes:
> On 5/22/15 2:44 PM, Andrew Dunstan wrote:
>> Still I'd rather not add yet another parameter to the function, and I
>> certainly don't want to make throwing an error the only behaviour.

> If instead of a create_missing boolean it accepted an enum we could 
> handle both (since they're related). But I'd also like to avoid Yet More 
> Knobs if possible.

> I think there's essentially two scenarios for JSON usage; one where you 
> want to be pretty paranoid about things like keys aren't missing, you're 
> not trying to access a path that doesn't exist, etc. The other mode 
> (what we have today) is when you really don't care much about that stuff 
> and want the database to JustStoreIt. I don't know how many people would 
> want the stricter mode, but it certainly seems painful to try and 
> enforce that stuff today if you care about it.

ISTM that the use case for JSON is pretty much JustStoreIt.  If you had
strict structural expectations you'd probably have chosen a more
relational representation in the first place ... or else XML, which at
least has heard of schemas and validation.  So I definitely agree that
we need the no-error case, and am not that excited about having an
error-throwing variant.
        regards, tom lane



Re: jsonb concatenate operator's semantics seem questionable

From
Jim Nasby
Date:
On 5/22/15 4:54 PM, Tom Lane wrote:
> Jim Nasby <Jim.Nasby@BlueTreble.com> writes:
>> On 5/22/15 2:44 PM, Andrew Dunstan wrote:
>>> Still I'd rather not add yet another parameter to the function, and I
>>> certainly don't want to make throwing an error the only behaviour.
>
>> If instead of a create_missing boolean it accepted an enum we could
>> handle both (since they're related). But I'd also like to avoid Yet More
>> Knobs if possible.
>
>> I think there's essentially two scenarios for JSON usage; one where you
>> want to be pretty paranoid about things like keys aren't missing, you're
>> not trying to access a path that doesn't exist, etc. The other mode
>> (what we have today) is when you really don't care much about that stuff
>> and want the database to JustStoreIt. I don't know how many people would
>> want the stricter mode, but it certainly seems painful to try and
>> enforce that stuff today if you care about it.
>
> ISTM that the use case for JSON is pretty much JustStoreIt.  If you had
> strict structural expectations you'd probably have chosen a more
> relational representation in the first place ... or else XML, which at
> least has heard of schemas and validation.  So I definitely agree that
> we need the no-error case, and am not that excited about having an
> error-throwing variant.

I think the validation case would be if you're doing transforms or other 
things to the JSON in SQL, to make sure it's matching what you're 
expecting it to. For example, if you have something in json that 
actually has duplicated keys, if you simply cast that to jsonb then all 
but one of the dupes is silently dropped. I don't like just assuming 
that's OK. There's probably other cases like this.

That said, I don't think users have pushed our JSON stuff enough yet to 
do more than guess at these use cases. Presumably it will be easier to 
tell if this is a problem as people start using the more advanced 
operators and functions.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com



Re: jsonb concatenate operator's semantics seem questionable

From
Peter Geoghegan
Date:
On Fri, May 22, 2015 at 2:54 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I think there's essentially two scenarios for JSON usage; one where you
>> want to be pretty paranoid about things like keys aren't missing, you're
>> not trying to access a path that doesn't exist, etc. The other mode
>> (what we have today) is when you really don't care much about that stuff
>> and want the database to JustStoreIt. I don't know how many people would
>> want the stricter mode, but it certainly seems painful to try and
>> enforce that stuff today if you care about it.
>
> ISTM that the use case for JSON is pretty much JustStoreIt.  If you had
> strict structural expectations you'd probably have chosen a more
> relational representation in the first place ... or else XML, which at
> least has heard of schemas and validation.  So I definitely agree that
> we need the no-error case, and am not that excited about having an
> error-throwing variant.

I agree. The basic idea of JSON is that the schema is implicit. This
gives applications flexibility (but typically they require just a
little flexibility, and understand that JSON documents are still
fairly homogeneously structured).

Anyone that doesn't like that can just not use the JSON types, or can
use a check constraint with the JSON types.
-- 
Peter Geoghegan



Re: jsonb concatenate operator's semantics seem questionable

From
Peter Geoghegan
Date:
On Thu, May 21, 2015 at 2:25 PM, Andrew Dunstan <andrew@dunslane.net> wrote:
> This change really makes this set of jsonb features quite a bit more
> compelling. I'm glad I thought of it - wish I had done so earlier. So
> notwithstanding the controversy upthread, I think this is a good result.

I think that we should look into making jsonb support array-style
subscripting within updates (to update "nested subdatums" directly).
This would make the new concatenate operator a lot more compelling.
Also, UPDATE targetlists don't accept a table qualification in their
targetlist (for the assign-to column) because the parser similarly
needs to support updating composite type's "nested subdatums"
directly.

Having gone to the trouble of making the parser support this stuff (in
a way that makes us not follow the SQL standard in a couple of
places), we ought to have a similar capability for jsonb. I haven't
looked into it, but it seems like a good project for 9.6. I'm not
volunteering to undertake the project, though.
-- 
Peter Geoghegan



Re: jsonb concatenate operator's semantics seem questionable

From
Andrew Dunstan
Date:
On 05/24/2015 03:17 PM, Peter Geoghegan wrote:
> On Thu, May 21, 2015 at 2:25 PM, Andrew Dunstan <andrew@dunslane.net> wrote:
>> This change really makes this set of jsonb features quite a bit more
>> compelling. I'm glad I thought of it - wish I had done so earlier. So
>> notwithstanding the controversy upthread, I think this is a good result.
> I think that we should look into making jsonb support array-style
> subscripting within updates (to update "nested subdatums" directly).
> This would make the new concatenate operator a lot more compelling.
> Also, UPDATE targetlists don't accept a table qualification in their
> targetlist (for the assign-to column) because the parser similarly
> needs to support updating composite type's "nested subdatums"
> directly.
>
> Having gone to the trouble of making the parser support this stuff (in
> a way that makes us not follow the SQL standard in a couple of
> places), we ought to have a similar capability for jsonb. I haven't
> looked into it, but it seems like a good project for 9.6. I'm not
> volunteering to undertake the project, though.

Yes, sounds like it would be good. I too am not volunteering.

cheers

andrew



Re: jsonb concatenate operator's semantics seem questionable

From
Andres Freund
Date:
On 2015-05-24 12:17:35 -0700, Peter Geoghegan wrote:
> Having gone to the trouble of making the parser support this stuff (in
> a way that makes us not follow the SQL standard in a couple of
> places), we ought to have a similar capability for jsonb. I haven't
> looked into it, but it seems like a good project for 9.6. I'm not
> volunteering to undertake the project, though.

I'm not convinced. The array stuff requires ugly contortions in a bunch
of places, and it's likely going to be worse for jsonb.

Greetings,

Andres Freund



Re: jsonb concatenate operator's semantics seem questionable

From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> On 2015-05-24 12:17:35 -0700, Peter Geoghegan wrote:
>> Having gone to the trouble of making the parser support this stuff (in
>> a way that makes us not follow the SQL standard in a couple of
>> places), we ought to have a similar capability for jsonb. I haven't
>> looked into it, but it seems like a good project for 9.6. I'm not
>> volunteering to undertake the project, though.

> I'm not convinced. The array stuff requires ugly contortions in a bunch
> of places, and it's likely going to be worse for jsonb.

FWIW, I've got some interest myself in the idea of allowing subscripting
syntax to be applied to things other than plain arrays, which I think is
what Peter is proposing here.  You could imagine applying it to hstore,
for example, and ending up with something that acts like a Perl hash
(and even performs similarly, once you'd invented an expanded-object
representation for hstore).  Coming up with a non-ugly API for datatypes
would be the hard part.
        regards, tom lane



Re: jsonb concatenate operator's semantics seem questionable

From
Andrew Dunstan
Date:
On 05/24/2015 05:38 PM, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
>> On 2015-05-24 12:17:35 -0700, Peter Geoghegan wrote:
>>> Having gone to the trouble of making the parser support this stuff (in
>>> a way that makes us not follow the SQL standard in a couple of
>>> places), we ought to have a similar capability for jsonb. I haven't
>>> looked into it, but it seems like a good project for 9.6. I'm not
>>> volunteering to undertake the project, though.
>> I'm not convinced. The array stuff requires ugly contortions in a bunch
>> of places, and it's likely going to be worse for jsonb.
> FWIW, I've got some interest myself in the idea of allowing subscripting
> syntax to be applied to things other than plain arrays, which I think is
> what Peter is proposing here.  You could imagine applying it to hstore,
> for example, and ending up with something that acts like a Perl hash
> (and even performs similarly, once you'd invented an expanded-object
> representation for hstore).  Coming up with a non-ugly API for datatypes
> would be the hard part.
>
>             


Interesting, you do cast a wide net these days.

I imagine we'd have each type register a function along the lines of
   foo_set(target foo, newval element_of_foo, path variadic "any")   returns boolean


And then we'd turn
   set myfoo[bar][baz][blurfl] = someval


into
   foo_set(myfoo, someval, bar, baz, blurfl)


In the catalog I guess we'd need to store the oid of the function, and 
possibly oid of the element type (e.g. for jsonb it would just be 
jsonb), and some dependency information.

But I'm sure there a a great many wrinkles I haven't thought of.


cheers

andrew