Thread: BUG #9519: Allows storing scalar json, but fails when querying

BUG #9519: Allows storing scalar json, but fails when querying

From
alf.kristian@gmail.com
Date:
The following bug has been logged on the website:

Bug reference:      9519
Logged by:          Alf Kristian Støyle
Email address:      alf.kristian@gmail.com
PostgreSQL version: 9.3.2
Operating system:   Red Hat 4.6.3-2
Description:

Steps to reproduce:
create table jtest (data json);
=> CREATE TABLE

insert into jtest (data) values ('1');
=> INSERT 0 1

select data->>'foo' from jtest;
=> ERROR:  cannot extract element from a scalar


I think the insert should fail, since '1' is not valid JSON.

After the data is in the database every query using the ->> operator,
hitting the row containing '1' will fail.

Our database runs in Amazon RDS, 'select version();' returns:
PostgreSQL 9.3.2 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.6.3
20120306 (Red Hat 4.6.3-2), 64-bit

Re: BUG #9519: Allows storing scalar json, but fails when querying

From
Marko Tiikkaja
Date:
On 3/10/14, 4:09 PM, alf.kristian@gmail.com wrote:
> I think the insert should fail, since '1' is not valid JSON.

It's a valid "JSON value", it's just not an object.  If you only want to
store JSON objects in the table, consider using a CHECK constraint.


Regards,
Marko Tiikkaja

Re: BUG #9519: Allows storing scalar json, but fails when querying

From
John R Pierce
Date:
On 3/10/2014 12:52 PM, Marko Tiikkaja wrote:
> On 3/10/14, 4:09 PM, alf.kristian@gmail.com wrote:
>> I think the insert should fail, since '1' is not valid JSON.
>
> It's a valid "JSON value", it's just not an object.

then why does querying it fail ?



--
john r pierce                                      37N 122W
somewhere on the middle of the left coast

Re: BUG #9519: Allows storing scalar json, but fails when querying

From
Alf Kristian Støyle
Date:
If the JSON datatype accepts JSON values (not just objects), which I supposed is a valid approach, then shouldn't the JSON query operators be able to deal with that?

This is the "unfortunate" part:
select data->>'foo' from jtest;
=> ERROR:  cannot extract element from a scalar

So either, only accept JSON object (not simple JSON values), or make the JSON query operators work with JSON values as well.

Having to add an extra check constraint here is not obvious, imho.

For the record, we are working around this problem, through a small hack. Some other system is storing this kind of data in our database. We have reported a bug with them to fix it.

This is not a big problem for us, but we love Postgres, so we thought we should report this.

Cheers,
Alf



On 10 March 2014 20:52, Marko Tiikkaja <marko@joh.to> wrote:
On 3/10/14, 4:09 PM, alf.kristian@gmail.com wrote:
I think the insert should fail, since '1' is not valid JSON.

It's a valid "JSON value", it's just not an object.  If you only want to store JSON objects in the table, consider using a CHECK constraint.


Regards,
Marko Tiikkaja

Re: BUG #9519: Allows storing scalar json, but fails when querying

From
John R Pierce
Date:
On 3/10/2014 1:39 PM, Alf Kristian Støyle wrote:
> This is the "unfortunate" part:
> /select data->>'foo' from jtest;
> => ERROR:  cannot extract element from a scalar
> /
> So either, only accept JSON object (not simple JSON values), or make
> the JSON query operators work with JSON values as well.

the json ->> 'fieldname' operator has no meaning when applied to a
scalar value.    what would you suggest it should do?   return a NULL ?



--
john r pierce                                      37N 122W
somewhere on the middle of the left coast

Re: BUG #9519: Allows storing scalar json, but fails when querying

From
David Johnston
Date:
alf.kristian wrote
> If the JSON datatype accepts JSON values (not just objects), which I
> supposed is a valid approach, then shouldn't the JSON query operators be
> able to deal with that?
>
> This is the "unfortunate" part:
>
>
> *select data->>'foo' from jtest; => ERROR:  cannot extract element from a
> scalar*
> So either, only accept JSON object (not simple JSON values), or make the
> JSON query operators work with JSON values as well.
>
> Having to add an extra check constraint here is not obvious, imho.
>
> For the record, we are working around this problem, through a small hack.
> Some other system is storing this kind of data in our database. We have
> reported a bug with them to fix it.
>
> This is not a big problem for us, but we love Postgres, so we thought we
> should report this.
>
> Cheers,
> Alf
>
>
>
> On 10 March 2014 20:52, Marko Tiikkaja <

> marko@

> > wrote:
>
>> On 3/10/14, 4:09 PM,

> alf.kristian@

>  wrote:
>>
>>> I think the insert should fail, since '1' is not valid JSON.
>>>
>>
>> It's a valid "JSON value", it's just not an object.  If you only want to
>> store JSON objects in the table, consider using a CHECK constraint.
>>
>>
>> Regards,
>> Marko Tiikkaja
>>

Not really sure how robust you expect the system to be in the face of
polymorphic columns.

SELECT ('["a","b","c"]'::json)->>'not_a_key'  -- ERROR: cannot extract field
from a non-object

The system supposes that, at a structural level, you are dealing with
column-consistent data and so if you ask for something that does not make
sense (i.e., an object key when you have an array or a scalar) it will warn
you.

I guess, in theory, any de-referencing that does not find a valid target
could return NULL...though I'm not sure that is an improvement.

This is a relational database and so it is expected that a column defines a
single thing and that thing naturally can be one-of a scalar, object, or
array.  Given that underlying assumption - though likely never truly spelled
out anywhere in the documentation - allowing and then throwing a run-time
error when the specific sub-type of json does not match the given operator
makes sense.

David J.






--
View this message in context:
http://postgresql.1045698.n5.nabble.com/BUG-9519-Allows-storing-scalar-json-but-fails-when-querying-tp5795417p5795449.html
Sent from the PostgreSQL - bugs mailing list archive at Nabble.com.

Re: BUG #9519: Allows storing scalar json, but fails when querying

From
David Johnston
Date:
John R Pierce wrote
> On 3/10/2014 1:39 PM, Alf Kristian St=C3=B8yle wrote:
>> This is the "unfortunate" part:
>> /select data->>'foo' from jtest;
>> =3D> ERROR:  cannot extract element from a scalar
>> /
>> So either, only accept JSON object (not simple JSON values), or make=20
>> the JSON query operators work with JSON values as well.
>=20
> the json ->> 'fieldname' operator has no meaning when applied to a=20
> scalar value.    what would you suggest it should do?   return a NULL ?

"key does not exist" is represented by NULL if the operator is applied to a=
n
object so, in theory, since the key does not exist in a scalar the same
value - NULL - would not be unexpected.

From a theory perspective, and based upon typical database normalization
rules, the current behavior makes the most sense but it does force the user
to be consistent in defining of their JSON models - a little big-brother-is=
h
but if you are only going to support a single set of rules the more-strict
ones are generally preferable.

David J.




--
View this message in context: http://postgresql.1045698.n5.nabble.com/BUG-9=
519-Allows-storing-scalar-json-but-fails-when-querying-tp5795417p5795451.ht=
ml
Sent from the PostgreSQL - bugs mailing list archive at Nabble.com.

Re: BUG #9519: Allows storing scalar json, but fails when querying

From
Jeff Janes
Date:
On Mon, Mar 10, 2014 at 8:09 AM, <alf.kristian@gmail.com> wrote:

> The following bug has been logged on the website:
>
> Bug reference:      9519
> Logged by:          Alf Kristian St=F8yle
> Email address:      alf.kristian@gmail.com
> PostgreSQL version: 9.3.2
> Operating system:   Red Hat 4.6.3-2
> Description:
>
> Steps to reproduce:
> create table jtest (data json);
> =3D> CREATE TABLE
>
> insert into jtest (data) values ('1');
> =3D> INSERT 0 1
>
> select data->>'foo' from jtest;
> =3D> ERROR:  cannot extract element from a scalar
>
>
> I think the insert should fail, since '1' is not valid JSON.
>
> After the data is in the database every query using the ->> operator,
> hitting the row containing '1' will fail.
>

Lets say the value was instead {"a":1}.

Now every query using data->'a'->>'b' will fail when it hits that row.

So forbidding values does not fix the problem, it just moves it down a
level.

A possible solution is to make ->> return NULL (like it does for accessing
values of non-existent keys) rather than raise an error when used on a
scalar.  Whether this would be an improvement, I don't know.

Note that the construct:
data #> '{a,b}'
does return null in this case, and does not raise an error.  You could
argue that that is an inconsistency.  On the other hand, you could argue it
provides you with the flexibility to accomplish different things depending
on which you desire.

So if you want the NULL behavior, you could use this to get it:

data #>> '{foo}'


Cheers,

Jeff

Re: BUG #9519: Allows storing scalar json, but fails when querying

From
Christian Kruse
Date:
Hi,

On 10/03/14 14:09, David Johnston wrote:

> The system supposes that, at a structural level, you are dealing with
> column-consistent data and so if you ask for something that does not make
> sense (i.e., an object key when you have an array or a scalar) it will wa=
rn
> you.

Looking at the common usage of JSON it doesn't seem to be a good idea
to imply column-consistent JSON. Normally when JSON is used the
application ensures consistency and when a non-existent key is
accessed NULL is returned. I think this is expected behavior and we
should do it like that, too; if one wants it different, he should use
CHECK constraints.

Best regards,

--=20
 Christian Kruse               http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: BUG #9519: Allows storing scalar json, but fails when querying

From
Alf Kristian Støyle
Date:
The #> operator works in SELECT, e.g. does not fail on when JSON column contains JSON values. Thanks for the tip!

However when using it in the WHERE clause I get no result.

select * from jtest;
    data    
-------------
 1
 1
 {"a" : "b"}
(3 rows)


select data #> '{"a"}' from jtest;
 ?column?
----------
 
 
 "b"
(3 rows)


select data #> '{"a"}' from jtest where (data #> '{"a"}') = 'b';
ERROR:  operator does not exist: json = unknown
LINE 1: ...CT data #> '{"a"}' from jtest where (data #> '{"a"}') = 'b';
                                                                 ^
HINT:  No operator matches the given name and argument type(s). You might need to add explicit type casts.


select data #> '{"a"}' from jtest where (data #> '{"a"}')::text = 'b';
 ?column?
----------
(0 rows)


Am I doing a wrong conversion here, or is something else going on? If the data in the database did not contain scalar values, then ->> works fine in WHERE. The following is almost the query we are actually trying run (checking for existence):

select data->>'a' from jtest where data->>'a' = 'b';
 ?column?
----------
 b
(1 row)


Regarding the ->> operator, I think it is unfortunate behavior it fails like that, I suppose we were expecting NULL behavior. However we are working around this, so if you don't think this should change, then we are fine with that :)

Just a note though. It took us a while to track down the problem. We have a table with several million rows, and suddenly our queries started failing, since someone had started to insert scalars. Others might also struggle to figure out what is wrong if they bump into this behavior.

Cheers,
Alf

 


On 10 March 2014 22:42, Jeff Janes <jeff.janes@gmail.com> wrote:
On Mon, Mar 10, 2014 at 8:09 AM, <alf.kristian@gmail.com> wrote:
The following bug has been logged on the website:

Bug reference:      9519
Logged by:          Alf Kristian Støyle
Email address:      alf.kristian@gmail.com
PostgreSQL version: 9.3.2
Operating system:   Red Hat 4.6.3-2
Description:

Steps to reproduce:
create table jtest (data json);
=> CREATE TABLE

insert into jtest (data) values ('1');
=> INSERT 0 1


select data->>'foo' from jtest;
=> ERROR:  cannot extract element from a scalar


I think the insert should fail, since '1' is not valid JSON.

After the data is in the database every query using the ->> operator,
hitting the row containing '1' will fail.

Lets say the value was instead {"a":1}.

Now every query using data->'a'->>'b' will fail when it hits that row.

So forbidding values does not fix the problem, it just moves it down a level.

A possible solution is to make ->> return NULL (like it does for accessing values of non-existent keys) rather than raise an error when used on a scalar.  Whether this would be an improvement, I don't know.

Note that the construct:
data #> '{a,b}'
does return null in this case, and does not raise an error.  You could argue that that is an inconsistency.  On the other hand, you could argue it provides you with the flexibility to accomplish different things depending on which you desire.

So if you want the NULL behavior, you could use this to get it:

data #>> '{foo}'

 
Cheers,

Jeff

Re: BUG #9519: Allows storing scalar json, but fails when querying

From
David Johnston
Date:
Christian Kruse-4 wrote
> Hi,
>
> On 10/03/14 14:09, David Johnston wrote:
>
>> The system supposes that, at a structural level, you are dealing with
>> column-consistent data and so if you ask for something that does not make
>> sense (i.e., an object key when you have an array or a scalar) it will
>> warn
>> you.
>
> Looking at the common usage of JSON it doesn't seem to be a good idea
> to imply column-consistent JSON. Normally when JSON is used the
> application ensures consistency and when a non-existent key is
> accessed NULL is returned. I think this is expected behavior and we
> should do it like that, too; if one wants it different, he should use
> CHECK constraints.

The database is the application.  It's job is to store data so that,
multiple, other applications can access it and in such a way that those
applications do not have to enforce data integrity since the database has
taken care of that responsibility.

I can see it both ways and so maybe a set of null-returning functions and
operators need to be made standard so the user can choose which behavior is
desired.  I cannot see changing the behavior of the current operators since
these are in the wild - though turning errors into non-errors isn't really a
risk since no existing code can reasonably be said to rely on such behavior.
The advantage of avoiding data-specific errors is probably worth it.  It
makes finding problems harder, not easier, to locate if one doesn't make
liberal use of check constraints but personally I'd rather have some wrong
data interspersed with the good data instead of throwing out al the data
because one single value is bad - a value that may never make it to the
final result anyway.

David J.






--
View this message in context:
http://postgresql.1045698.n5.nabble.com/BUG-9519-Allows-storing-scalar-json-but-fails-when-querying-tp5795417p5795536.html
Sent from the PostgreSQL - bugs mailing list archive at Nabble.com.

Re: BUG #9519: Allows storing scalar json, but fails when querying

From
David Johnston
Date:
Alf Kristian St=C3=B8yle wrote
> select data #> '{"a"}' from jtest;
>  ?column?
> ----------
>=20
>=20
>  "b"
> (3 rows)
>=20
>=20
> select data #> '{"a"}' from jtest where (data #> '{"a"}')::text =3D 'b';
>  ?column?
> ----------
> (0 rows)
>=20
>=20
> Am I doing a wrong conversion here, or is something else going on? If the
> data in the database did not contain scalar values, then ->> works fine i=
n
> WHERE. The following is almost the query we are actually trying run
> (checking for existence):

Side not - please follow the example set by others and trim your quoting an=
d
bottom-post.

As to your comment - the example above shows that the where clause is
evaluating to:

Where '"b"' =3D 'b' -- which is false.

Maybe try #>> which is defined to return text instead of an object.

David J.





--
View this message in context: http://postgresql.1045698.n5.nabble.com/BUG-9=
519-Allows-storing-scalar-json-but-fails-when-querying-tp5795417p5795538.ht=
ml
Sent from the PostgreSQL - bugs mailing list archive at Nabble.com.

Re: BUG #9519: Allows storing scalar json, but fails when querying

From
Jeff Janes
Date:
On Tue, Mar 11, 2014 at 1:32 AM, Alf Kristian St=F8yle <alf.kristian@gmail.=
com
> wrote:

>
>
> select data #> '{"a"}' from jtest where (data #> '{"a"}') =3D 'b';
> ERROR:  operator does not exist: json =3D unknown
> LINE 1: ...CT data #> '{"a"}' from jtest where (data #> '{"a"}') =3D 'b';
>                                                                  ^
> HINT:  No operator matches the given name and argument type(s). You might
> need to add explicit type casts.
>

"#>>" returns text directly, just like ->> vs ->.


>
> select data #> '{"a"}' from jtest where (data #> '{"a"}')::text =3D 'b';
>  ?column?
> ----------
> (0 rows)
>
>
> Am I doing a wrong conversion here, or is something else going on?
>

If you put the  (data #> '{"a"}')::text construct in the select, you can
see what is going on.  pulling out the element as JSON quotes the value (if
it not a number), because that is what JSON values are supposed to be, and
then converting to text leaves the quotes in place.  So you are comparing
the 3 character '"b"' to the one character string 'b', and they are not
equal.  If you use #>>, it is pulled out as text in the first place and the
quotes are not put on there.



> If the data in the database did not contain scalar values, then ->> works
> fine in WHERE. The following is almost the query we are actually trying r=
un
> (checking for existence):
>
> select data->>'a' from jtest where data->>'a' =3D 'b';
>  ?column?
> ----------
>  b
> (1 row)
>
>
> Regarding the ->> operator, I think it is unfortunate behavior it fails
> like that, I suppose we were expecting NULL behavior. However we are
> working around this, so if you don't think this should change, then we ar=
e
> fine with that :)
>

I don't really have an opinion on that, it just isn't obvious which way is
better--I can see times I would want either one. There are people who have
thought about this much more deeply than I have, but they haven't shown up
on this thread yet.  (I think they are too busy over on the hackers list,
arguing over what behavior the next generation of json operators should
have.)

You can create a new operator with the behavior you want.  I would like
some simple notation one could add to an operator or function invocation
which means "catch errors and convert to null", as I have several plperl
functions which I have created in two forms, one with an eval block and one
without.  It would be nice to have one function with a run-time notation to
distinguish the behavior.

Cheers,

Jeff

Re: BUG #9519: Allows storing scalar json, but fails when querying

From
Alf Kristian Støyle
Date:
Thank you for your help, and thorough explanation.

Creating a new operator, is way out of my league, and we now have a good way to work around our problems. So we are very pleased :)

I suppose if no one else complains, this shouldn't be fixed/changed, and you shouldn't spend time creating a new operator.

Thanks again. Cheers,
Alf



On 11 March 2014 17:03, Jeff Janes <jeff.janes@gmail.com> wrote:
On Tue, Mar 11, 2014 at 1:32 AM, Alf Kristian Støyle <alf.kristian@gmail.com> wrote:


select data #> '{"a"}' from jtest where (data #> '{"a"}') = 'b';
ERROR:  operator does not exist: json = unknown
LINE 1: ...CT data #> '{"a"}' from jtest where (data #> '{"a"}') = 'b';
                                                                 ^
HINT:  No operator matches the given name and argument type(s). You might need to add explicit type casts.

"#>>" returns text directly, just like ->> vs ->.



select data #> '{"a"}' from jtest where (data #> '{"a"}')::text = 'b';
 ?column?
----------
(0 rows)


Am I doing a wrong conversion here, or is something else going on?

If you put the  (data #> '{"a"}')::text construct in the select, you can see what is going on.  pulling out the element as JSON quotes the value (if it not a number), because that is what JSON values are supposed to be, and then converting to text leaves the quotes in place.  So you are comparing the 3 character '"b"' to the one character string 'b', and they are not equal.  If you use #>>, it is pulled out as text in the first place and the quotes are not put on there.

 
If the data in the database did not contain scalar values, then ->> works fine in WHERE. The following is almost the query we are actually trying run (checking for existence):

select data->>'a' from jtest where data->>'a' = 'b';
 ?column?
----------
 b
(1 row)


Regarding the ->> operator, I think it is unfortunate behavior it fails like that, I suppose we were expecting NULL behavior. However we are working around this, so if you don't think this should change, then we are fine with that :)

I don't really have an opinion on that, it just isn't obvious which way is better--I can see times I would want either one. There are people who have thought about this much more deeply than I have, but they haven't shown up on this thread yet.  (I think they are too busy over on the hackers list, arguing over what behavior the next generation of json operators should have.)

You can create a new operator with the behavior you want.  I would like some simple notation one could add to an operator or function invocation which means "catch errors and convert to null", as I have several plperl functions which I have created in two forms, one with an eval block and one without.  It would be nice to have one function with a run-time notation to distinguish the behavior.
 
Cheers,

Jeff