Thread: [HACKERS] Cast jsonb to numeric, int, float, bool
Now the simplest way to extract booleans and numbers from json/jsonb is to cast it to text and then cast to the appropriate type: postgres=# select 'true'::jsonb::text::bool; bool ------ t postgres=# select '1.0'::jsonb::text::numeric; numeric --------- 1.0 This patch implements direct casts from jsonb numeric (jbvNumeric) to numeric, int4 and float8, and from jsonb bool (jbvBool) to bool. postgres=# select 'true'::jsonb::bool; bool ------ t postgres=# select '1.0'::jsonb::numeric; numeric --------- 1.0 Waiting for your feedback. If you find it useful, I can also add support of json and other types, such as smallint and bigint. -- Anastasia Lubennikova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
On 01.02.2017 14:21,Anastasia Lubennikova wrote: > Now the simplest way to extract booleans and numbers from json/jsonb is > to cast it to text and then cast to the appropriate type: ... > This patch implements direct casts from jsonb numeric (jbvNumeric) to > numeric, int4 and float8, and from jsonb bool (jbvBool) to bool. Thank you for this patch. I always wanted to add such casts by myself. > If you find it useful, I can also add support of json and other types, > such as smallint and bigint. Yes, I'd like to have support for other types and maybe for json. Some comments about the code: I think it would be better to * add function for extraction of scalars from pseudo-arrays *iterate until WJB_DONE to pfree iterator Example: static bool JsonbGetScalar(Jsonb *jb, JsonbValue *v) { JsonbIterator *it; JsonbIteratorToken tok; JsonbValue jbv; if (!JB_ROOT_IS_SCALAR(jb)) return false; /* * A root scalar is stored as an array of one element, so we get the * array and then its first (and only)member. */ it = JsonbIteratorInit(&jb->root); tok = JsonbIteratorNext(&it, &jbv, true); Assert(tok == WJB_BEGIN_ARRAY); tok = JsonbIteratorNext(&it, v, true); Assert(tok == WJB_ELEM); tok = JsonbIteratorNext(&it, &jbv, true); Assert(tok == WJB_END_ARRAY); tok = JsonbIteratorNext(&it, &jbv, true); Assert(tok == WJB_DONE); return true; } Datum jsonb_int4(PG_FUNCTION_ARGS) { Jsonb *in = PG_GETARG_JSONB(0); JsonbValue v; if (!JsonbGetScalar(in, &v) || v.type != jbvNumeric) ereport(ERROR, (errcode(ERRCODE_INVALID_PARAMETER_VALUE), errmsg("key value must be json numeric"))); PG_RETURN_INT32(DatumGetInt32(DirectFunctionCall1(numeric_int4, NumericGetDatum(v.val.numeric)))); } -- Nikita Glukhov Postgres Professional:http://www.postgrespro.com The Russian Postgres Company
On 2/1/17 8:26 AM, Nikita Glukhov wrote: >> If you find it useful, I can also add support of json and other types, >> such as smallint and bigint. > > Yes, I'd like to have support for other types and maybe for json. There's been previous discussion about something similar, which would be better supporting "casting an unknown to smallint". IIRC there's some non-trivial problem with trying to support that. > Some comments about the code: I think it would be better to > * add function for extraction of scalars from pseudo-arrays > * iterate until WJB_DONE to pfree iterator I'm not sure what your intent here is, but if the idea is that a json array would magically cast to a bool or a number data type I think that's a bad idea. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532)
On 02.02.2017 01:07, Jim Nasby wrote: > On 2/1/17 8:26 AM, Nikita Glukhov wrote: >> Some comments about the code: I think it would be better to >> * add function for extraction of scalars from pseudo-arrays >> * iterate until WJB_DONE to pfree iterator > > I'm not sure what your intent here is, but if the idea is that a json array > would magically cast to a bool or a number data type I think that's a bad idea. My idea, of course, is not about casting any json array to a scalar. It is only about helper subroutine for extraction of top-level jsonb scalars that are always stored as one-element pseudo-arrays with special flag F_SCALAR in the array header. -- Nikita Glukhov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
01.02.2017 17:41, Anastasia Lubennikova: > Now the simplest way to extract booleans and numbers from json/jsonb > is to cast it > to text and then cast to the appropriate type: > > postgres=# select 'true'::jsonb::text::bool; > bool > ------ > t > postgres=# select '1.0'::jsonb::text::numeric; > numeric > --------- > 1.0 > > > This patch implements direct casts from jsonb numeric (jbvNumeric) to > numeric, int4 and float8, > and from jsonb bool (jbvBool) to bool. > > postgres=# select 'true'::jsonb::bool; > bool > ------ > t > > postgres=# select '1.0'::jsonb::numeric; > numeric > --------- > 1.0 > > > Waiting for your feedback. > If you find it useful, I can also add support of json and other types, > such as smallint and bigint. > > I totally forgot about this thread, but it is a small but useful feature. Maybe it's time to dust it off. This patch was made by request of one of our clients, and though they already use custom build, I think it would be better to have these casts in core. The patch is attached. There are no tests yet, since I don't really sure what set of types do we want to support. Now the patch provides jsonb to numeric, int4, float8 and bool. Also I have some doubts about castcontext. But 'explisit' seems to be the correct one here. -- Anastasia Lubennikova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Attachment
The following review has been posted through the commitfest application: make installcheck-world: tested, passed Implements feature: tested, passed Spec compliant: tested, passed Documentation: not tested We're using this patch and like it a lot. We store a lot of log-like data in s3-hosted .json.gz files. Sometimes we need to suddenly ingest them and calculate statistics and check the new what-ifs. We ingest data to simple single-column table with jsonb field, and then perform our calculation on top of it. Without this patch the only option to get data already parsed as numbers into jsonb into calculation is via binary-text-binarytransformation. We're relying on the patch for our custom spatial type extension and casting in it. https://github.com/gojuno/lostgis/blob/master/sql/types/type_tpv.sql#L21 For postgres installations without the patch we do WITH INOUT cast stubbing, https://github.com/gojuno/lostgis/blob/master/sql/types/__jsonb_casts.sql - but without performance benefits of raw C processing :) A thing this patch does not implement is cast from jsonb to bigint. That would be useful for transforming stored osm_id OpenStreetMap object identifiers. Right now we're stubbing it with jsonb::numeric::bigint cast, but the middle one would be nice to get rid of. The new status of this patch is: Ready for Committer
On 01.03.2018 00:43, Darafei Praliaskouski wrote:
The following review has been posted through the commitfest application: make installcheck-world: tested, passed Implements feature: tested, passed Spec compliant: tested, passed Documentation: not tested We're using this patch and like it a lot. We store a lot of log-like data in s3-hosted .json.gz files. Sometimes we need to suddenly ingest them and calculate statistics and check the new what-ifs. We ingest data to simple single-column table with jsonb field, and then perform our calculation on top of it. Without this patch the only option to get data already parsed as numbers into jsonb into calculation is via binary-text-binary transformation. We're relying on the patch for our custom spatial type extension and casting in it. https://github.com/gojuno/lostgis/blob/master/sql/types/type_tpv.sql#L21 For postgres installations without the patch we do WITH INOUT cast stubbing, https://github.com/gojuno/lostgis/blob/master/sql/types/__jsonb_casts.sql - but without performance benefits of raw C processing :) A thing this patch does not implement is cast from jsonb to bigint. That would be useful for transforming stored osm_id OpenStreetMap object identifiers. Right now we're stubbing it with jsonb::numeric::bigint cast, but the middle one would be nice to get rid of. The new status of this patch is: Ready for Committer
Attached new version of the patch in which I removed duplicated code using new subroutine JsonbExtractScalar(). I am not sure what is better to do when a JSON item has an unexpected type: to throw an error or to return SQL NULL. Also JSON nulls could be converted to SQL NULLs. I should note here that expression (jb -> 'key')::datatype can be rewritten with SQL/JSON function JSON_VALUE: JSON_VALUE(jb, '$.key' RETURNING datatype ERROR ON ERROR) But by standard JSON_VALUE tries to cast string JSON items to the specified datatype too, so JSON_VALUE('{"key": "123"}'::jsonb, '$.key' RETURNING int ERROR ON ERROR) does not throw an error but returns 123. We already have jsonpath operators @#, @*, so it might be very useful if our jsonb casts were equivalent to theirs SQL/JSON analogues. For example, (jb @# '$.key')::datatype could be equivalent to JSON_VALUE(jb, '$.key' RETURNING datatype ERROR ON ERROR) or JSON_VALUE(jb, '$.key' RETURNING datatype [NULL ON ERROR]). But if we want to have NULL ON ERROR behavior (which is default in SQL/JSON) in casts, then casts should not throw any errors.
-- Nikita Glukhov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Attachment
> Attached new version of the patch in which I removed duplicated code using new subroutine JsonbExtractScalar(). I am not sure what is better to do when a JSON item has an unexpected type: to throw an error or to return SQL NULL. Also JSON nulls could be converted to SQL NULLs.
I would expect it to follow whatever is happening in JavaScript.
I'm unsure about mapping of NULL and undefined/null though.
> I should note here that expression (jb -> 'key')::datatype can be rewritten with SQL/JSON function JSON_VALUE: JSON_VALUE(jb, '$.key' RETURNING datatype ERROR ON ERROR)
I would expect some casts to be implicit, so that chaining with other functions is possible:
select ST_MakePoint(r->'lon', r->'lat');
select sum(r->'income');
> But by standard JSON_VALUE tries to cast string JSON items to the specified datatype too, so JSON_VALUE('{"key": "123"}'::jsonb, '$.key' RETURNING int ERROR ON ERROR) does not throw an error but returns 123.
In actual JSON implementations number datatype is usually the one available in browsers, double precision.
For some numbers (I've met this with nanoseconds) it leads to value being changed on subsequent serializations and deserializations, so it's common to wrap them in a string to be unchanged.
So, I would expect that to work, but give me an exception if the datatype loses precision on conversion of specific value.
I would expect some casts to be implicit, so that chaining with other functions is possible:
select ST_MakePoint(r->'lon', r->'lat');
select sum(r->'income');
> But by standard JSON_VALUE tries to cast string JSON items to the specified datatype too, so JSON_VALUE('{"key": "123"}'::jsonb, '$.key' RETURNING int ERROR ON ERROR) does not throw an error but returns 123.
In actual JSON implementations number datatype is usually the one available in browsers, double precision.
For some numbers (I've met this with nanoseconds) it leads to value being changed on subsequent serializations and deserializations, so it's common to wrap them in a string to be unchanged.
So, I would expect that to work, but give me an exception if the datatype loses precision on conversion of specific value.
On 01.03.2018 11:19, Darafei "Komяpa" Praliaskouski wrote: > > Attached new version of the patch in which I removed duplicated code > using new subroutine JsonbExtractScalar(). I am not sure what is > better to do when a JSON item has an unexpected type: to throw an > error or to return SQL NULL. Also JSON nulls could be converted to SQL > NULLs. > > I would expect it to follow whatever is happening in JavaScript. > I'm unsure about mapping of NULL and undefined/null though. > > > I should note here that expression (jb -> 'key')::datatype can be > rewritten with SQL/JSON function JSON_VALUE: JSON_VALUE(jb, '$.key' > RETURNING datatype ERROR ON ERROR) > > I would expect some casts to be implicit, so that chaining with other > functions is possible: > > select ST_MakePoint(r->'lon', r->'lat'); > > select sum(r->'income'); > > > But by standard JSON_VALUE tries to cast string JSON items to the > specified datatype too, so JSON_VALUE('{"key": "123"}'::jsonb, '$.key' > RETURNING int ERROR ON ERROR) does not throw an error but returns 123. > > In actual JSON implementations number datatype is usually the one > available in browsers, double precision. > For some numbers (I've met this with nanoseconds) it leads to value > being changed on subsequent serializations and deserializations, so > it's common to wrap them in a string to be unchanged. > So, I would expect that to work, but give me an exception if the > datatype loses precision on conversion of specific value. I think that only cast to a numeric type can be made implicit, because it does not lose precision. So, sum(jsonb) will work, but ST_MakePoint(float8, float8) still will require an explicit cast. It seems that in JavaScript we can implicitly cast strings to numerics and unwrap one-element arrays. Examples from Chrome: > "123.45" / 3 41.15 > "1e100" / 3 3.333333333333333e+99 > "1e1000" / 3 Infinity > "foo" / 3 NaN > [123.45] / 3 41.15 > ["123.45"] / 3 41.15 > [123.45, 345] / 3 NaN > undefined / 3 NaN But null is converted to 0: > null / 3 0 > null + 3 3 Below are examples showing how it works with new casts and JSON_VALUE: =# SELECT '1234567890.1234567890'::jsonb::int2; ERROR: cannot cast type jsonb to smallint LINE 1: SELECT '1234567890.1234567890'::jsonb::int2; ^ =# SELECT '1234567890.1234567890'::jsonb::int4; int4 ------------ 1234567890 (1 row) =# SELECT '1234567890.1234567890'::jsonb::float4; ERROR: cannot cast type jsonb to real LINE 1: SELECT '1234567890.1234567890'::jsonb::float4; ^ =# SELECT '1234567890.1234567890'::jsonb::float8; float8 ------------------ 1234567890.12346 (1 row) =# SELECT '1234567890.1234567890'::jsonb::numeric; numeric ----------------------- 1234567890.1234567890 (1 row) =# SELECT '"1234567890.1234567890"'::jsonb::numeric; ERROR: jsonb value must be numeric =# SELECT 'null'::jsonb::numeric; ERROR: jsonb value must be numeric =# SELECT JSON_VALUE('1234567890.1234567890', '$' RETURNING int2 ERROR ON ERROR); ERROR: smallint out of range =# SELECT JSON_VALUE('1234567890.1234567890', '$' RETURNING int4 ERROR ON ERROR); json_value ------------ 1234567890 (1 row) =# SELECT JSON_VALUE('1234567890.1234567890', '$' RETURNING float4 ERROR ON ERROR); json_value ------------- 1.23457e+09 (1 row) =# SELECT JSON_VALUE('1234567890.1234567890', '$' RETURNING float8 ERROR ON ERROR); json_value ------------------ 1234567890.12346 (1 row) =# SELECT JSON_VALUE('1234567890.1234567890', '$' RETURNING numeric ERROR ON ERROR); json_value ----------------------- 1234567890.1234567890 (1 row) =# SELECT JSON_VALUE('"1234567890.1234567890"', '$' RETURNING int2 ERROR ON ERROR); ERROR: value "1234567890.1234567890" is out of range for type smallint =# SELECT JSON_VALUE('"1234567890.1234567890"', '$' RETURNING int4 ERROR ON ERROR); ERROR: invalid input syntax for integer: "1234567890.1234567890" =# SELECT JSON_VALUE('"1234567890.1234567890"', '$' RETURNING float4 ERROR ON ERROR); json_value ------------- 1.23457e+09 (1 row) =# SELECT JSON_VALUE('"1234567890.1234567890"', '$' RETURNING float8 ERROR ON ERROR); json_value ------------------ 1234567890.12346 (1 row) =# SELECT JSON_VALUE('"1234567890.1234567890"', '$' RETURNING numeric ERROR ON ERROR); json_value ----------------------- 1234567890.1234567890 (1 row) =# SELECT JSON_VALUE('"foo"', '$' RETURNING numeric ERROR ON ERROR); ERROR: invalid input syntax for type numeric: "foo" =# SELECT JSON_VALUE('null', '$' RETURNING numeric ERROR ON ERROR); json_value ------------ (1 row) =# SELECT JSON_VALUE('{}', '$' RETURNING numeric ERROR ON ERROR); ERROR: SQL/JSON scalar required =# SELECT JSON_VALUE('[]', '$' RETURNING numeric ERROR ON ERROR); ERROR: SQL/JSON scalar required -- Nikita Glukhov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
On 2/28/18 7:12 PM, Nikita Glukhov wrote: > On 01.03.2018 00:43, Darafei Praliaskouski wrote: >> >> The new status of this patch is: Ready for Committer > > Attached new version of the patch in which I removed duplicated code > using new subroutine JsonbExtractScalar(). I am not sure what is better > to do when a JSON item has an unexpected type: to throw an error or to > return SQL NULL. Also JSON nulls could be converted to SQL NULLs. I > should note here that expression (jb -> 'key')::datatype can be > rewritten with SQL/JSON function JSON_VALUE: JSON_VALUE(jb, '$.key' > RETURNING datatype ERROR ON ERROR) But by standard JSON_VALUE tries to > cast string JSON items to the specified datatype too, so > JSON_VALUE('{"key": "123"}'::jsonb, '$.key' RETURNING int ERROR ON > ERROR) does not throw an error but returns 123. We already have jsonpath > operators @#, @*, so it might be very useful if our jsonb casts were > equivalent to theirs SQL/JSON analogues. For example, (jb @# > '$.key')::datatype could be equivalent to JSON_VALUE(jb, '$.key' > RETURNING datatype ERROR ON ERROR) or JSON_VALUE(jb, '$.key' RETURNING > datatype [NULL ON ERROR]). But if we want to have NULL ON ERRORbehavior > (which is default in SQL/JSON) in casts, then casts should not throw any > errors. Since this patch was updated after being set a Ready for Committer and there appear to be some open questions, I have set it to Needs Review. Thanks, -- -David david@pgmasters.net
I think that only cast to a numeric type can be made implicit, because
it does not lose precision.
So, sum(jsonb) will work, but ST_MakePoint(float8, float8) still will
require an explicit cast.
What would be required to make ST_MakePoint(x, y) work?
Will ST_MakePoint(numeric, numeric) wrapper that performs the explicit be enough?
Below are examples showing how it works with new casts and JSON_VALUE:
=# SELECT '1234567890.1234567890'::jsonb::int2;
=# SELECT '1234567890.1234567890'::jsonb::int4;
=# SELECT '1234567890.1234567890'::jsonb::float4;
=# SELECT '1234567890.1234567890'::jsonb::float8;
=# SELECT '1234567890.1234567890'::jsonb::numeric;
=# SELECT '"1234567890.1234567890"'::jsonb::numeric;
I would expect these to be equivalent to:
select ('"1234567890.1234567890"'::jsonb->>0)::numeric;
it probably makes sense in other cases:
[local] gis@gis=# SELECT ('1234567890.1234567890'::jsonb->>0)::int2;
ERROR: 22003: value "1234567890.1234567890" is out of range for type smallint
LOCATION: pg_atoi, numutils.c:83
-- another error message here ("cannot fit into type") will be fine here:
[local] gis@gis=# SELECT ('1234567890.1234567890'::jsonb->>0)::int4;
ERROR: 22P02: invalid input syntax for integer: "1234567890.1234567890"
LOCATION: pg_atoi, numutils.c:106
[local] gis@gis=# SELECT ('1234567890.1234567890'::jsonb->>0)::float4;
┌─────────────┐
│ float4 │
├─────────────┤
│ 1.23457e+09 │
└─────────────┘
[local] gis@gis=# SELECT ('1234567890.1234567890'::jsonb->>0)::float8;
┌──────────────────┐
│ float8 │
├──────────────────┤
│ 1234567890.12346 │
└──────────────────┘
[local] gis@gis=# SELECT ('1234567890.1234567890'::jsonb->>0)::numeric;
┌───────────────────────┐
│ numeric │
├───────────────────────┤
│ 1234567890.1234567890 │
└───────────────────────┘
ERROR: 22003: value "1234567890.1234567890" is out of range for type smallint
LOCATION: pg_atoi, numutils.c:83
-- another error message here ("cannot fit into type") will be fine here:
[local] gis@gis=# SELECT ('1234567890.1234567890'::jsonb->>0)::int4;
ERROR: 22P02: invalid input syntax for integer: "1234567890.1234567890"
LOCATION: pg_atoi, numutils.c:106
[local] gis@gis=# SELECT ('1234567890.1234567890'::jsonb->>0)::float4;
┌─────────────┐
│ float4 │
├─────────────┤
│ 1.23457e+09 │
└─────────────┘
[local] gis@gis=# SELECT ('1234567890.1234567890'::jsonb->>0)::float8;
┌──────────────────┐
│ float8 │
├──────────────────┤
│ 1234567890.12346 │
└──────────────────┘
[local] gis@gis=# SELECT ('1234567890.1234567890'::jsonb->>0)::numeric;
┌───────────────────────┐
│ numeric │
├───────────────────────┤
│ 1234567890.1234567890 │
└───────────────────────┘
[local] gis@gis=# SELECT ('null'::jsonb->>0)::numeric;
┌─────────┐
│ numeric │
├─────────┤
│ ¤ │
└─────────┘
[local] gis@gis=# SELECT ('"1234567890.1234567890"'::jsonb->>0)::numeric;
┌───────────────────────┐
│ numeric │
├───────────────────────┤
│ 1234567890.1234567890 │
└───────────────────────┘
Does this make sense, or are there hidden issues in this logic? :)
┌─────────┐
│ numeric │
├─────────┤
│ ¤ │
└─────────┘
[local] gis@gis=# SELECT ('"1234567890.1234567890"'::jsonb->>0)::numeric;
┌───────────────────────┐
│ numeric │
├───────────────────────┤
│ 1234567890.1234567890 │
└───────────────────────┘
Does this make sense, or are there hidden issues in this logic? :)
Darafei Praliaskouski,
GIS Engineer / Juno Minsk
GIS Engineer / Juno Minsk
Nikita Glukhov <n.gluhov@postgrespro.ru> writes: > On 01.03.2018 11:19, Darafei "Komяpa" Praliaskouski wrote: >> I would expect some casts to be implicit, so that chaining with other >> functions is possible: > I think that only cast to a numeric type can be made implicit, because > it does not lose precision. I hadn't been following this thread particularly, but I happened to notice this bit, and I thought I'd better pop up to say No Way. There will be *no* implicit casts from json to any numeric type. We have learned the hard way that implicit cross-category casts are dangerous. See the advice in the CREATE CAST man page: It is wise to be conservative about marking casts as implicit. An overabundance of implicit casting paths can cause PostgreSQL to choose surprising interpretations of commands, or to be unable to resolve commands at all because there are multiple possible interpretations. A good rule of thumb is to make a cast implicitly invokable only for information-preserving transformations between types in the same general type category. For example, the cast from int2 to int4 can reasonably be implicit, but the cast from float8 to int4 should probably be assignment-only. Cross-type-category casts, such as text to int4, are best made explicit-only. regards, tom lane
Hi Tom,
I hadn't been following this thread particularly, but I happened to notice
this bit, and I thought I'd better pop up to say No Way. There will be
*no* implicit casts from json to any numeric type. We have learned the
hard way that implicit cross-category casts are dangerous.
I can see how a cast from text can be problematic, especially before the 'unknown' type.
But what would be the scenario of failure if we have an implicit cast from jsonb datatype (that likely already parsed the number internally, or knows it holds non-numeric value) to numeric, that returns an error if it's unable to perform the conversion?
The issue I think is that jsonb is special because it is in fact container. We've got dot-notation to unpack things from composite TYPE, and we've got arrow-notation to unpack things from jsonb, but such unpacking cannot take part in further computations, even though it's visually compatible.
What would be other options, if not implicit cast?
Darafei Praliaskouski,
GIS Engineer / Juno Minsk
GIS Engineer / Juno Minsk
=?UTF-8?Q?Darafei_=22Kom=D1=8Fpa=22_Praliaskouski?= <me@komzpa.net> writes: > But what would be the scenario of failure if we have an implicit cast from > jsonb datatype (that likely already parsed the number internally, or knows > it holds non-numeric value) to numeric, that returns an error if it's > unable to perform the conversion? One fundamental problem is this: what's special about a cast to numeric? There's no reason to assume that a jsonb field is more likely to be numeric than anything else. But basically you only get to have one of these. If we put this in, and then somebody else comes along and proposes an implicit cast to text, or an implicit cast to timestamp, then everything is broken, because the parser will no longer be able to resolve max(jsonb) --- it won't know which implicit cast to apply. Another fundamental problem is that implicit casts mask mistakes. If there's an implicit cast to numeric, that applies everywhere not only where it was what you meant. For some context on this you might go back to the archives around the time of 8.3, where we actually removed a bunch of implicit casts because they led to too many surprising behaviors. Restricting implicit casts to the same type category is a rule-of-thumb for reducing the potential surprise factor. The cast notation is important here because it lets/makes the user specify which semantics she wants for the converted value. I think it's about equally likely for people to be converting JSON fields to text, or (some form of) numeric, or datetime, so I don't think it's appropriate to privilege one of those over all others. > What would be other options, if not implicit cast? Explicit casts, ie (jsonvar->'fieldname')::numeric. Yeah, you have to write a bit more, but you don't get surprised by the way the parser interpreted your query. The other thing you can potentially do is use a variant operator name, as we did for text output with ->>. But that doesn't scale to very many result types, because it's impossible to choose readily mnemonic operator names. So I'd stick with the casts. regards, tom lane
Hello everyone, > Since this patch was updated after being set a Ready for Committer and > there appear to be some open questions, I have set it to Needs Review. I decided to take a look. The patch didn't apply after changes made in fd1a421f, but I fixed that. Also there were no tests. I fixed that too. I believe it's "Ready for Committer". -- Best regards, Aleksander Alekseev
Attachment
Hi! I took a look on patch and it seems to me it looks both incomplete and oveflowing. It suggests cast from numeric and bool, but for numeric it suggests only numeric, int4 and int8. This choice looks irrational. I think, it should support from/to numeric/bool/text only. If we want to have casts to from numeric to other numeric types then it should be full set (int2, int4, int8, float4, float8). Aleksander Alekseev wrote: > Hello everyone, > >> Since this patch was updated after being set a Ready for Committer and >> there appear to be some open questions, I have set it to Needs Review. > > I decided to take a look. > > The patch didn't apply after changes made in fd1a421f, but I fixed that. > Also there were no tests. I fixed that too. > > I believe it's "Ready for Committer". > -- Teodor Sigaev E-mail: teodor@sigaev.ru WWW: http://www.sigaev.ru/
On Mon, Mar 12, 2018 at 12:43:20PM -0400, Tom Lane wrote: > Another fundamental problem is that implicit casts mask mistakes. > If there's an implicit cast to numeric, that applies everywhere not > only where it was what you meant. For some context on this you might > go back to the archives around the time of 8.3, where we actually > removed a bunch of implicit casts because they led to too many > surprising behaviors. Restricting implicit casts to the same type > category is a rule-of-thumb for reducing the potential surprise > factor. +1 to that. >> What would be other options, if not implicit cast? > > Explicit casts, ie (jsonvar->'fieldname')::numeric. Yeah, you have > to write a bit more, but you don't get surprised by the way the > parser interpreted your query. > > The other thing you can potentially do is use a variant operator > name, as we did for text output with ->>. But that doesn't scale > to very many result types, because it's impossible to choose > readily mnemonic operator names. So I'd stick with the casts. And +1 to that. Implicit casts are cool things when they don't cause a potential loss of precision, which could be equivalent to losing some data, so sticking with casts looks more acceptable to me, and I would vote to not move on with this patch. -- Michael
Attachment
> I think, it should support from/to numeric/bool/text only. If we want to have > casts to from numeric to other numeric types then it should be full set (int2, > int4, int8, float4, float8). I was too optimistic about casting to/from text. It already exists and it works by differ way from suggested casts: 1) select '"foo"'::jsonb::text; -> "foo" // with "" 2) select '123'::jsonb::text; -> 123 3) select '123'::jsonb::int4; -> 123 Seems, sometime it would be desirable result in 1) as just 'foo' without "" decoration, but we cannot have 2 different explicit casts for the same types. So, I withdraw objections about text casting, but I still believe that we need one of two variants: 1) numeric/bool 2) bool, numeric/all variants of numeric types -- Teodor Sigaev E-mail: teodor@sigaev.ru WWW: http://www.sigaev.ru/
Thanks for everyone, pushed with some editorization Teodor Sigaev wrote: >> I think, it should support from/to numeric/bool/text only. If we want to have >> casts to from numeric to other numeric types then it should be full set (int2, >> int4, int8, float4, float8). > > I was too optimistic about casting to/from text. It already exists and it works > by differ way from suggested casts: > > 1) select '"foo"'::jsonb::text; -> "foo" // with "" > 2) select '123'::jsonb::text; -> 123 > 3) select '123'::jsonb::int4; -> 123 > > Seems, sometime it would be desirable result in 1) as just > 'foo' without "" decoration, but we cannot have 2 different explicit casts for > the same types. So, I withdraw objections about text casting, but I still > believe that we need one of two variants: > 1) numeric/bool > 2) bool, numeric/all variants of numeric types > > -- Teodor Sigaev E-mail: teodor@sigaev.ru WWW: http://www.sigaev.ru/
On Thu, Mar 29, 2018 at 9:35 AM, Teodor Sigaev <teodor@sigaev.ru> wrote: > Thanks for everyone, pushed with some editorization I would like to complain about this patch. First, I think that it would've been a better idea to use functions for this rather than operators, because now ::text does something totally unlike what ::int does, and that's confusing. If we had json_to_WHATEVER for various values of WHATEVER then all of the conversions could be spelled similarly; as the commit message right points out, the cast can only do one thing. Also, I think the error messages aren't great: +select '[]'::jsonb::bool; +ERROR: jsonb value must be boolean In this simple scenario, it's clear enough what has gone wrong, but in a more complicated case I suspect people will have a hard time figuring out what the source of that error message is. It seems like it would be better to say something about casting or converting in the error message, to give users a clue. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Hello Robert, > I would like to complain about this patch. First, I think that it > would've been a better idea to use functions for this rather than > operators, because now ::text does something totally unlike what ::int > does, and that's confusing. This is not entirely accurate. ::text casts any jsonb to text, e.g: ``` select '{}'::jsonb::text; text ------ {} (1 row) ``` This is exactly what ::bool and ::numeric casts do. Naturally not every jsonb value can be represented as boolean or numeric though. > If we had json_to_WHATEVER for various values of WHATEVER then all of > the conversions could be spelled similarly; as the commit message > right points out, the cast can only do one thing. I believe it's not as convenient for users as having casts. Also this would be inconsistent with the rest of the system since we already have int4 -> text, int4 -> bool and various other casts. > Also, I think the error messages aren't great: > > +select '[]'::jsonb::bool; > +ERROR: jsonb value must be boolean > > In this simple scenario, it's clear enough what has gone wrong, but in > a more complicated case I suspect people will have a hard time > figuring out what the source of that error message is. It seems like > it would be better to say something about casting or converting in the > error message, to give users a clue. I agree. How about "unable to cast jsonb value to %typename%"? -- Best regards, Aleksander Alekseev
Attachment
> I would like to complain about this patch. First, I think that it > would've been a better idea to use functions for this rather than > operators, because now ::text does something totally unlike what ::int > does, and that's confusing. If we had json_to_WHATEVER for various > values of WHATEVER then all of the conversions could be spelled > similarly; as the commit message right points out, the cast can only > do one thing. From another point of view, casting jsonb to text produces completely grounded result: we get a text correctly formatted as json. Other casts produce correct json but with non-text type. Casting jsonb with text is two-way casting: # select '123'::jsonb::text::jsonb, '"xxx"'::jsonb::text::jsonb; jsonb | jsonb -------+------- 123 | "xxx" But casting with numeric types and bool is not, but it could be done with intermediate cast to text (uppercase cast): # select '123'::jsonb::int::TEXT::jsonb; jsonb ------- 123 For completeness it's possible to add direct cast from numeric/boolean types to jsonb. Then such casts will be mutual. > Also, I think the error messages aren't great: > > +select '[]'::jsonb::bool; > +ERROR: jsonb value must be boolean > > In this simple scenario, it's clear enough what has gone wrong, but in > a more complicated case I suspect people will have a hard time > figuring out what the source of that error message is. It seems like > it would be better to say something about casting or converting in the > error message, to give users a clue. Agree, something like "could not convert jsonb value to boolean type. jsonb value must be scalar boolean type"? -- Teodor Sigaev E-mail: teodor@sigaev.ru WWW: http://www.sigaev.ru/
Hello Teodor, > For completeness it's possible to add direct cast from numeric/boolean types > to jsonb. Then such casts will be mutual. +1. I see no reason why we can't have int4 -> jsonb or bool -> jsonb casts. > Agree, something like "could not convert jsonb value to boolean type. jsonb > value must be scalar boolean type"? I checked what error messages are used currently: ``` # select 123::int4::jsonb; ERROR: cannot cast type integer to jsonb LINE 1: select 123::int4::jsonb; ``` I suggest to follow the same template, i.e. "cannot cast type jsonb to bool", etc. -- Best regards, Aleksander Alekseev
Attachment
Hello Teodor, > > Agree, something like "could not convert jsonb value to boolean type. jsonb > > value must be scalar boolean type"? > > I checked what error messages are used currently: > > ``` > # select 123::int4::jsonb; > ERROR: cannot cast type integer to jsonb > LINE 1: select 123::int4::jsonb; > ``` > > I suggest to follow the same template, i.e. "cannot cast type jsonb to > bool", etc. On second thought this message is misleading. We can cat jsonb to bool, the problem is that the document is not a bool value. -- Best regards, Aleksander Alekseev
Attachment
Aleksander Alekseev <a.alekseev@postgrespro.ru> writes: > Hello Teodor, > >> > Agree, something like "could not convert jsonb value to boolean type. jsonb >> > value must be scalar boolean type"? >> >> I checked what error messages are used currently: >> >> ``` >> # select 123::int4::jsonb; >> ERROR: cannot cast type integer to jsonb >> LINE 1: select 123::int4::jsonb; >> ``` >> >> I suggest to follow the same template, i.e. "cannot cast type jsonb to >> bool", etc. > > On second thought this message is misleading. We can cat jsonb to bool, > the problem is that the document is not a bool value. How about "cannot cast jsonb $json_type to $sql_type" where $json_type is the type inside the jsonb (e.g. string, number, boolean, array, object)? - ilmari -- "The surreality of the universe tends towards a maximum" -- Skud's Law "Never formulate a law or axiom that you're not prepared to live with the consequences of." -- Skud's Meta-Law
On Fri, Mar 30, 2018 at 11:21 AM, Dagfinn Ilmari Mannsåker <ilmari@ilmari.org> wrote: > How about "cannot cast jsonb $json_type to $sql_type" where $json_type > is the type inside the jsonb (e.g. string, number, boolean, array, > object)? Yes, that sounds pretty good. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
>> How about "cannot cast jsonb $json_type to $sql_type" where $json_type >> is the type inside the jsonb (e.g. string, number, boolean, array, >> object)? > > Yes, that sounds pretty good. Does anybody have an objections to patch? -- Teodor Sigaev E-mail: teodor@sigaev.ru WWW: http://www.sigaev.ru/
Attachment
Teodor Sigaev <teodor@sigaev.ru> writes: > Does anybody have an objections to patch? 1) Does this really pass muster from the translatability standpoint? I doubt it. I'd expect the translation of "cannot cast jsonb string to int4" to use a translated equivalent of "string", but this code will not do that. You can't really fix it by gettext'ing the result of JsonbTypeName, either, because then you're breaking the rule about not assembling translated strings from components. 2) Our usual convention for type names that are written in error messages is to use the SQL-standard names, that is "integer" and "double precision" and so on. For instance, float8in and int4in do not use the internal type names in their messages: regression=# select 'bogus'::float8; ERROR: invalid input syntax for type double precision: "bogus" LINE 1: select 'bogus'::float8; ^ regression=# select 'bogus'::int4; ERROR: invalid input syntax for integer: "bogus" LINE 1: select 'bogus'::int4; ^ So I think you made the wrong choices in jsonb_int4 etc. regards, tom lane
I wrote: > 1) Does this really pass muster from the translatability standpoint? > I doubt it. After further thought about that, it seems that what we typically don't try to translate is SQL-standard type names, that is, error messages along the line of "blah blah blah type %s" are considered fine. So the problem here is that you factorized the error reporting poorly. I think you want the callers to look like if (!JsonbExtractScalar(&in->root, &v) || v.type != jbvNumeric) cannotCastJsonbValue(v.type, "double precision"); where the subroutine contains the whole ereport() call, and its lookup table entries are e.g. gettext_noop("cannot cast jsonb string to type %s") regards, tom lane
>> 1) Does this really pass muster from the translatability standpoint? >> I doubt it. Huh, I missed that. > I think you want the callers to look like > > if (!JsonbExtractScalar(&in->root, &v) || v.type != jbvNumeric) > cannotCastJsonbValue(v.type, "double precision"); > > where the subroutine contains the whole ereport() call, and its lookup > table entries are e.g. > > gettext_noop("cannot cast jsonb string to type %s") Thanks for your idea, patch is attached -- Teodor Sigaev E-mail: teodor@sigaev.ru WWW: http://www.sigaev.ru/
Attachment
Teodor Sigaev <teodor@sigaev.ru> writes: > Thanks for your idea, patch is attached Looks mostly fine from here. A couple nitpicks: * s/translable/translatable/ * Personally I'd try harder to make the lookup table constant, that is + static const struct + { + enum jbvType type; + const char *msg; + } + messages[] = + { in hopes that it'd end up in the text segment. regards, tom lane