Thread: More new SQL/JSON item methods
Here are the brief descriptions for the same.
---
v1-0001-Implement-jsonpath-.bigint-.integer-and-.number-m.patch
This commit implements jsonpath .bigint(), .integer(), and .number()
methods. The JSON string or a numeric value is converted to the
bigint, int4, and numeric type representation.
---
v1-0002-Implement-.date-.time-.time_tz-.timestamp-and-.ti.patch
This commit implements jsonpath .date(), .time(), .time_tz(),
.timestamp(), .timestamp_tz() methods. The JSON string representing
a valid date/time is converted to the specific date or time type
representation.
The changes use the infrastructure of the .datetime() method and
perform the datatype conversion as appropriate. All these methods
accept no argument and use ISO datetime formats.
---
v1-0003-Implement-jsonpath-.boolean-and-.string-methods.patch
This commit implements jsonpath .boolean() and .string() methods.
.boolean() method converts the given JSON string, numeric, or boolean
value to the boolean type representation. In the numeric case, only
integers are allowed, whereas we use the parse_bool() backend function
to convert a string to a bool.
.string() method uses the datatype's out function to convert numeric
and various date/time types to the string representation.
---
v1-0004-Implement-jasonpath-.decimal-precision-scale-meth.patch
This commit implements jsonpath .decimal() method with optional
precision and scale. If precision and scale are provided, then
it is converted to the equivalent numerictypmod and applied to the
numeric number.
---
Suggestions/feedback/comments, please...
Thanks
Attachment
Hi, On 2023-08-29 03:05, Jeevan Chalke wrote: > This commit implements jsonpath .bigint(), .integer(), and .number() > --- > This commit implements jsonpath .date(), .time(), .time_tz(), > .timestamp(), .timestamp_tz() methods. > --- > This commit implements jsonpath .boolean() and .string() methods. Writing as an interested outsider to the jsonpath spec, my first question would be, is there a published jsonpath spec independent of PostgreSQL, and are these methods in it, and are the semantics identical? The question comes out of my experience on a PostgreSQL integration of XQuery/XPath, which was nontrivial because the w3 specs for those languages give rigorous definitions of their data types, independently of SQL, and a good bit of the work was squinting at those types and at the corresponding PostgreSQL types to see in what ways they were different, and what the constraints on converting them were. (Some of that squinting was already done by the SQL committee in the SQL/XML spec, which has plural pages on how those conversions have to happen, especially for the date/time types.) If I look in [1], am I looking in the right place for the most current jsonpath draft? (I'm a little squeamish reading as a goal "cover only essential parts of XPath 1.0", given that XPath 1.0 is the one w3 threw away so XPath 2.0 wouldn't have the same problems.) On details of the patch itself, I only have quick first impressions, like: - surely there's a more direct way to make boolean from numeric than to serialize the numeric and parse an int? - I notice that .bigint() and .integer() finish up by casting the value to numeric so the existing jbv->val.numeric can hold it. That may leave some opportunity on the table: there is another patch under way [2] that concerns quickly getting such result values from json operations to the surrounding SQL query. That could avoid the trip through numeric completely if the query wants a bigint, if there were a val.bigint in JsonbValue. But of course that would complicate everything else that touches JsonbValue. Is there a way for a jsonpath operator to determine that it's the terminal operation in the path, and leave a value in val.bigint if it is, or build a numeric if it's not? Then most other jsonpath code could go on expecting a numeric value is always in val.numeric, and the only code checking for a val.bigint would be code involved with getting the result value out to the SQL caller. Regards, -Chap [1] https://www.ietf.org/archive/id/draft-goessner-dispatch-jsonpath-00.html [2] https://commitfest.postgresql.org/44/4476/
On 2023-08-30 11:18, Chapman Flack wrote: > If I look in [1], am I looking in the right place for the most > current jsonpath draft? My bad, I see that it is not. Um if I look in [1'], am I then looking at the same spec you are? [1'] https://www.ietf.org/archive/id/draft-ietf-jsonpath-base-20.html Regards, -Chap
On 2023-Aug-30, Chapman Flack wrote: > Hi, > > On 2023-08-29 03:05, Jeevan Chalke wrote: > > This commit implements jsonpath .bigint(), .integer(), and .number() > > --- > > This commit implements jsonpath .date(), .time(), .time_tz(), > > .timestamp(), .timestamp_tz() methods. > > --- > > This commit implements jsonpath .boolean() and .string() methods. > > Writing as an interested outsider to the jsonpath spec, my first > question would be, is there a published jsonpath spec independent > of PostgreSQL, and are these methods in it, and are the semantics > identical? Looking at the SQL standard itself, in the 2023 edition section 9.46 "SQL/JSON path language: syntax and semantics", it shows this: <JSON method> ::= type <left paren> <right paren> | size <left paren> <right paren> | double <left paren> <right paren> | ceiling <left paren> <right paren> | floor <left paren> <right paren> | abs <left paren> <right paren> | datetime <left paren> [ <JSON datetime template> ] <right paren> | keyvalue <left paren> <right paren> | bigint <left paren> <right paren> | boolean <left paren> <right paren> | date <left paren> <right paren> | decimal <left paren> [ <precision> [ <comma> <scale> ] ] <right paren> | integer <left paren> <right paren> | number <left paren> <right paren> | string <left paren> <right paren> | time <left paren> [ <time precision> ] <right paren> | time_tz <left paren> [ <time precision> ] <right paren> | timestamp <left paren> [ <timestamp precision> ] <right paren> | timestamp_tz <left paren> [ <timestamp precision> ] <right paren> and then details, for each of those, rules like III) If JM specifies <double>, then: 1) For all j, 1 (one) ≤ j ≤ n, Case: a) If I_j is not a number or character string, then let ST be data exception — non-numeric SQL/JSON item (22036). b) Otherwise, let X be an SQL variable whose value is I_j. Let V_j be the result of CAST (X AS DOUBLE PRECISION) If this conversion results in an exception condition, then let ST be that exception condition. 2) Case: a) If ST is not successful completion, then the result of JAE is ST. b) Otherwise, the result of JAE is the SQL/JSON sequence V_1, ..., V_n. so at least superficially our implementation is constrained by what the SQL standard says to do, and we should verify that this implementation matches those rules. We don't necessarily need to watch what do other specs such as jsonpath itself. > The question comes out of my experience on a PostgreSQL integration > of XQuery/XPath, which was nontrivial because the w3 specs for those > languages give rigorous definitions of their data types, independently > of SQL, and a good bit of the work was squinting at those types and at > the corresponding PostgreSQL types to see in what ways they were > different, and what the constraints on converting them were. Yeah, I think the experience of the SQL committee with XML was pretty bad, as you carefully documented. I hope they don't make such a mess with JSON. -- Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
On 2023-08-30 12:28, Alvaro Herrera wrote: > Yeah, I think the experience of the SQL committee with XML was pretty > bad, as you carefully documented. I hope they don't make such a mess > with JSON. I guess the SQL committee was taken by surprise after basing something on Infoset and XPath 1.0 for 2003, and then w3 deciding those things needed to be scrapped and redone with the lessons learned. So the SQL committee had to come out with a rather different SQL/XML for 2006, but I'd say the 2003-2006 difference is the only real 'mess', and other than going back in time to unpublish 2003, I'm not sure how they'd have done better. > b) Otherwise, the result of JAE is the SQL/JSON sequence V_1, > ..., V_n. This has my Spidey sense tingling, as it seems very parallel to SQL/XML where the result of XMLQUERY is to have type XML(SEQUENCE), which is a type we do not have, and I'm not sure we have a type for "JSON sequence" either, unless SQL/JSON makes it equivalent to a JSON array (which I guess is conceivable, more easily than with XML). What does SQL/JSON say about this SQL/JSON sequence type and how it should behave? Regards, -Chap
On 8/30/23 19:20, Chapman Flack wrote: > On 2023-08-30 12:28, Alvaro Herrera wrote: >> b) Otherwise, the result of JAE is the SQL/JSON sequence V_1, >> ..., V_n. > > This has my Spidey sense tingling, as it seems very parallel to SQL/XML > where the result of XMLQUERY is to have type XML(SEQUENCE), which is a > type we do not have, and I'm not sure we have a type for "JSON sequence" > either, unless SQL/JSON makes it equivalent to a JSON array (which > I guess is conceivable, more easily than with XML). What does SQL/JSON > say about this SQL/JSON sequence type and how it should behave? The SQL/JSON data model comprises SQL/JSON items and SQL/JSON sequences. The components of the SQL/JSON data model are: — An SQL/JSON item is defined recursively as any of the following: • An SQL/JSON scalar, defined as a non-null value of any of the following predefined (SQL) types: character string with character set Unicode, numeric, Boolean, or datetime. • An SQL/JSON null, defined as a value that is distinct from any value of any SQL type. NOTE 109 — An SQL/JSON null is distinct from the SQL null value. • An SQL/JSON array, defined as an ordered list of zero or more SQL/JSON items, called the SQL/JSON elements of the SQL/JSON array. • An SQL/JSON object, defined as an unordered collection of zero or more SQL/JSON members, where an SQL/JSON member is a pair whose first value is a character string with character set Unicode and whose second value is an SQL/JSON item. The first value of an SQL/JSON member is called the key and the second value is called the bound value. — An SQL/JSON sequence is an ordered list of zero or more SQL/JSON items. -- Vik Fearing
On 2023-08-31 20:50, Vik Fearing wrote: > — An SQL/JSON item is defined recursively as any of the following: > ... > • An SQL/JSON array, defined as an ordered list of zero or more > SQL/JSON items, called the SQL/JSON elements of the SQL/JSON > array. > ... > — An SQL/JSON sequence is an ordered list of zero or more SQL/JSON > items. As I was thinking, because "an ordered list of zero or more SQL/JSON items" is also exactly what an SQL/JSON array is, it seems at least possible to implement things that are specified to return "SQL/JSON sequence" by having them return an SQL/JSON array (the kind of thing that isn't possible for XML(SEQUENCE), because there isn't any other XML construct that can subsume it). Still, it seems noteworthy that both terms are used in the spec, rather than saying the function in question should return a JSON array. Makes me wonder if there are some other details that make the two distinct. Regards, -Chap
Looking at the SQL standard itself, in the 2023 edition section 9.46
"SQL/JSON path language: syntax and semantics", it shows this:
<JSON method> ::=
type <left paren> <right paren>
| size <left paren> <right paren>
| double <left paren> <right paren>
| ceiling <left paren> <right paren>
| floor <left paren> <right paren>
| abs <left paren> <right paren>
| datetime <left paren> [ <JSON datetime template> ] <right paren>
| keyvalue <left paren> <right paren>
| bigint <left paren> <right paren>
| boolean <left paren> <right paren>
| date <left paren> <right paren>
| decimal <left paren> [ <precision> [ <comma> <scale> ] ] <right paren>
| integer <left paren> <right paren>
| number <left paren> <right paren>
| string <left paren> <right paren>
| time <left paren> [ <time precision> ] <right paren>
| time_tz <left paren> [ <time precision> ] <right paren>
| timestamp <left paren> [ <timestamp precision> ] <right paren>
| timestamp_tz <left paren> [ <timestamp precision> ] <right paren>
and then details, for each of those, rules like
III) If JM specifies <double>, then:
1) For all j, 1 (one) ≤ j ≤ n,
Case:
a) If I_j is not a number or character string, then let ST be data
exception — non-numeric SQL/JSON item (22036).
b) Otherwise, let X be an SQL variable whose value is I_j.
Let V_j be the result of
CAST (X AS DOUBLE PRECISION)
If this conversion results in an exception condition, then
let ST be that exception condition.
2) Case:
a) If ST is not successful completion, then the result of JAE
is ST.
b) Otherwise, the result of JAE is the SQL/JSON sequence V_1,
..., V_n.
so at least superficially our implementation is constrained by what the
SQL standard says to do, and we should verify that this implementation
matches those rules. We don't necessarily need to watch what do other
specs such as jsonpath itself.
> - surely there's a more direct way to make boolean from numeric
> than to serialize the numeric and parse an int?
But looking at the PostgreSQL conversion to bool, it doesn't allow floating
point values to be converted to boolean and only accepts int4. That's why I
did the int4 conversion.
On 29.08.23 09:05, Jeevan Chalke wrote: > v1-0001-Implement-jsonpath-.bigint-.integer-and-.number-m.patch > > This commit implements jsonpath .bigint(), .integer(), and .number() > methods. The JSON string or a numeric value is converted to the > bigint, int4, and numeric type representation. A comment that applies to all of these: These add various keywords, switch cases, documentation entries in some order. Are we happy with that? Should we try to reorder all of that for better maintainability or readability? > v1-0002-Implement-.date-.time-.time_tz-.timestamp-and-.ti.patch > > This commit implements jsonpath .date(), .time(), .time_tz(), > .timestamp(), .timestamp_tz() methods. The JSON string representing > a valid date/time is converted to the specific date or time type > representation. > > The changes use the infrastructure of the .datetime() method and > perform the datatype conversion as appropriate. All these methods > accept no argument and use ISO datetime formats. These should accept an optional precision argument. Did you plan to add that? > v1-0003-Implement-jsonpath-.boolean-and-.string-methods.patch > > This commit implements jsonpath .boolean() and .string() methods. This contains a compiler warning: ../src/backend/utils/adt/jsonpath_exec.c: In function 'executeItemOptUnwrapTarget': ../src/backend/utils/adt/jsonpath_exec.c:1162:86: error: 'tmp' may be used uninitialized [-Werror=maybe-uninitialized] > v1-0004-Implement-jasonpath-.decimal-precision-scale-meth.patch > > This commit implements jsonpath .decimal() method with optional > precision and scale. If precision and scale are provided, then > it is converted to the equivalent numerictypmod and applied to the > numeric number. This also contains compiler warnings: ../src/backend/utils/adt/jsonpath_exec.c: In function 'executeItemOptUnwrapTarget': ../src/backend/utils/adt/jsonpath_exec.c:1403:53: error: declaration of 'numstr' shadows a previous local [-Werror=shadow=compatible-local] ../src/backend/utils/adt/jsonpath_exec.c:1442:54: error: declaration of 'elem' shadows a previous local [-Werror=shadow=compatible-local] There is a typo in the commit message: "Implement jasonpath" Any reason this patch is separate from 0002? Isn't number() and decimal() pretty similar? You could also update src/backend/catalog/sql_features.txt in each patch (features T865 through T878).
On Fri, Oct 6, 2023 at 7:47 PM Peter Eisentraut <peter@eisentraut.org> wrote: > > On 29.08.23 09:05, Jeevan Chalke wrote: > > v1-0001-Implement-jsonpath-.bigint-.integer-and-.number-m.patch > > > > This commit implements jsonpath .bigint(), .integer(), and .number() > > methods. The JSON string or a numeric value is converted to the > > bigint, int4, and numeric type representation. > > A comment that applies to all of these: These add various keywords, > switch cases, documentation entries in some order. Are we happy with > that? Should we try to reorder all of that for better maintainability > or readability? > > > v1-0002-Implement-.date-.time-.time_tz-.timestamp-and-.ti.patch > > > > This commit implements jsonpath .date(), .time(), .time_tz(), > > .timestamp(), .timestamp_tz() methods. The JSON string representing > > a valid date/time is converted to the specific date or time type > > representation. > > > > The changes use the infrastructure of the .datetime() method and > > perform the datatype conversion as appropriate. All these methods > > accept no argument and use ISO datetime formats. > > These should accept an optional precision argument. Did you plan to add > that? compiler warnings issue resolved. I figured out how to use the precision argument. But I don't know how to get the precision argument in the parse stage. attached is my attempt to implement: select jsonb_path_query('"2017-03-10 11:11:01.123"', '$.timestamp(2)'); not that familiar with src/backend/utils/adt/jsonpath_gram.y. imitate decimal method failed. decimal has precision and scale two arguments. here only one argument. looking for hints.
Attachment
On 29.08.23 09:05, Jeevan Chalke wrote:
> v1-0001-Implement-jsonpath-.bigint-.integer-and-.number-m.patch
>
> This commit implements jsonpath .bigint(), .integer(), and .number()
> methods. The JSON string or a numeric value is converted to the
> bigint, int4, and numeric type representation.
A comment that applies to all of these: These add various keywords,
switch cases, documentation entries in some order. Are we happy with
that? Should we try to reorder all of that for better maintainability
or readability?
> v1-0002-Implement-.date-.time-.time_tz-.timestamp-and-.ti.patch
>
> This commit implements jsonpath .date(), .time(), .time_tz(),
> .timestamp(), .timestamp_tz() methods. The JSON string representing
> a valid date/time is converted to the specific date or time type
> representation.
>
> The changes use the infrastructure of the .datetime() method and
> perform the datatype conversion as appropriate. All these methods
> accept no argument and use ISO datetime formats.
These should accept an optional precision argument. Did you plan to add
that?
> v1-0003-Implement-jsonpath-.boolean-and-.string-methods.patch
>
> This commit implements jsonpath .boolean() and .string() methods.
This contains a compiler warning:
../src/backend/utils/adt/jsonpath_exec.c: In function
'executeItemOptUnwrapTarget':
../src/backend/utils/adt/jsonpath_exec.c:1162:86: error: 'tmp' may be
used uninitialized [-Werror=maybe-uninitialized]
> v1-0004-Implement-jasonpath-.decimal-precision-scale-meth.patch
>
> This commit implements jsonpath .decimal() method with optional
> precision and scale. If precision and scale are provided, then
> it is converted to the equivalent numerictypmod and applied to the
> numeric number.
This also contains compiler warnings:
../src/backend/utils/adt/jsonpath_exec.c: In function
'executeItemOptUnwrapTarget':
../src/backend/utils/adt/jsonpath_exec.c:1403:53: error: declaration of
'numstr' shadows a previous local [-Werror=shadow=compatible-local]
../src/backend/utils/adt/jsonpath_exec.c:1442:54: error: declaration of
'elem' shadows a previous local [-Werror=shadow=compatible-local]
There is a typo in the commit message: "Implement jasonpath"
Any reason this patch is separate from 0002? Isn't number() and
decimal() pretty similar?
You could also update src/backend/catalog/sql_features.txt in each patch
(features T865 through T878).
On Fri, Oct 6, 2023 at 7:47 PM Peter Eisentraut <peter@eisentraut.org> wrote:
>
> On 29.08.23 09:05, Jeevan Chalke wrote:
> > v1-0001-Implement-jsonpath-.bigint-.integer-and-.number-m.patch
> >
> > This commit implements jsonpath .bigint(), .integer(), and .number()
> > methods. The JSON string or a numeric value is converted to the
> > bigint, int4, and numeric type representation.
>
> A comment that applies to all of these: These add various keywords,
> switch cases, documentation entries in some order. Are we happy with
> that? Should we try to reorder all of that for better maintainability
> or readability?
>
> > v1-0002-Implement-.date-.time-.time_tz-.timestamp-and-.ti.patch
> >
> > This commit implements jsonpath .date(), .time(), .time_tz(),
> > .timestamp(), .timestamp_tz() methods. The JSON string representing
> > a valid date/time is converted to the specific date or time type
> > representation.
> >
> > The changes use the infrastructure of the .datetime() method and
> > perform the datatype conversion as appropriate. All these methods
> > accept no argument and use ISO datetime formats.
>
> These should accept an optional precision argument. Did you plan to add
> that?
compiler warnings issue resolved.
I figured out how to use the precision argument.
But I don't know how to get the precision argument in the parse stage.
attached is my attempt to implement: select
jsonb_path_query('"2017-03-10 11:11:01.123"', '$.timestamp(2)');
not that familiar with src/backend/utils/adt/jsonpath_gram.y. imitate
decimal method failed. decimal has precision and scale two arguments.
here only one argument.
looking for hints.
Thanks, Peter for the comments.On Fri, Oct 6, 2023 at 5:13 PM Peter Eisentraut <peter@eisentraut.org> wrote:On 29.08.23 09:05, Jeevan Chalke wrote:
> v1-0001-Implement-jsonpath-.bigint-.integer-and-.number-m.patch
>
> This commit implements jsonpath .bigint(), .integer(), and .number()
> methods. The JSON string or a numeric value is converted to the
> bigint, int4, and numeric type representation.
A comment that applies to all of these: These add various keywords,
switch cases, documentation entries in some order. Are we happy with
that? Should we try to reorder all of that for better maintainability
or readability?Yeah, that's the better suggestion. While implementing these methods, I was confused about where to put them exactly and tried keeping them in some logical place.I think once these methods get in, we can have a follow-up patch reorganizing all of these.
> v1-0002-Implement-.date-.time-.time_tz-.timestamp-and-.ti.patch
>
> This commit implements jsonpath .date(), .time(), .time_tz(),
> .timestamp(), .timestamp_tz() methods. The JSON string representing
> a valid date/time is converted to the specific date or time type
> representation.
>
> The changes use the infrastructure of the .datetime() method and
> perform the datatype conversion as appropriate. All these methods
> accept no argument and use ISO datetime formats.
These should accept an optional precision argument. Did you plan to add
that?Yeah, will add that.
> v1-0003-Implement-jsonpath-.boolean-and-.string-methods.patch
>
> This commit implements jsonpath .boolean() and .string() methods.
This contains a compiler warning:
../src/backend/utils/adt/jsonpath_exec.c: In function
'executeItemOptUnwrapTarget':
../src/backend/utils/adt/jsonpath_exec.c:1162:86: error: 'tmp' may be
used uninitialized [-Werror=maybe-uninitialized]
> v1-0004-Implement-jasonpath-.decimal-precision-scale-meth.patch
>
> This commit implements jsonpath .decimal() method with optional
> precision and scale. If precision and scale are provided, then
> it is converted to the equivalent numerictypmod and applied to the
> numeric number.
This also contains compiler warnings:Thanks, for reporting these warnings. I don't get those on my machine, thus missed them. Will fix them.
../src/backend/utils/adt/jsonpath_exec.c: In function
'executeItemOptUnwrapTarget':
../src/backend/utils/adt/jsonpath_exec.c:1403:53: error: declaration of
'numstr' shadows a previous local [-Werror=shadow=compatible-local]
../src/backend/utils/adt/jsonpath_exec.c:1442:54: error: declaration of
'elem' shadows a previous local [-Werror=shadow=compatible-local]
There is a typo in the commit message: "Implement jasonpath"Will fix.
Any reason this patch is separate from 0002? Isn't number() and
decimal() pretty similar?Since DECIMAL has precision and scale arguments, I have implemented that at the end. I tried merging that with 0001, but other patches ended up with the conflicts and thus I didn't merge that and kept it as a separate patch. But yes, logically it belongs to the 0001 group. My bad that I haven't put in that extra effort. Will do that in the next version. Sorry for the same.
You could also update src/backend/catalog/sql_features.txt in each patch
(features T865 through T878).OK.
Thanks--
Attachment
On Mon, Oct 23, 2023 at 3:29 PM Jeevan Chalke <jeevan.chalke@enterprisedb.com> wrote: > > Attached are all three patches fixing the above comments. > minor issue: /src/backend/utils/adt/jsonpath_exec.c 2531: Timestamp result; 2532: ErrorSaveContext escontext = {T_ErrorSaveContext}; 2533: 2534: /* Get a warning when precision is reduced */ 2535: time_precision = anytimestamp_typmod_check(false, 2536: time_precision); 2537: result = DatumGetTimestamp(value); 2538: AdjustTimestampForTypmod(&result, time_precision, 2539: (Node *) &escontext); 2540: if (escontext.error_occurred) 2541: RETURN_ERROR(ereport(ERROR, 2542: (errcode(ERRCODE_INVALID_ARGUMENT_FOR_SQL_JSON_DATETIME_FUNCTION), 2543: errmsg("numeric argument of jsonpath item method .%s() is out of range for type integer", 2544: jspOperationName(jsp->type))))); you already did anytimestamp_typmod_check. So this "if (escontext.error_occurred)" is unnecessary? A similar case applies to another function called anytimestamp_typmod_check. /src/backend/utils/adt/jsonpath_exec.c 1493: /* Convert numstr to Numeric with typmod */ 1494: Assert(numstr != NULL); 1495: noerr = DirectInputFunctionCallSafe(numeric_in, numstr, 1496: InvalidOid, dtypmod, 1497: (Node *) &escontext, 1498: &numdatum); 1499: 1500: if (!noerr || escontext.error_occurred) 1501: RETURN_ERROR(ereport(ERROR, 1502: (errcode(ERRCODE_NON_NUMERIC_SQL_JSON_ITEM), 1503: errmsg("string argument of jsonpath item method .%s() is not a valid representation of a decimal or number", 1504: jspOperationName(jsp->type))))); inside DirectInputFunctionCallSafe already "if (SOFT_ERROR_OCCURRED(escontext))" so "if (!noerr || escontext.error_occurred)" change to "if (!noerr)" should be fine?
Thanks, Peter for the comments.On Fri, Oct 6, 2023 at 5:13 PM Peter Eisentraut <peter@eisentraut.org> wrote:On 29.08.23 09:05, Jeevan Chalke wrote:
> v1-0001-Implement-jsonpath-.bigint-.integer-and-.number-m.patch
>
> This commit implements jsonpath .bigint(), .integer(), and .number()
> methods. The JSON string or a numeric value is converted to the
> bigint, int4, and numeric type representation.
A comment that applies to all of these: These add various keywords,
switch cases, documentation entries in some order. Are we happy with
that? Should we try to reorder all of that for better maintainability
or readability?Yeah, that's the better suggestion. While implementing these methods, I was confused about where to put them exactly and tried keeping them in some logical place.I think once these methods get in, we can have a follow-up patch reorganizing all of these.
I think it would be better to organize things how we want them before adding in more stuff.
cheers
andrew
-- Andrew Dunstan EDB: https://www.enterprisedb.com
On 2023-10-19 Th 02:06, Jeevan Chalke wrote:Thanks, Peter for the comments.On Fri, Oct 6, 2023 at 5:13 PM Peter Eisentraut <peter@eisentraut.org> wrote:On 29.08.23 09:05, Jeevan Chalke wrote:
> v1-0001-Implement-jsonpath-.bigint-.integer-and-.number-m.patch
>
> This commit implements jsonpath .bigint(), .integer(), and .number()
> methods. The JSON string or a numeric value is converted to the
> bigint, int4, and numeric type representation.
A comment that applies to all of these: These add various keywords,
switch cases, documentation entries in some order. Are we happy with
that? Should we try to reorder all of that for better maintainability
or readability?Yeah, that's the better suggestion. While implementing these methods, I was confused about where to put them exactly and tried keeping them in some logical place.I think once these methods get in, we can have a follow-up patch reorganizing all of these.
I think it would be better to organize things how we want them before adding in more stuff.
In some switch cases, they are still divided, like in flattenJsonPathParseItem(), where 2-arg, 1-arg, and no-arg cases are clubbed together. But I have tried to keep them in order in those subgroups.
I will rebase my patches for this task on this patch, but before doing so, I would like to get your views on this reordering.
Thanks
Attachment
Hello,On Tue, Oct 24, 2023 at 6:41 PM Andrew Dunstan <andrew@dunslane.net> wrote:
On 2023-10-19 Th 02:06, Jeevan Chalke wrote:Thanks, Peter for the comments.On Fri, Oct 6, 2023 at 5:13 PM Peter Eisentraut <peter@eisentraut.org> wrote:On 29.08.23 09:05, Jeevan Chalke wrote:
> v1-0001-Implement-jsonpath-.bigint-.integer-and-.number-m.patch
>
> This commit implements jsonpath .bigint(), .integer(), and .number()
> methods. The JSON string or a numeric value is converted to the
> bigint, int4, and numeric type representation.
A comment that applies to all of these: These add various keywords,
switch cases, documentation entries in some order. Are we happy with
that? Should we try to reorder all of that for better maintainability
or readability?Yeah, that's the better suggestion. While implementing these methods, I was confused about where to put them exactly and tried keeping them in some logical place.I think once these methods get in, we can have a follow-up patch reorganizing all of these.
I think it would be better to organize things how we want them before adding in more stuff.
I have tried reordering all the jsonpath Operators and Methods consistently. With this patch, they all appear in the same order when together in the group.
In some switch cases, they are still divided, like in flattenJsonPathParseItem(), where 2-arg, 1-arg, and no-arg cases are clubbed together. But I have tried to keep them in order in those subgroups.
I will rebase my patches for this task on this patch, but before doing so, I would like to get your views on this reordering.
This appears to be reasonable. Maybe we need to add a note in one or two places about maintaining the consistency?
cheers
andrew
-- Andrew Dunstan EDB: https://www.enterprisedb.com
On 2023-11-01 We 03:00, Jeevan Chalke wrote:Hello,On Tue, Oct 24, 2023 at 6:41 PM Andrew Dunstan <andrew@dunslane.net> wrote:
On 2023-10-19 Th 02:06, Jeevan Chalke wrote:Thanks, Peter for the comments.On Fri, Oct 6, 2023 at 5:13 PM Peter Eisentraut <peter@eisentraut.org> wrote:On 29.08.23 09:05, Jeevan Chalke wrote:
> v1-0001-Implement-jsonpath-.bigint-.integer-and-.number-m.patch
>
> This commit implements jsonpath .bigint(), .integer(), and .number()
> methods. The JSON string or a numeric value is converted to the
> bigint, int4, and numeric type representation.
A comment that applies to all of these: These add various keywords,
switch cases, documentation entries in some order. Are we happy with
that? Should we try to reorder all of that for better maintainability
or readability?Yeah, that's the better suggestion. While implementing these methods, I was confused about where to put them exactly and tried keeping them in some logical place.I think once these methods get in, we can have a follow-up patch reorganizing all of these.
I think it would be better to organize things how we want them before adding in more stuff.
I have tried reordering all the jsonpath Operators and Methods consistently. With this patch, they all appear in the same order when together in the group.
In some switch cases, they are still divided, like in flattenJsonPathParseItem(), where 2-arg, 1-arg, and no-arg cases are clubbed together. But I have tried to keep them in order in those subgroups.
I will rebase my patches for this task on this patch, but before doing so, I would like to get your views on this reordering.
This appears to be reasonable. Maybe we need to add a note in one or two places about maintaining the consistency?
Attachment
On Wed, Nov 1, 2023 at 3:49 PM Andrew Dunstan <andrew@dunslane.net> wrote:
On 2023-11-01 We 03:00, Jeevan Chalke wrote:Hello,On Tue, Oct 24, 2023 at 6:41 PM Andrew Dunstan <andrew@dunslane.net> wrote:
On 2023-10-19 Th 02:06, Jeevan Chalke wrote:Thanks, Peter for the comments.On Fri, Oct 6, 2023 at 5:13 PM Peter Eisentraut <peter@eisentraut.org> wrote:On 29.08.23 09:05, Jeevan Chalke wrote:
> v1-0001-Implement-jsonpath-.bigint-.integer-and-.number-m.patch
>
> This commit implements jsonpath .bigint(), .integer(), and .number()
> methods. The JSON string or a numeric value is converted to the
> bigint, int4, and numeric type representation.
A comment that applies to all of these: These add various keywords,
switch cases, documentation entries in some order. Are we happy with
that? Should we try to reorder all of that for better maintainability
or readability?Yeah, that's the better suggestion. While implementing these methods, I was confused about where to put them exactly and tried keeping them in some logical place.I think once these methods get in, we can have a follow-up patch reorganizing all of these.
I think it would be better to organize things how we want them before adding in more stuff.
I have tried reordering all the jsonpath Operators and Methods consistently. With this patch, they all appear in the same order when together in the group.
In some switch cases, they are still divided, like in flattenJsonPathParseItem(), where 2-arg, 1-arg, and no-arg cases are clubbed together. But I have tried to keep them in order in those subgroups.
I will rebase my patches for this task on this patch, but before doing so, I would like to get your views on this reordering.
This appears to be reasonable. Maybe we need to add a note in one or two places about maintaining the consistency?
+1Added a note in jsonpath.h where enums are defined.I have rebased all three patches over this reordering patch making 4 patches in the set.Let me know your views on the same.Thanks
Hi Jeevan,
I think these are in reasonably good shape, but there are a few things that concern me:
andrew@~=# select jsonb_path_query_array('[1.2]', '$[*].bigint()');
ERROR: numeric argument of jsonpath item method .bigint() is out of range for type bigint
I'm ok with this being an error, but I think the error message is wrong. It should be the "invalid input" message.
andrew@~=# select jsonb_path_query_array('[1.0]', '$[*].bigint()');
ERROR: numeric argument of jsonpath item method .bigint() is out of range for type bigint
Should we trim trailing dot+zeros from numeric values before trying to convert to bigint/int? If not, this too should be an "invalid input" case.
andrew@~=# select jsonb_path_query_array('[1.0]', '$[*].boolean()');
ERROR: numeric argument of jsonpath item method .boolean() is out of range for type boolean
It seems odd that any non-zero integer is true but not any non-zero numeric. Is that in the spec? If not I'd avoid trying to convert it to an integer first, and just check for infinity/nan before looking to see if it's zero.
The code for integer() and bigint() seems a bit duplicative, but I'm not sure there's a clean way of avoiding that.
The items for datetime types and string look OK.
cheers
andrew
-- Andrew Dunstan EDB: https://www.enterprisedb.com
Hi Jeevan,
I think these are in reasonably good shape, but there are a few things that concern me:
andrew@~=# select jsonb_path_query_array('[1.2]', '$[*].bigint()');
ERROR: numeric argument of jsonpath item method .bigint() is out of range for type bigintI'm ok with this being an error, but I think the error message is wrong. It should be the "invalid input" message.
andrew@~=# select jsonb_path_query_array('[1.0]', '$[*].bigint()');
ERROR: numeric argument of jsonpath item method .bigint() is out of range for type bigintShould we trim trailing dot+zeros from numeric values before trying to convert to bigint/int? If not, this too should be an "invalid input" case.
Unfortunately, I was using int8in() for the conversion of numeric values. We should be using numeric_int8() instead. However, there is no opt_error version of the same.
So, I have introduced a numeric_int8_opt_error() version just like we have one for int4, i.e. numeric_int4_opt_error(), to suppress the error. These changes are in the 0001 patch. (All other patch numbers are now increased by 1)
I have used this new function to fix this reported issue and used numeric_int4_opt_error() for integer conversion.
andrew@~=# select jsonb_path_query_array('[1.0]', '$[*].boolean()');
ERROR: numeric argument of jsonpath item method .boolean() is out of range for type booleanIt seems odd that any non-zero integer is true but not any non-zero numeric. Is that in the spec? If not I'd avoid trying to convert it to an integer first, and just check for infinity/nan before looking to see if it's zero.
# select 1.0::boolean;
ERROR: cannot cast type numeric to boolean
LINE 1: select 1.0::boolean;
The code for integer() and bigint() seems a bit duplicative, but I'm not sure there's a clean way of avoiding that.
The items for datetime types and string look OK.
Attachment
- v4-0001-Add-numeric_int8_opt_error-to-optionally-suppress.patch
- v4-0002-Reorganise-jsonpath-Operators-and-Methods.patch
- v4-0005-Implement-jsonpath-.boolean-and-.string-methods.patch
- v4-0003-Implement-jsonpath-.number-.decimal-precision-sca.patch
- v4-0004-Implement-jsonpath-.date-.time-.time_tz-.timestam.patch
On 07.12.23 14:24, Jeevan Chalke wrote: > We have the same issue with integer conversion and need a fix. > > Unfortunately, I was using int8in() for the conversion of numeric > values. We should be using numeric_int8() instead. However, there is no > opt_error version of the same. > > So, I have introduced a numeric_int8_opt_error() version just like we > have one for int4, i.e. numeric_int4_opt_error(), to suppress the error. > These changes are in the 0001 patch. (All other patch numbers are now > increased by 1) > > I have used this new function to fix this reported issue and used > numeric_int4_opt_error() for integer conversion. I have committed the 0001 and 0002 patches for now. The remaining patches look reasonable to me, but I haven't reviewed them in detail.
On 03.01.24 13:01, Peter Eisentraut wrote: > On 07.12.23 14:24, Jeevan Chalke wrote: >> We have the same issue with integer conversion and need a fix. >> >> Unfortunately, I was using int8in() for the conversion of numeric >> values. We should be using numeric_int8() instead. However, there is >> no opt_error version of the same. >> >> So, I have introduced a numeric_int8_opt_error() version just like we >> have one for int4, i.e. numeric_int4_opt_error(), to suppress the >> error. These changes are in the 0001 patch. (All other patch numbers >> are now increased by 1) >> >> I have used this new function to fix this reported issue and used >> numeric_int4_opt_error() for integer conversion. > > I have committed the 0001 and 0002 patches for now. > > The remaining patches look reasonable to me, but I haven't reviewed them > in detail. The 0002 patch had to be reverted, because we can't change the order of the enum values in JsonPathItemType. I have instead committed a different patch that adjusts the various switch cases to observe the current order of the enum. That also means that the remaining patches that add new item methods need to add the new enum values at the end and adjust the rest of their code accordingly.
On 03.01.24 13:01, Peter Eisentraut wrote:
> On 07.12.23 14:24, Jeevan Chalke wrote:
>> We have the same issue with integer conversion and need a fix.
>>
>> Unfortunately, I was using int8in() for the conversion of numeric
>> values. We should be using numeric_int8() instead. However, there is
>> no opt_error version of the same.
>>
>> So, I have introduced a numeric_int8_opt_error() version just like we
>> have one for int4, i.e. numeric_int4_opt_error(), to suppress the
>> error. These changes are in the 0001 patch. (All other patch numbers
>> are now increased by 1)
>>
>> I have used this new function to fix this reported issue and used
>> numeric_int4_opt_error() for integer conversion.
>
> I have committed the 0001 and 0002 patches for now.
>
> The remaining patches look reasonable to me, but I haven't reviewed them
> in detail.
The 0002 patch had to be reverted, because we can't change the order of
the enum values in JsonPathItemType. I have instead committed a
different patch that adjusts the various switch cases to observe the
current order of the enum. That also means that the remaining patches
that add new item methods need to add the new enum values at the end and
adjust the rest of their code accordingly.
I will work on rebasing and reorganizing the remaining patches.
Thanks
On Thu, Jan 4, 2024 at 2:34 AM Peter Eisentraut <peter@eisentraut.org> wrote:On 03.01.24 13:01, Peter Eisentraut wrote:
> On 07.12.23 14:24, Jeevan Chalke wrote:
>> We have the same issue with integer conversion and need a fix.
>>
>> Unfortunately, I was using int8in() for the conversion of numeric
>> values. We should be using numeric_int8() instead. However, there is
>> no opt_error version of the same.
>>
>> So, I have introduced a numeric_int8_opt_error() version just like we
>> have one for int4, i.e. numeric_int4_opt_error(), to suppress the
>> error. These changes are in the 0001 patch. (All other patch numbers
>> are now increased by 1)
>>
>> I have used this new function to fix this reported issue and used
>> numeric_int4_opt_error() for integer conversion.
>
> I have committed the 0001 and 0002 patches for now.
>
> The remaining patches look reasonable to me, but I haven't reviewed them
> in detail.
The 0002 patch had to be reverted, because we can't change the order of
the enum values in JsonPathItemType. I have instead committed a
different patch that adjusts the various switch cases to observe the
current order of the enum. That also means that the remaining patches
that add new item methods need to add the new enum values at the end and
adjust the rest of their code accordingly.Thanks, Peter.
I will work on rebasing and reorganizing the remaining patches.
Attachment
Attached are two small fixup patches for your patch set. In the first one, I simplified the grammar for the .decimal() method. It seemed a bit overkill to build a whole list structure when all we need are 0, 1, or 2 arguments. Per SQL standard, the precision and scale arguments are unsigned integers, so unary plus and minus signs are not supported. So my patch removes that support, but I didn't adjust the regression tests for that. Also note that in your 0002 patch, the datetime precision is similarly unsigned, so that's consistent. By the way, in your 0002 patch, don't see the need for the separate datetime_method grammar rule. You can fold that into accessor_op. Overall, I think it would be better if you combined all three of these patches into one. Right now, you have arranged these as incremental features, and as a result of that, the additions to the JsonPathItemType enum and the grammar keywords etc. are ordered in the way you worked on these features, I guess. It would be good to maintain a bit of sanity to put all of this together and order all the enums and everything else for example in the order they are in the sql_features.txt file (which is alphabetical, I suppose). At this point I suspect we'll end up committing this whole feature set together anyway, so we might as well organize it that way.
Attachment
Attached are two small fixup patches for your patch set.
In the first one, I simplified the grammar for the .decimal() method.
It seemed a bit overkill to build a whole list structure when all we
need are 0, 1, or 2 arguments.
Per SQL standard, the precision and scale arguments are unsigned
integers, so unary plus and minus signs are not supported. So my patch
removes that support, but I didn't adjust the regression tests for that.
# select '12345'::numeric(4,-2);
numeric
---------
12300
(1 row)
And thus thought of supporting those.
Do we want this JSON item method to behave differently here?
Also note that in your 0002 patch, the datetime precision is similarly
unsigned, so that's consistent.
By the way, in your 0002 patch, don't see the need for the separate
datetime_method grammar rule. You can fold that into accessor_op.
Overall, I think it would be better if you combined all three of these
patches into one. Right now, you have arranged these as incremental
features, and as a result of that, the additions to the JsonPathItemType
enum and the grammar keywords etc. are ordered in the way you worked on
these features, I guess. It would be good to maintain a bit of sanity
to put all of this together and order all the enums and everything else
for example in the order they are in the sql_features.txt file (which is
alphabetical, I suppose). At this point I suspect we'll end up
committing this whole feature set together anyway, so we might as well
organize it that way.
I will merge them all into one and will try to keep them in the order specified in sql_features.txt.
However, for documentation, it makes more sense to keep them in logical order than the alphabetical one. What are your views on this?
On Mon, Jan 15, 2024 at 7:41 PM Peter Eisentraut <peter@eisentraut.org> wrote:
Overall, I think it would be better if you combined all three of these
patches into one. Right now, you have arranged these as incremental
features, and as a result of that, the additions to the JsonPathItemType
enum and the grammar keywords etc. are ordered in the way you worked on
these features, I guess. It would be good to maintain a bit of sanity
to put all of this together and order all the enums and everything else
for example in the order they are in the sql_features.txt file (which is
alphabetical, I suppose). At this point I suspect we'll end up
committing this whole feature set together anyway, so we might as well
organize it that way.OK.
I will merge them all into one and will try to keep them in the order specified in sql_features.txt.
However, for documentation, it makes more sense to keep them in logical order than the alphabetical one. What are your views on this?
I agree that we should order the documentation logically. Users don't care how we organize the code etc, but they do care about docs have sensible structure.
cheers
andrew
-- Andrew Dunstan EDB: https://www.enterprisedb.com
On 17.01.24 10:03, Jeevan Chalke wrote: > I added unary '+' and '-' support as well and thus thought of having > separate rules altogether rather than folding those in. > > Per SQL standard, the precision and scale arguments are unsigned > integers, so unary plus and minus signs are not supported. So my patch > removes that support, but I didn't adjust the regression tests for that. > > > However, PostgreSQL numeric casting does support a negative scale. Here > is an example: > > # select '12345'::numeric(4,-2); > numeric > --------- > 12300 > (1 row) > > And thus thought of supporting those. > Do we want this JSON item method to behave differently here? Ok, it would make sense to support this in SQL/JSON as well. > I will merge them all into one and will try to keep them in the order > specified in sql_features.txt. > However, for documentation, it makes more sense to keep them in logical > order than the alphabetical one. What are your views on this? The documentation can be in a different order.
On 17.01.24 10:03, Jeevan Chalke wrote:
> I added unary '+' and '-' support as well and thus thought of having
> separate rules altogether rather than folding those in.
>
> Per SQL standard, the precision and scale arguments are unsigned
> integers, so unary plus and minus signs are not supported. So my patch
> removes that support, but I didn't adjust the regression tests for that.
>
>
> However, PostgreSQL numeric casting does support a negative scale. Here
> is an example:
>
> # select '12345'::numeric(4,-2);
> numeric
> ---------
> 12300
> (1 row)
>
> And thus thought of supporting those.
> Do we want this JSON item method to behave differently here?
Ok, it would make sense to support this in SQL/JSON as well.
> I will merge them all into one and will try to keep them in the order
> specified in sql_features.txt.
> However, for documentation, it makes more sense to keep them in logical
> order than the alphabetical one. What are your views on this?
The documentation can be in a different order.
Peter, I didn't understand why the changes you did in your 0002 patch were required here. I did run the pgindent, and it didn't complain to me. So, just curious to know more about the changes. I have not merged those changes in this single patch. However, if needed it can be cleanly applied on top of this single patch.
Attachment
On 18.01.24 15:25, Jeevan Chalke wrote: > Peter, I didn't understand why the changes you did in your 0002 patch > were required here. I did run the pgindent, and it didn't complain to > me. So, just curious to know more about the changes. I have not merged > those changes in this single patch. However, if needed it can be cleanly > applied on top of this single patch. I just thought it was a bit wasteful with vertical space. It's not essential.
On Thu, Jan 18, 2024 at 1:03 AM Peter Eisentraut <peter@eisentraut.org> wrote:On 17.01.24 10:03, Jeevan Chalke wrote:
> I added unary '+' and '-' support as well and thus thought of having
> separate rules altogether rather than folding those in.
>
> Per SQL standard, the precision and scale arguments are unsigned
> integers, so unary plus and minus signs are not supported. So my patch
> removes that support, but I didn't adjust the regression tests for that.
>
>
> However, PostgreSQL numeric casting does support a negative scale. Here
> is an example:
>
> # select '12345'::numeric(4,-2);
> numeric
> ---------
> 12300
> (1 row)
>
> And thus thought of supporting those.
> Do we want this JSON item method to behave differently here?
Ok, it would make sense to support this in SQL/JSON as well.OK. So with this, we don't need changes done in your 0001 patches.
> I will merge them all into one and will try to keep them in the order
> specified in sql_features.txt.
> However, for documentation, it makes more sense to keep them in logical
> order than the alphabetical one. What are your views on this?
The documentation can be in a different order.Thanks, Andrew and Peter for the confirmation.Attached merged single patch along these lines.
Thanks, I have pushed this.
cheers
andrew
-- Andrew Dunstan EDB: https://www.enterprisedb.com
Andrew Dunstan <andrew@dunslane.net> writes: > Thanks, I have pushed this. The buildfarm is pretty widely unhappy, mostly failing on select jsonb_path_query('1.23', '$.string()'); On a guess, I tried running that under valgrind, and behold it said ==00:00:00:05.637 435530== Conditional jump or move depends on uninitialised value(s) ==00:00:00:05.637 435530== at 0x8FD131: executeItemOptUnwrapTarget (jsonpath_exec.c:1547) ==00:00:00:05.637 435530== by 0x8FED03: executeItem (jsonpath_exec.c:626) ==00:00:00:05.637 435530== by 0x8FED03: executeNextItem (jsonpath_exec.c:1604) ==00:00:00:05.637 435530== by 0x8FCA58: executeItemOptUnwrapTarget (jsonpath_exec.c:956) ==00:00:00:05.637 435530== by 0x8FFDE4: executeItem (jsonpath_exec.c:626) ==00:00:00:05.637 435530== by 0x8FFDE4: executeJsonPath.constprop.30 (jsonpath_exec.c:612) ==00:00:00:05.637 435530== by 0x8FFF8C: jsonb_path_query_internal (jsonpath_exec.c:438) It's fairly obviously right about that: JsonbValue jbv; ... jb = &jbv; Assert(tmp != NULL); /* We must have set tmp above */ jb->val.string.val = (jb->type == jbvString) ? tmp : pstrdup(tmp); ^^^^^^^^^^^^^^^^^^^^^ Presumably, this is a mistaken attempt to test the type of the thing previously pointed to by "jb". On the whole, what I'd be inclined to do here is get rid of this test altogether and demand that every path through the preceding "switch" deliver a value that doesn't need pstrdup(). The only path that doesn't do that already is case jbvBool: tmp = (jb->val.boolean) ? "true" : "false"; break; and TBH I'm not sure that we really need a pstrdup there either. The constants are immutable enough. Is something likely to try to pfree the pointer later? I tried @@ -1544,7 +1544,7 @@ executeItemOptUnwrapTarget(JsonPathExecContext *cxt, JsonPathItem *jsp, jb = &jbv; Assert(tmp != NULL); /* We must have set tmp above */ - jb->val.string.val = (jb->type == jbvString) ? tmp : pstrdup(tmp); + jb->val.string.val = tmp; jb->val.string.len = strlen(jb->val.string.val); jb->type = jbvString; and that quieted valgrind for this particular query and still passes regression. (The reported crashes seem to be happening later during a recursive invocation, seemingly because JsonbType(jb) is returning garbage. So there may be another bug after this one.) regards, tom lane
On 2024-01-25 Th 14:31, Tom Lane wrote: > Andrew Dunstan <andrew@dunslane.net> writes: >> Thanks, I have pushed this. > The buildfarm is pretty widely unhappy, mostly failing on > > select jsonb_path_query('1.23', '$.string()'); > > On a guess, I tried running that under valgrind, and behold it said > > ==00:00:00:05.637 435530== Conditional jump or move depends on uninitialised value(s) > ==00:00:00:05.637 435530== at 0x8FD131: executeItemOptUnwrapTarget (jsonpath_exec.c:1547) > ==00:00:00:05.637 435530== by 0x8FED03: executeItem (jsonpath_exec.c:626) > ==00:00:00:05.637 435530== by 0x8FED03: executeNextItem (jsonpath_exec.c:1604) > ==00:00:00:05.637 435530== by 0x8FCA58: executeItemOptUnwrapTarget (jsonpath_exec.c:956) > ==00:00:00:05.637 435530== by 0x8FFDE4: executeItem (jsonpath_exec.c:626) > ==00:00:00:05.637 435530== by 0x8FFDE4: executeJsonPath.constprop.30 (jsonpath_exec.c:612) > ==00:00:00:05.637 435530== by 0x8FFF8C: jsonb_path_query_internal (jsonpath_exec.c:438) > > It's fairly obviously right about that: > > JsonbValue jbv; > ... > jb = &jbv; > Assert(tmp != NULL); /* We must have set tmp above */ > jb->val.string.val = (jb->type == jbvString) ? tmp : pstrdup(tmp); > ^^^^^^^^^^^^^^^^^^^^^ > > Presumably, this is a mistaken attempt to test the type > of the thing previously pointed to by "jb". > > On the whole, what I'd be inclined to do here is get rid > of this test altogether and demand that every path through > the preceding "switch" deliver a value that doesn't need > pstrdup(). The only path that doesn't do that already is > > case jbvBool: > tmp = (jb->val.boolean) ? "true" : "false"; > break; > > and TBH I'm not sure that we really need a pstrdup there > either. The constants are immutable enough. Is something > likely to try to pfree the pointer later? I tried > > @@ -1544,7 +1544,7 @@ executeItemOptUnwrapTarget(JsonPathExecContext *cxt, JsonPathItem *jsp, > > jb = &jbv; > Assert(tmp != NULL); /* We must have set tmp above */ > - jb->val.string.val = (jb->type == jbvString) ? tmp : pstrdup(tmp); > + jb->val.string.val = tmp; > jb->val.string.len = strlen(jb->val.string.val); > jb->type = jbvString; > > and that quieted valgrind for this particular query and still > passes regression. > > (The reported crashes seem to be happening later during a > recursive invocation, seemingly because JsonbType(jb) is > returning garbage. So there may be another bug after this one.) > > Argh! Will look, thanks. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
On 2024-01-25 Th 14:31, Tom Lane wrote: > Andrew Dunstan <andrew@dunslane.net> writes: >> Thanks, I have pushed this. > The buildfarm is pretty widely unhappy, mostly failing on > > select jsonb_path_query('1.23', '$.string()'); > > On a guess, I tried running that under valgrind, and behold it said > > ==00:00:00:05.637 435530== Conditional jump or move depends on uninitialised value(s) > ==00:00:00:05.637 435530== at 0x8FD131: executeItemOptUnwrapTarget (jsonpath_exec.c:1547) > ==00:00:00:05.637 435530== by 0x8FED03: executeItem (jsonpath_exec.c:626) > ==00:00:00:05.637 435530== by 0x8FED03: executeNextItem (jsonpath_exec.c:1604) > ==00:00:00:05.637 435530== by 0x8FCA58: executeItemOptUnwrapTarget (jsonpath_exec.c:956) > ==00:00:00:05.637 435530== by 0x8FFDE4: executeItem (jsonpath_exec.c:626) > ==00:00:00:05.637 435530== by 0x8FFDE4: executeJsonPath.constprop.30 (jsonpath_exec.c:612) > ==00:00:00:05.637 435530== by 0x8FFF8C: jsonb_path_query_internal (jsonpath_exec.c:438) > > It's fairly obviously right about that: > > JsonbValue jbv; > ... > jb = &jbv; > Assert(tmp != NULL); /* We must have set tmp above */ > jb->val.string.val = (jb->type == jbvString) ? tmp : pstrdup(tmp); > ^^^^^^^^^^^^^^^^^^^^^ > > Presumably, this is a mistaken attempt to test the type > of the thing previously pointed to by "jb". > > On the whole, what I'd be inclined to do here is get rid > of this test altogether and demand that every path through > the preceding "switch" deliver a value that doesn't need > pstrdup(). The only path that doesn't do that already is > > case jbvBool: > tmp = (jb->val.boolean) ? "true" : "false"; > break; > > and TBH I'm not sure that we really need a pstrdup there > either. The constants are immutable enough. Is something > likely to try to pfree the pointer later? I tried > > @@ -1544,7 +1544,7 @@ executeItemOptUnwrapTarget(JsonPathExecContext *cxt, JsonPathItem *jsp, > > jb = &jbv; > Assert(tmp != NULL); /* We must have set tmp above */ > - jb->val.string.val = (jb->type == jbvString) ? tmp : pstrdup(tmp); > + jb->val.string.val = tmp; > jb->val.string.len = strlen(jb->val.string.val); > jb->type = jbvString; > > and that quieted valgrind for this particular query and still > passes regression. Your fix looks sane. I also don't see why we need the pstrdup. > > (The reported crashes seem to be happening later during a > recursive invocation, seemingly because JsonbType(jb) is > returning garbage. So there may be another bug after this one.) > > I don't think so. AIUI The first call deals with the '$' and the second one deals with the '.string()', which is why we see the error on the second call. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
Andrew Dunstan <andrew@dunslane.net> writes: > On 2024-01-25 Th 14:31, Tom Lane wrote: >> (The reported crashes seem to be happening later during a >> recursive invocation, seemingly because JsonbType(jb) is >> returning garbage. So there may be another bug after this one.) > I don't think so. AIUI The first call deals with the '$' and the second > one deals with the '.string()', which is why we see the error on the > second call. There's something else going on, because I'm still getting the assertion failure on my Mac with this fix in place. Annoyingly, it goes away if I compile with -O0, so it's kind of hard to identify what's going wrong. regards, tom lane
I wrote: > There's something else going on, because I'm still getting the > assertion failure on my Mac with this fix in place. Annoyingly, > it goes away if I compile with -O0, so it's kind of hard to > identify what's going wrong. No, belay that: I must've got confused about which version I was testing. It's very unclear to me why the undefined reference causes the preceding Assert to misbehave, but that is clearly what's happening. Compiler bug maybe? My Mac has clang 15.0.0, and the unhappy buildfarm members are also late-model clang. Anyway, I did note that the preceding line res = jperOk; is dead code and might as well get removed while you're at it. regards, tom lane
On 2024-01-25 Th 15:33, Tom Lane wrote: > Andrew Dunstan <andrew@dunslane.net> writes: >> On 2024-01-25 Th 14:31, Tom Lane wrote: >>> (The reported crashes seem to be happening later during a >>> recursive invocation, seemingly because JsonbType(jb) is >>> returning garbage. So there may be another bug after this one.) >> I don't think so. AIUI The first call deals with the '$' and the second >> one deals with the '.string()', which is why we see the error on the >> second call. > There's something else going on, because I'm still getting the > assertion failure on my Mac with this fix in place. Annoyingly, > it goes away if I compile with -O0, so it's kind of hard to > identify what's going wrong. > > Curiouser and curiouser. On my Mac the error is manifest but the fix you suggested cures it. Built with -O2 -g, clang 15.0.0, Apple Silicon. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
On 2024-01-25 Th 15:58, Tom Lane wrote: > I wrote: >> There's something else going on, because I'm still getting the >> assertion failure on my Mac with this fix in place. Annoyingly, >> it goes away if I compile with -O0, so it's kind of hard to >> identify what's going wrong. > No, belay that: I must've got confused about which version I was > testing. It's very unclear to me why the undefined reference > causes the preceding Assert to misbehave, but that is clearly > what's happening. Compiler bug maybe? My Mac has clang 15.0.0, > and the unhappy buildfarm members are also late-model clang. > > Anyway, I did note that the preceding line > > res = jperOk; > > is dead code and might as well get removed while you're at it. > > OK, pushed those. Thanks. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
On 2024-01-25 Th 15:58, Tom Lane wrote:
> I wrote:
>> There's something else going on, because I'm still getting the
>> assertion failure on my Mac with this fix in place. Annoyingly,
>> it goes away if I compile with -O0, so it's kind of hard to
>> identify what's going wrong.
> No, belay that: I must've got confused about which version I was
> testing. It's very unclear to me why the undefined reference
> causes the preceding Assert to misbehave, but that is clearly
> what's happening. Compiler bug maybe? My Mac has clang 15.0.0,
> and the unhappy buildfarm members are also late-model clang.
>
> Anyway, I did note that the preceding line
>
> res = jperOk;
>
> is dead code and might as well get removed while you're at it.
>
>
OK, pushed those. Thanks.
cheers
andrew
--
Andrew Dunstan
EDB: https://www.enterprisedb.com
I have two possible issues in a recent commit. Commit 66ea94e8e6 has introduced the following messages: (Aplogizies in advance if the commit is not related to this thread.) jsonpath_exec.c:1287 > if (numeric_is_nan(num) || numeric_is_inf(num)) > RETURN_ERROR(ereport(ERROR, > (errcode(ERRCODE_NON_NUMERIC_SQL_JSON_ITEM), > errmsg("numeric argument of jsonpath item method .%s() is out of range for type decimal or number", > jspOperationName(jsp->type))))); :1387 > noerr = DirectInputFunctionCallSafe(numeric_in, numstr, ... > if (!noerr || escontext.error_occurred) > RETURN_ERROR(ereport(ERROR, > (errcode(ERRCODE_NON_NUMERIC_SQL_JSON_ITEM), > errmsg("string argument of jsonpath item method .%s() is not a valid representation of a decimal or number", They seem to be suggesting that PostgreSQL has the types "decimal" and "number". I know of the former, but I don't think PostgreSQL has the latter type. Perhaps the "number" was intended to refer to "numeric"? (And I think it is largely helpful if the given string were shown in the error message, but it would be another issue.) The same commit has introduced the following set of messages: > %s format is not recognized: "%s" > date format is not recognized: "%s" > time format is not recognized: "%s" > time_tz format is not recognized: "%s" > timestamp format is not recognized: "%s" > timestamp_tz format is not recognized: "%s" I believe that the first line was intended to cover all the others:p They are attached to this message separately. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Attachment
Kyotaro Horiguchi <horikyota.ntt@gmail.com> writes: > I have two possible issues in a recent commit. > Commit 66ea94e8e6 has introduced the following messages: >> errmsg("numeric argument of jsonpath item method .%s() is out of range for type decimal or number", > They seem to be suggesting that PostgreSQL has the types "decimal" and > "number". I know of the former, but I don't think PostgreSQL has the > latter type. Perhaps the "number" was intended to refer to "numeric"? Probably. But I would write just "type numeric". We do not generally acknowledge "decimal" as a separate type, because for us it's only an alias for numeric (there is not a pg_type entry for it). Also, that leads to the thought that "numeric argument ... is out of range for type numeric" seems either redundant or contradictory depending on how you look at it. So I suggest wording like argument "...input string here..." of jsonpath item method .%s() is out of range for type numeric > (And I think it is largely helpful if the given string were shown in > the error message, but it would be another issue.) Agreed, so I suggest the above. > The same commit has introduced the following set of messages: >> %s format is not recognized: "%s" >> date format is not recognized: "%s" >> time format is not recognized: "%s" >> time_tz format is not recognized: "%s" >> timestamp format is not recognized: "%s" >> timestamp_tz format is not recognized: "%s" > I believe that the first line was intended to cover all the others:p +1 regards, tom lane
At Sun, 28 Jan 2024 22:47:02 -0500, Tom Lane <tgl@sss.pgh.pa.us> wrote in > Kyotaro Horiguchi <horikyota.ntt@gmail.com> writes: > > They seem to be suggesting that PostgreSQL has the types > "decimal" and > > "number". I know of the former, but I don't think PostgreSQL has > the > > latter type. Perhaps the "number" was intended to refer to > "numeric"? > > Probably. But I would write just "type numeric". We do not > generally > acknowledge "decimal" as a separate type, because for us it's only > an > alias for numeric (there is not a pg_type entry for it). > > Also, that leads to the thought that "numeric argument ... is out of > range for type numeric" seems either redundant or contradictory > depending on how you look at it. So I suggest wording like > > argument "...input string here..." of jsonpath item method .%s() is > out of range for type numeric > > > (And I think it is largely helpful if the given string were shown > in > > the error message, but it would be another issue.) > > Agreed, so I suggest the above. Having said that, I'm a bit concerned about the case where an overly long string is given there. However, considering that we already have "invalid input syntax for type xxx: %x" messages (including for the numeric), this concern might be unnecessary. Another concern is that the input value is already a numeric, not a string. This situation occurs when the input is NaN of +-Inf. Although numeric_out could be used, it might cause another error. Therefore, simply writing out as "argument NaN and Infinity.." would be better. Once we resolve these issues, another question arises regarding on of the messages. In the case of NaN of Infinity, the message will be as the follows: "argument NaN or Infinity of jsonpath item method .%s() is out of range for type numeric" This message appears quite strange and puzzling. I suspect that the intended message was somewhat different. > > The same commit has introduced the following set of messages: > > >> %s format is not recognized: "%s" > >> date format is not recognized: "%s" > >> time format is not recognized: "%s" > >> time_tz format is not recognized: "%s" > >> timestamp format is not recognized: "%s" > >> timestamp_tz format is not recognized: "%s" > > > I believe that the first line was intended to cover all the > others:p > > +1 regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Kyotaro Horiguchi <horikyota.ntt@gmail.com> writes: > Having said that, I'm a bit concerned about the case where an overly > long string is given there. However, considering that we already have > "invalid input syntax for type xxx: %x" messages (including for the > numeric), this concern might be unnecessary. Yeah, we have not worried about that in the past. > Another concern is that the input value is already a numeric, not a > string. This situation occurs when the input is NaN of +-Inf. Although > numeric_out could be used, it might cause another error. Therefore, > simply writing out as "argument NaN and Infinity.." would be better. Oh! I'd assumed that we were discussing a string that we'd failed to convert to numeric. If the input is already numeric, then either the error is unreachable or what we're really doing is rejecting special values such as NaN on policy grounds. I would ask first if that policy is sane at all. (I'd lean to "not" --- if we allow it in the input JSON, why not in the output?) If it is sane, the error message needs to be far more specific. regards, tom lane
Kyotaro Horiguchi <horikyota.ntt@gmail.com> writes:
> Having said that, I'm a bit concerned about the case where an overly
> long string is given there. However, considering that we already have
> "invalid input syntax for type xxx: %x" messages (including for the
> numeric), this concern might be unnecessary.
Yeah, we have not worried about that in the past.
> Another concern is that the input value is already a numeric, not a
> string. This situation occurs when the input is NaN of +-Inf. Although
> numeric_out could be used, it might cause another error. Therefore,
> simply writing out as "argument NaN and Infinity.." would be better.
Oh! I'd assumed that we were discussing a string that we'd failed to
convert to numeric. If the input is already numeric, then either
the error is unreachable or what we're really doing is rejecting
special values such as NaN on policy grounds. I would ask first
if that policy is sane at all. (I'd lean to "not" --- if we allow
it in the input JSON, why not in the output?) If it is sane, the
error message needs to be far more specific.
regards, tom lane
Agree that the number is not a PostgreSQL type and needs a change. As Tom suggested, we can say "type numeric" here. However, I have seen two variants of error messages here: (1) When the input is numeric and (2) when the input is string. For first, we have error messages like:
numeric argument of jsonpath item method .%s() is out of range for type double precision
string argument of jsonpath item method .%s() is not a valid representation of a double precision number
The second form says "double precision number". So, in the decimal/number case, should we use "numeric number" and then similarly "big integer number"?
What if we use phrases like "for type double precision" at both places, like:
numeric argument of jsonpath item method .%s() is out of range for type double precision
string argument of jsonpath item method .%s() is not a valid representation for type double precision
With this, the rest will be like:
for type numeric
for type bigint
for type integer
Suggestions?
---
Showing input string in the error message:
argument "...input string here..." of jsonpath item method .%s() is out of range for type numeric
If we add the input string in the error, then for some errors, it will be too big, for example:
-ERROR: numeric argument of jsonpath item method .double() is out of range for type double precision
+ERROR: argument "10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000" of jsonpath item method .double() is out of range for type double precision
Also, for non-string input, we need to convert numeric to string just for the error message, which seems overkill.
On another note, irrespective of these changes, is it good to show the given input in the error messages? Error messages are logged and may leak some details.
I think the existing way seems ok.
---
NaN and Infinity restrictions:
I am not sure why NaN and Infinity are not allowed in conversion to double precision (.double() method). I have used the same restriction for .decimal() and .number(). However, as you said, we should have error messages more specific. I tried that in the attached patch; please have your views. I have the following wordings for that error message:
"NaN or Infinity is not allowed for jsonpath item method .%s()"
Suggestions...
Attachment
Thank you for the fix! At Tue, 30 Jan 2024 13:46:17 +0530, Jeevan Chalke <jeevan.chalke@enterprisedb.com> wrote in > On Mon, Jan 29, 2024 at 11:03 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > > Kyotaro Horiguchi <horikyota.ntt@gmail.com> writes: > > > Having said that, I'm a bit concerned about the case where an overly > > > long string is given there. However, considering that we already have > > > "invalid input syntax for type xxx: %x" messages (including for the > > > numeric), this concern might be unnecessary. > > > > Yeah, we have not worried about that in the past. > > > > > Another concern is that the input value is already a numeric, not a > > > string. This situation occurs when the input is NaN of +-Inf. Although > > > numeric_out could be used, it might cause another error. Therefore, > > > simply writing out as "argument NaN and Infinity.." would be better. > > > > Oh! I'd assumed that we were discussing a string that we'd failed to > > convert to numeric. If the input is already numeric, then either > > the error is unreachable or what we're really doing is rejecting > > special values such as NaN on policy grounds. I would ask first > > if that policy is sane at all. (I'd lean to "not" --- if we allow > > it in the input JSON, why not in the output?) If it is sane, the > > error message needs to be far more specific. > > > > regards, tom lane > > > > *Consistent error message related to type:* ... > What if we use phrases like "for type double precision" at both places, > like: > numeric argument of jsonpath item method .%s() is out of range for type > double precision > string argument of jsonpath item method .%s() is not a valid representation > for type double precision > > With this, the rest will be like: > for type numeric > for type bigint > for type integer > > Suggestions? FWIW, I prefer consistently using "for type hoge". > --- > > *Showing input string in the error message:* > > argument "...input string here..." of jsonpath item method .%s() is out of > range for type numeric > > If we add the input string in the error, then for some errors, it will be > too big, for example: > > -ERROR: numeric argument of jsonpath item method .double() is out of range > for type double precision > +ERROR: argument > "100000<many zeros follow>" > of jsonpath item method .double() is out of range for type double precision As Tom suggested, given that similar situations have already been disregarded elsewhere, worrying about excessively long input strings in this specific instance won't notably improve safety in total. > Also, for non-string input, we need to convert numeric to string just for > the error message, which seems overkill. As I suggested and you seem to agree, using literally "Nan or Infinity" would be sufficient. > On another note, irrespective of these changes, is it good to show the > given input in the error messages? Error messages are logged and may leak > some details. > > I think the existing way seems ok. In my opinion, it is quite common to include the error-causing value in error messages. Also, we have already many functions that impliy the possibility for revealing input values when converting text representation into internal format, such as with int4in. However, I don't stick to that way. > --- > > *NaN and Infinity restrictions:* > > I am not sure why NaN and Infinity are not allowed in conversion to double > precision (.double() method). I have used the same restriction for > .decimal() and .number(). However, as you said, we should have error > messages more specific. I tried that in the attached patch; please have > your views. I have the following wordings for that error message: > "NaN or Infinity is not allowed for jsonpath item method .%s()" > > Suggestions... They seem good to *me*. By the way, while playing with this feature, I noticed the following error message: > select jsonb_path_query('1.1' , '$.boolean()'); > ERROR: numeric argument of jsonpath item method .boolean() is out of range for type boolean The error message seems a bit off to me. For example, "argument '1.1' is invalid for type [bB]oolean" seems more appropriate for this specific issue. (I'm not ceratin about our policy on the spelling of Boolean..) regards. -- Kyotaro Horiguchi NTT Open Source Software Center
At Thu, 01 Feb 2024 10:49:57 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in > By the way, while playing with this feature, I noticed the following > error message: > > > select jsonb_path_query('1.1' , '$.boolean()'); > > ERROR: numeric argument of jsonpath item method .boolean() is out of range for type boolean > > The error message seems a bit off to me. For example, "argument '1.1' > is invalid for type [bB]oolean" seems more appropriate for this > specific issue. (I'm not ceratin about our policy on the spelling of > Boolean..) Or, following our general convention, it would be spelled as: 'invalid argument for type Boolean: "1.1"' regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Thank you for the fix!
At Tue, 30 Jan 2024 13:46:17 +0530, Jeevan Chalke <jeevan.chalke@enterprisedb.com> wrote in
> On Mon, Jan 29, 2024 at 11:03 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> > Kyotaro Horiguchi <horikyota.ntt@gmail.com> writes:
> > > Having said that, I'm a bit concerned about the case where an overly
> > > long string is given there. However, considering that we already have
> > > "invalid input syntax for type xxx: %x" messages (including for the
> > > numeric), this concern might be unnecessary.
> >
> > Yeah, we have not worried about that in the past.
> >
> > > Another concern is that the input value is already a numeric, not a
> > > string. This situation occurs when the input is NaN of +-Inf. Although
> > > numeric_out could be used, it might cause another error. Therefore,
> > > simply writing out as "argument NaN and Infinity.." would be better.
> >
> > Oh! I'd assumed that we were discussing a string that we'd failed to
> > convert to numeric. If the input is already numeric, then either
> > the error is unreachable or what we're really doing is rejecting
> > special values such as NaN on policy grounds. I would ask first
> > if that policy is sane at all. (I'd lean to "not" --- if we allow
> > it in the input JSON, why not in the output?) If it is sane, the
> > error message needs to be far more specific.
> >
> > regards, tom lane
> >
>
> *Consistent error message related to type:*
...
> What if we use phrases like "for type double precision" at both places,
> like:
> numeric argument of jsonpath item method .%s() is out of range for type
> double precision
> string argument of jsonpath item method .%s() is not a valid representation
> for type double precision
>
> With this, the rest will be like:
> for type numeric
> for type bigint
> for type integer
>
> Suggestions?
FWIW, I prefer consistently using "for type hoge".
> ---
>
> *Showing input string in the error message:*
>
> argument "...input string here..." of jsonpath item method .%s() is out of
> range for type numeric
>
> If we add the input string in the error, then for some errors, it will be
> too big, for example:
>
> -ERROR: numeric argument of jsonpath item method .double() is out of range
> for type double precision
> +ERROR: argument
> "100000<many zeros follow>"
> of jsonpath item method .double() is out of range for type double precision
As Tom suggested, given that similar situations have already been
disregarded elsewhere, worrying about excessively long input strings
in this specific instance won't notably improve safety in total.
> Also, for non-string input, we need to convert numeric to string just for
> the error message, which seems overkill.
As I suggested and you seem to agree, using literally "Nan or
Infinity" would be sufficient.
This is the case:
select jsonb_path_query('12345678901', '$.integer()');
> On another note, irrespective of these changes, is it good to show the
> given input in the error messages? Error messages are logged and may leak
> some details.
>
> I think the existing way seems ok.
In my opinion, it is quite common to include the error-causing value
in error messages. Also, we have already many functions that impliy
the possibility for revealing input values when converting text
representation into internal format, such as with int4in. However, I
don't stick to that way.
> ---
>
> *NaN and Infinity restrictions:*
>
> I am not sure why NaN and Infinity are not allowed in conversion to double
> precision (.double() method). I have used the same restriction for
> .decimal() and .number(). However, as you said, we should have error
> messages more specific. I tried that in the attached patch; please have
> your views. I have the following wordings for that error message:
> "NaN or Infinity is not allowed for jsonpath item method .%s()"
>
> Suggestions...
They seem good to *me*.
By the way, while playing with this feature, I noticed the following
error message:
> select jsonb_path_query('1.1' , '$.boolean()');
> ERROR: numeric argument of jsonpath item method .boolean() is out of range for type boolean
The error message seems a bit off to me. For example, "argument '1.1'
is invalid for type [bB]oolean" seems more appropriate for this
specific issue. (I'm not ceratin about our policy on the spelling of
Boolean..)
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
At Thu, 01 Feb 2024 10:49:57 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in
> By the way, while playing with this feature, I noticed the following
> error message:
>
> > select jsonb_path_query('1.1' , '$.boolean()');
> > ERROR: numeric argument of jsonpath item method .boolean() is out of range for type boolean
>
> The error message seems a bit off to me. For example, "argument '1.1'
> is invalid for type [bB]oolean" seems more appropriate for this
> specific issue. (I'm not ceratin about our policy on the spelling of
> Boolean..)
Or, following our general convention, it would be spelled as:
'invalid argument for type Boolean: "1.1"'
or, if we add input value, then
ERROR: argument "1.1" of jsonpath item method .boolean() is invalid for type boolean
And this should work for all the error types, like out of range, not valid, invalid input, etc, etc. Also, we don't need separate error messages for string input as well, which currently has the following form:
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
At Thu, 1 Feb 2024 09:19:40 +0530, Jeevan Chalke <jeevan.chalke@enterprisedb.com> wrote in > > As Tom suggested, given that similar situations have already been > > disregarded elsewhere, worrying about excessively long input strings > > in this specific instance won't notably improve safety in total. > > > > > Also, for non-string input, we need to convert numeric to string just for > > > the error message, which seems overkill. > > > > As I suggested and you seem to agree, using literally "Nan or > > Infinity" would be sufficient. > > > > I am more concerned about .bigint() and .integer(). We can have errors when > the numeric input is out of range, but not NaN or Infinity. At those > places, we need to convert numeric to string to put that value into the > error. > Do you mean we should still put "Nan or Infinity" there? > > This is the case: > select jsonb_path_query('12345678901', '$.integer()'); > ERROR: numeric argument of jsonpath item method .integer() is out of > range for type integer Ah.. Understood. "NaN or Infinity" cannot be used in those cases. Additionally, for jpiBoolean and jpiBigint, we lack the text representation of the value. By a quick grepping, I found that the following functions call numeric_out to convert the jbvNumeric values back into text representation. JsonbValueAstext, populate_scalar, iterate_jsonb_values, executeItemOptUnrwapTarget, jsonb_put_escaped_value The function iterate_jsonb_values(), in particular, iterates over a values array, calling numeric_out on each iteration. The following functions re-converts the converted numeric into another type. jsonb_int[248]() converts the numeric value into int2 using numeric_int[248](). jsonb_float[48]() converts it into float4 using numeric_float[48](). Given these facts, it seems more efficient for jbvNumber to retain the original scalar value, converting it only when necessary. If needed, we could also add a numeric struct member as a cache for better performance. I'm not sure we refer the values more than once, though. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
At Thu, 1 Feb 2024 09:22:22 +0530, Jeevan Chalke <jeevan.chalke@enterprisedb.com> wrote in > On Thu, Feb 1, 2024 at 7:24 AM Kyotaro Horiguchi <horikyota.ntt@gmail.com> > wrote: > > > At Thu, 01 Feb 2024 10:49:57 +0900 (JST), Kyotaro Horiguchi < > > horikyota.ntt@gmail.com> wrote in > > > By the way, while playing with this feature, I noticed the following > > > error message: > > > > > > > select jsonb_path_query('1.1' , '$.boolean()'); > > > > ERROR: numeric argument of jsonpath item method .boolean() is out of > > range for type boolean > > > > > > The error message seems a bit off to me. For example, "argument '1.1' > > > is invalid for type [bB]oolean" seems more appropriate for this > > > specific issue. (I'm not ceratin about our policy on the spelling of > > > Boolean..) > > > > Or, following our general convention, it would be spelled as: > > > > 'invalid argument for type Boolean: "1.1"' > > > > jsonpath way: Hmm. I see. > ERROR: argument of jsonpath item method .boolean() is invalid for type > boolean > > or, if we add input value, then > > ERROR: argument "1.1" of jsonpath item method .boolean() is invalid for > type boolean > > And this should work for all the error types, like out of range, not valid, > invalid input, etc, etc. Also, we don't need separate error messages for > string input as well, which currently has the following form: > > "string argument of jsonpath item method .%s() is not a valid > representation.." Agreed. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Sorry for a minor correction, but.. At Thu, 01 Feb 2024 14:53:57 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in > Ah.. Understood. "NaN or Infinity" cannot be used in those > cases. Additionally, for jpiBoolean and jpiBigint, we lack the text > representation of the value. This "Additionally" was merely left in by mistake. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
At Thu, 1 Feb 2024 09:22:22 +0530, Jeevan Chalke <jeevan.chalke@enterprisedb.com> wrote in
> On Thu, Feb 1, 2024 at 7:24 AM Kyotaro Horiguchi <horikyota.ntt@gmail.com>
> wrote:
>
> > At Thu, 01 Feb 2024 10:49:57 +0900 (JST), Kyotaro Horiguchi <
> > horikyota.ntt@gmail.com> wrote in
> > > By the way, while playing with this feature, I noticed the following
> > > error message:
> > >
> > > > select jsonb_path_query('1.1' , '$.boolean()');
> > > > ERROR: numeric argument of jsonpath item method .boolean() is out of
> > range for type boolean
> > >
> > > The error message seems a bit off to me. For example, "argument '1.1'
> > > is invalid for type [bB]oolean" seems more appropriate for this
> > > specific issue. (I'm not ceratin about our policy on the spelling of
> > > Boolean..)
> >
> > Or, following our general convention, it would be spelled as:
> >
> > 'invalid argument for type Boolean: "1.1"'
> >
>
> jsonpath way:
Hmm. I see.
> ERROR: argument of jsonpath item method .boolean() is invalid for type
> boolean
>
> or, if we add input value, then
>
> ERROR: argument "1.1" of jsonpath item method .boolean() is invalid for
> type boolean
>
> And this should work for all the error types, like out of range, not valid,
> invalid input, etc, etc. Also, we don't need separate error messages for
> string input as well, which currently has the following form:
>
> "string argument of jsonpath item method .%s() is not a valid
> representation.."
Agreed.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
Attachment
On Thu, Feb 1, 2024 at 11:25 AM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:At Thu, 1 Feb 2024 09:22:22 +0530, Jeevan Chalke <jeevan.chalke@enterprisedb.com> wrote in
> On Thu, Feb 1, 2024 at 7:24 AM Kyotaro Horiguchi <horikyota.ntt@gmail.com>
> wrote:
>
> > At Thu, 01 Feb 2024 10:49:57 +0900 (JST), Kyotaro Horiguchi <
> > horikyota.ntt@gmail.com> wrote in
> > > By the way, while playing with this feature, I noticed the following
> > > error message:
> > >
> > > > select jsonb_path_query('1.1' , '$.boolean()');
> > > > ERROR: numeric argument of jsonpath item method .boolean() is out of
> > range for type boolean
> > >
> > > The error message seems a bit off to me. For example, "argument '1.1'
> > > is invalid for type [bB]oolean" seems more appropriate for this
> > > specific issue. (I'm not ceratin about our policy on the spelling of
> > > Boolean..)
> >
> > Or, following our general convention, it would be spelled as:
> >
> > 'invalid argument for type Boolean: "1.1"'
> >
>
> jsonpath way:
Hmm. I see.
> ERROR: argument of jsonpath item method .boolean() is invalid for type
> boolean
>
> or, if we add input value, then
>
> ERROR: argument "1.1" of jsonpath item method .boolean() is invalid for
> type boolean
>
> And this should work for all the error types, like out of range, not valid,
> invalid input, etc, etc. Also, we don't need separate error messages for
> string input as well, which currently has the following form:
>
> "string argument of jsonpath item method .%s() is not a valid
> representation.."
Agreed.Attached are patches based on the discussion.
Thanks, I combined these and pushed the result.
cheers
andrew
-- Andrew Dunstan EDB: https://www.enterprisedb.com
On 2024-02-02 Fr 00:31, Jeevan Chalke wrote:On Thu, Feb 1, 2024 at 11:25 AM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:At Thu, 1 Feb 2024 09:22:22 +0530, Jeevan Chalke <jeevan.chalke@enterprisedb.com> wrote in
> On Thu, Feb 1, 2024 at 7:24 AM Kyotaro Horiguchi <horikyota.ntt@gmail.com>
> wrote:
>
> > At Thu, 01 Feb 2024 10:49:57 +0900 (JST), Kyotaro Horiguchi <
> > horikyota.ntt@gmail.com> wrote in
> > > By the way, while playing with this feature, I noticed the following
> > > error message:
> > >
> > > > select jsonb_path_query('1.1' , '$.boolean()');
> > > > ERROR: numeric argument of jsonpath item method .boolean() is out of
> > range for type boolean
> > >
> > > The error message seems a bit off to me. For example, "argument '1.1'
> > > is invalid for type [bB]oolean" seems more appropriate for this
> > > specific issue. (I'm not ceratin about our policy on the spelling of
> > > Boolean..)
> >
> > Or, following our general convention, it would be spelled as:
> >
> > 'invalid argument for type Boolean: "1.1"'
> >
>
> jsonpath way:
Hmm. I see.
> ERROR: argument of jsonpath item method .boolean() is invalid for type
> boolean
>
> or, if we add input value, then
>
> ERROR: argument "1.1" of jsonpath item method .boolean() is invalid for
> type boolean
>
> And this should work for all the error types, like out of range, not valid,
> invalid input, etc, etc. Also, we don't need separate error messages for
> string input as well, which currently has the following form:
>
> "string argument of jsonpath item method .%s() is not a valid
> representation.."
Agreed.Attached are patches based on the discussion.
Thanks, I combined these and pushed the result.