Thread: jsonpath

jsonpath

From
Andrew Dunstan
Date:
Next up in the proposed SQL/JSON feature set I will review the three
jsonpath patches. These are attached, extracted from the patch set
Nikita posted on Jan 2nd.


Note that they depend on the TZH/TZM patch (see separate email thread).
I'd like to be able to commit that patch pretty soon.


cheers


andrew


-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Attachment

Re: jsonpath

From
Pavel Stehule
Date:


2018-01-07 20:31 GMT+01:00 Andrew Dunstan <andrew.dunstan@2ndquadrant.com>:

Next up in the proposed SQL/JSON feature set I will review the three
jsonpath patches. These are attached, extracted from the patch set
Nikita posted on Jan 2nd.


Note that they depend on the TZH/TZM patch (see separate email thread).
I'd like to be able to commit that patch pretty soon.

I did few tests - and it is working very well.

regards

Pavel



cheers


andrew


--
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: jsonpath

From
Andrew Dunstan
Date:

On 01/07/2018 03:00 PM, Pavel Stehule wrote:
>
>
> 2018-01-07 20:31 GMT+01:00 Andrew Dunstan
> <andrew.dunstan@2ndquadrant.com <mailto:andrew.dunstan@2ndquadrant.com>>:
>
>
>     Next up in the proposed SQL/JSON feature set I will review the three
>     jsonpath patches. These are attached, extracted from the patch set
>     Nikita posted on Jan 2nd.
>
>
>     Note that they depend on the TZH/TZM patch (see separate email
>     thread).
>     I'd like to be able to commit that patch pretty soon.
>
>
> I did few tests - and it is working very well.
>
>


You mean on the revised TZH/TZM patch that I posted?

cheers

andrew

-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: jsonpath

From
Pavel Stehule
Date:


2018-01-07 21:14 GMT+01:00 Andrew Dunstan <andrew.dunstan@2ndquadrant.com>:


On 01/07/2018 03:00 PM, Pavel Stehule wrote:
>
>
> 2018-01-07 20:31 GMT+01:00 Andrew Dunstan
> <andrew.dunstan@2ndquadrant.com <mailto:andrew.dunstan@2ndquadrant.com>>:
>
>
>     Next up in the proposed SQL/JSON feature set I will review the three
>     jsonpath patches. These are attached, extracted from the patch set
>     Nikita posted on Jan 2nd.
>
>
>     Note that they depend on the TZH/TZM patch (see separate email
>     thread).
>     I'd like to be able to commit that patch pretty soon.
>
>
> I did few tests - and it is working very well.
>
>


You mean on the revised TZH/TZM patch that I posted?

no -  I tested jsonpath implementation

Pavel

cheers

andrew

--
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: jsonpath

From
Andrew Dunstan
Date:

On 01/07/2018 03:16 PM, Pavel Stehule wrote:
>
>
> 2018-01-07 21:14 GMT+01:00 Andrew Dunstan
> <andrew.dunstan@2ndquadrant.com <mailto:andrew.dunstan@2ndquadrant.com>>:
>
>
>
>     On 01/07/2018 03:00 PM, Pavel Stehule wrote:
>     >
>     >
>     > 2018-01-07 20:31 GMT+01:00 Andrew Dunstan
>     > <andrew.dunstan@2ndquadrant.com
>     <mailto:andrew.dunstan@2ndquadrant.com>
>     <mailto:andrew.dunstan@2ndquadrant.com
>     <mailto:andrew.dunstan@2ndquadrant.com>>>:
>     >
>     >
>     >     Next up in the proposed SQL/JSON feature set I will review
>     the three
>     >     jsonpath patches. These are attached, extracted from the
>     patch set
>     >     Nikita posted on Jan 2nd.
>     >
>     >
>     >     Note that they depend on the TZH/TZM patch (see separate email
>     >     thread).
>     >     I'd like to be able to commit that patch pretty soon.
>     >
>     >
>     > I did few tests - and it is working very well.
>     >
>     >
>
>
>     You mean on the revised TZH/TZM patch that I posted?
>
>
> no -  I tested jsonpath implementation
>
>


OK. You can help by also reviewing and testing the small patch at
<https://www.postgresql.org/message-id/f16a6408-45fe-b299-02b0-41e50a1c3023%402ndQuadrant.com>

thanks

cheers

andrew

-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: jsonpath

From
Pavel Stehule
Date:


2018-01-07 21:20 GMT+01:00 Andrew Dunstan <andrew.dunstan@2ndquadrant.com>:


On 01/07/2018 03:16 PM, Pavel Stehule wrote:
>
>
> 2018-01-07 21:14 GMT+01:00 Andrew Dunstan
> <andrew.dunstan@2ndquadrant.com <mailto:andrew.dunstan@2ndquadrant.com>>:
>
>
>
>     On 01/07/2018 03:00 PM, Pavel Stehule wrote:
>     >
>     >
>     > 2018-01-07 20:31 GMT+01:00 Andrew Dunstan
>     > <andrew.dunstan@2ndquadrant.com
>     <mailto:andrew.dunstan@2ndquadrant.com>
>     <mailto:andrew.dunstan@2ndquadrant.com
>     <mailto:andrew.dunstan@2ndquadrant.com>>>:
>     >
>     >
>     >     Next up in the proposed SQL/JSON feature set I will review
>     the three
>     >     jsonpath patches. These are attached, extracted from the
>     patch set
>     >     Nikita posted on Jan 2nd.
>     >
>     >
>     >     Note that they depend on the TZH/TZM patch (see separate email
>     >     thread).
>     >     I'd like to be able to commit that patch pretty soon.
>     >
>     >
>     > I did few tests - and it is working very well.
>     >
>     >
>
>
>     You mean on the revised TZH/TZM patch that I posted?
>
>
> no -  I tested jsonpath implementation
>
>


OK. You can help by also reviewing and testing the small patch at
<https://www.postgresql.org/message-id/f16a6408-45fe-b299-02b0-41e50a1c3023%402ndQuadrant.com>

I'll look there

Regards

Pavel


thanks

cheers

andrew

--
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: jsonpath

From
Nikita Glukhov
Date:
Attached new 8th version of jsonpath related patches. Complete 
documentation is still missing.

The first 4 small patches are necessary datetime handling in jsonpath:
1. simple refactoring, extracted function that will be used later in 
jsonpath
2. throw an error when the input or format string contains trailing elements
3. avoid unnecessary cstring to text conversions
4. add function for automatic datetime type recognition by the presence 
of formatting components

Should they be posted in a separate thread?

-- 
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment

Re: jsonpath

From
Andrew Dunstan
Date:

On 01/10/2018 05:42 PM, Nikita Glukhov wrote:
> Attached new 8th version of jsonpath related patches. Complete
> documentation is still missing.
>
> The first 4 small patches are necessary datetime handling in jsonpath:
> 1. simple refactoring, extracted function that will be used later in
> jsonpath
> 2. throw an error when the input or format string contains trailing
> elements
> 3. avoid unnecessary cstring to text conversions
> 4. add function for automatic datetime type recognition by the
> presence of formatting components
>
> Should they be posted in a separate thread?
>


The first of these refactors the json/jsonb timestamp formatting into a
single function, removing a lot of code duplication. The involves
exposing time2tm() and timetz2tm(). I don't think that's a tragedy, so
unless there is any objection I propose to commit it shortly.

The next three expose a bit more of the date/time API. I'm still
reviewing those.

cheers

andrew

-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: jsonpath

From
Andrew Dunstan
Date:

On 01/15/2018 07:24 PM, Andrew Dunstan wrote:
>
> On 01/10/2018 05:42 PM, Nikita Glukhov wrote:
>> Attached new 8th version of jsonpath related patches. Complete
>> documentation is still missing.
>>
>> The first 4 small patches are necessary datetime handling in jsonpath:
>> 1. simple refactoring, extracted function that will be used later in
>> jsonpath
>> 2. throw an error when the input or format string contains trailing
>> elements
>> 3. avoid unnecessary cstring to text conversions
>> 4. add function for automatic datetime type recognition by the
>> presence of formatting components
>>
>> Should they be posted in a separate thread?
>>
>
> The first of these refactors the json/jsonb timestamp formatting into a
> single function, removing a lot of code duplication. The involves
> exposing time2tm() and timetz2tm(). I don't think that's a tragedy, so
> unless there is any objection I propose to commit it shortly.
>
> The next three expose a bit more of the date/time API. I'm still
> reviewing those.
>


I have committed the first of these patches.

I have reviewed the next three, and I think they are generally good.
There is no real point in committing them ahead of the jsonpath patch
since there would be no point in having them at all but for that patch.

Note that these do export the following hitherto internal bits of the
datetime functionality:

    tm2time
    tm2timetz
    AdjustTimeForTypmod
    AdjustTimestampForTypmod

Moving on to review the main jsonpath patch.

cheers

andrew

-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: jsonpath

From
Andrew Dunstan
Date:

On 01/17/2018 04:01 PM, Andrew Dunstan wrote:
>
> On 01/15/2018 07:24 PM, Andrew Dunstan wrote:
>> On 01/10/2018 05:42 PM, Nikita Glukhov wrote:
>>> Attached new 8th version of jsonpath related patches. Complete
>>> documentation is still missing.
>>>
>>> The first 4 small patches are necessary datetime handling in jsonpath:
>>> 1. simple refactoring, extracted function that will be used later in
>>> jsonpath
>>> 2. throw an error when the input or format string contains trailing
>>> elements
>>> 3. avoid unnecessary cstring to text conversions
>>> 4. add function for automatic datetime type recognition by the
>>> presence of formatting components
>>>
>>> Should they be posted in a separate thread?
>>>
>> The first of these refactors the json/jsonb timestamp formatting into a
>> single function, removing a lot of code duplication. The involves
>> exposing time2tm() and timetz2tm(). I don't think that's a tragedy, so
>> unless there is any objection I propose to commit it shortly.
>>
>> The next three expose a bit more of the date/time API. I'm still
>> reviewing those.
>>
>
> I have committed the first of these patches.
>
> I have reviewed the next three, and I think they are generally good.
> There is no real point in committing them ahead of the jsonpath patch
> since there would be no point in having them at all but for that patch.
>
> Note that these do export the following hitherto internal bits of the
> datetime functionality:
>
>     tm2time
>     tm2timetz
>     AdjustTimeForTypmod
>     AdjustTimestampForTypmod
>
> Moving on to review the main jsonpath patch.
>




OK, I understand a good deal more than I did about how the jsopnpath
code works, but the commenting is abysmal.

I got quite nervous about adding a new datetime variant to JsonbValue.
However, my understanding from the code is that this will only ever be
used in an intermediate jsonpath processing result, and it won't be used
in a stored or build jsonb object. Is that right? If it is we need to
say so, and moreover we need to warn coders in the header file about the
restricted use of this variant.  I'm not sure we can enforce it with an
Assert, but If we can we should. I'm not 100% sure that doing it this
way, i.e. by extending and resuing jsonbValue, is desirable, I'd like to
know some of the thinking behind the design.

The encoding of a jsonpath value into a binary string is quite nifty,
but it needs to be documented. Once I understood it better some of my
fears about parser overhead were alleviated.

The use of a function pointer inside JsonPathVariable seems unnecessary
and baroque. It would be much more straightforward to set an isnull flag
in the struct and process the value in an "if" statement accordingly,
avoiding the function pointer altogether.

I am going to be travelling for a few days, then back online for a day
or two, then gone for a week. I hope we can make progress on these
features, but this needs lots more eyeballs, reviewing code as well as
testing, and lots more responsiveness. The whole suite is enormous.

cheers

andrew




-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: jsonpath

From
Nikita Glukhov
Date:
On 23.01.2018 01:41, Andrew Dunstan wrote:

> On 01/17/2018 04:01 PM, Andrew Dunstan wrote:
>> On 01/15/2018 07:24 PM, Andrew Dunstan wrote:
>>> On 01/10/2018 05:42 PM, Nikita Glukhov wrote:
>>>> Attached new 8th version of jsonpath related patches. Complete
>>>> documentation is still missing.
>>>>
>>>> The first 4 small patches are necessary datetime handling in jsonpath:
>>>> 1. simple refactoring, extracted function that will be used later in
>>>> jsonpath
>>>> 2. throw an error when the input or format string contains trailing
>>>> elements
>>>> 3. avoid unnecessary cstring to text conversions
>>>> 4. add function for automatic datetime type recognition by the
>>>> presence of formatting components
>>>>
>>>> Should they be posted in a separate thread?
>>>>
>>> The first of these refactors the json/jsonb timestamp formatting into a
>>> single function, removing a lot of code duplication. The involves
>>> exposing time2tm() and timetz2tm(). I don't think that's a tragedy, so
>>> unless there is any objection I propose to commit it shortly.
>>>
>>> The next three expose a bit more of the date/time API. I'm still
>>> reviewing those.
>> I have committed the first of these patches.
>>
>> I have reviewed the next three, and I think they are generally good.
>> There is no real point in committing them ahead of the jsonpath patch
>> since there would be no point in having them at all but for that patch.
>>
>> Note that these do export the following hitherto internal bits of the
>> datetime functionality:
>>
>>      tm2time
>>      tm2timetz
>>      AdjustTimeForTypmod
>>      AdjustTimestampForTypmod
>>
>> Moving on to review the main jsonpath patch.
>
> OK, I understand a good deal more than I did about how the jsopnpath
> code works, but the commenting is abysmal.
Thank you for reviewing.

> I got quite nervous about adding a new datetime variant to JsonbValue.
> However, my understanding from the code is that this will only ever be
> used in an intermediate jsonpath processing result, and it won't be used
> in a stored or build jsonb object. Is that right?
Yes, you are right. Datetime JsonbValues are used only for for in-memory 
representation of SQL/JSON datetime items, they converted into ordinary 
JSON strings in ISO format when json/jsonb encoded into a datum.

> If it is we need to say so, and moreover we need to warn coders in the
> header file about the restricted use of this variant.  I'm not sure we
> can enforce it with an Assert, but If we can we should. I'm not 100%
> sure that doing it this way, i.e. by extending and resuing jsonbValue,
> is desirable, I'd like to know some of the thinking behind the design.
Datetime support was added to our jsonpath implementation when there was 
already a lot of code using plain JsonbValue. So, the simplest are most 
effective solution I found was JsonbValue extension. We could also 
introduce extended struct SqlJsonItem, but it seems that there will be a 
lot of unnecessary conversions between SqlJsonItem and JsonbValue.

> The encoding of a jsonpath value into a binary string is quite nifty,
> but it needs to be documented. Once I understood it better some of my
> fears about parser overhead were alleviated.
> The use of a function pointer inside JsonPathVariable seems unnecessary
> and baroque. It would be much more straightforward to set an isnull flag
> in the struct and process the value in an "if" statement accordingly,
> avoiding the function pointer altogether.
Callback in JsonPathVariable is used for on-demand evaluation of 
SQL/JSON PASSING parameters (see EvalJsonPathVar() from patch 
0010-sqljson-v08.patch). For jsonpath itself it is really unnecessary.

> I am going to be travelling for a few days, then back online for a day
> or two, then gone for a week. I hope we can make progress on these
> features, but this needs lots more eyeballs, reviewing code as well as
> testing, and lots more responsiveness. The whole suite is enormous.
>
> cheers
>
> andrew

-- 
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: jsonpath

From
Nikita Glukhov
Date:
Attached 9th version of jsonpath patches rebased onto the latest master.

Jsonpath grammar for parenthesized expressions and predicates was fixed.

Documentation drafts for jsonpath written by Oleg Bartunov:
https://github.com/obartunov/sqljsondoc

-- 
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment

Re: jsonpath

From
Oleg Bartunov
Date:
On Wed, Feb 14, 2018 at 2:04 AM, Nikita Glukhov <n.gluhov@postgrespro.ru> wrote:
> Attached 9th version of jsonpath patches rebased onto the latest master.
>
> Jsonpath grammar for parenthesized expressions and predicates was fixed.
>
> Documentation drafts for jsonpath written by Oleg Bartunov:
> https://github.com/obartunov/sqljsondoc

Direct link is https://github.com/obartunov/sqljsondoc/blob/master/README.jsonpath.md
Please consider it as WIP, I will update it in my spare time.

>
>
> --
> Nikita Glukhov
> Postgres Professional: http://www.postgrespro.com
> The Russian Postgres Company


Re: jsonpath

From
Nikita Glukhov
Date:
Attached 10th version of the jsonpath patches.

1. Fixed error handling in arithmetic operators.

    Now run-time errors in arithmetic operators are catched (added
    PG_TRY/PG_CATCH around operator's functions calls) and converted into
    Unknown values in predicates as it is required by the standard:

    =# SELECT jsonb '[1,0,2]' @* '$[*] ? (1 / @ > 0)';
     ?column?
    ----------
     1
     2
    (2 rows)

2. Fixed grammar for parenthesized expressions.

3. Refactored GIN support for jsonpath operators.

4. Added one more operator json[b] @# jsonpath returning singleton json[b] with
    automatic conditional wrapping of sequences with more than one element into
    arrays:

    =# SELECT jsonb '[1,2,3,4,5]' @# '$[*] ? (@ > 2)';
     ?column?
    -----------
     [3, 4, 5]
    (1 row)

    =# SELECT jsonb '[1,2,3,4,5]' @# '$[*] ? (@ > 4)';
     ?column?
    ----------
     5
    (1 row)

    =# SELECT jsonb '[1,2,3,4,5]' @# '$[*] ? (@ > 5)';
     ?column?
    ----------
     (null)
    (1 row)

    Existing set-returning operator json[b] @* jsonpath is also very userful but
    can't be used in functional indices like new operator @#.

    Note that conditional wrapping of @# differs from the wrapping in
    JSON_QUERY(... WITH [ARRAY] WRAPPER), where only singleton objects and
    arrays are not wrapped.  Unconditional wrapping can be emulated with our
    array construction feature (see below).

5. Changed time zone behavior in .datetime() item method.

    In the previous version of the patch timestamptz SQL/JSON items were
    serialized into JSON string items using session time zone.  This behavior
    did not allow jsonpath operators to be marked as immutable, and therefore
    they could not be used in functional indices.  Also, when the time zone was
    not specified in the input string, but TZM or TZH format fields were present
    in the format string, session time zone was used as a default for
    timestamptz items.

    To make jsonpath operators immutable we decided to save input time zone for
    timestamptz items and disallow automatic time zone assignment.  Also
    additional parameter was added to .datetime() for default time zone
    specification:

    =# SET timezone = '+03';
    SET

    =# SELECT jsonb '"10-03-2017 12:34:56"' @*
                    '$.datetime("DD-MM-YYYY HH24:MI:SS TZH")';
    ERROR:  Invalid argument for SQL/JSON datetime function

    =# SELECT jsonb '"10-03-2017 12:34:56"' @*
                    '$.datetime("DD-MM-YYYY HH24:MI:SS TZH", "+05")';
              ?column?
    -----------------------------
     "2017-03-10T12:34:56+05:00"
    (1 row)

    =# SELECT jsonb '"10-03-2017 12:34:56 +05"' @*
                    '$.datetime("DD-MM-YYYY HH24:MI:SS TZH")';
              ?column?
    -----------------------------
     "2017-03-10T12:34:56+05:00"
    (1 row)

    Please note that our .datetime() behavior is not standard now: by the
    standard, input and format strings must match exactly, i.e. they both should
    not contain trailing unmatched elements, so automatic time zone assignment
    is impossible.  But it too restrictive for PostgreSQL users, so we decided
    to preserve usual PostgreSQL behavior here:

    =# SELECT jsonb '"10-03-2017"' @* '$.datetime("DD-MM-YYYY HH24:MI:SS")';
           ?column?
    -----------------------
     "2017-03-10T00:00:00"
    (1 row)


    Also PostgreSQL is able to automatically recognize format of the input
    string for the specified datetime type, but we can only bring this behavior
    into jsonpath by introducing separate item methods .date(), .time(),
    .timetz(), .timestamp() and .timestamptz().   Also we can use here our
    unfinished feature that gives us ability to work with PostresSQL types in
    jsonpath using cast operator :: (see sqljson_ext branch in our github repo):

    =# SELECT jsonb '"10/03/2017 12:34"' @* '$::timestamptz';
              ?column?
    -----------------------------
     "2017-03-10T12:34:00+03:00"
    (1 row)



A brief description of the extra jsonpath syntax features contained in the
patch #7:

   * Sequence construction by joining path expressions with comma:

     =# SELECT jsonb '[1, 2, 3]' @* '$[*], 4, 5';
      ?column?
     ----------
      1
      2
      3
      4
      5
     (5 rows)

   * Array construction by placing sequence into brackets (equivalent to
     JSON_QUERY(... WITH UNCONDITIONAL WRAPPER)):

     =# SELECT jsonb '[1, 2, 3]' @* '[$[*], 4, 5]';
         ?column?
     -----------------
      [1, 2, 3, 4, 5]
     (1 row)

   * Object construction by placing sequences of key-value pairs into braces:

     =# SELECT jsonb '{"a" : [1, 2, 3]}' @* '{a: [$.a[*], 4, 5], "b c": "dddd"}';
                    ?column?
     ---------------------------------------
      {"a": [1, 2, 3, 4, 5], "b c": "dddd"}
     (1 row)

   * Object subscripting with string-valued expressions:

     =# SELECT jsonb '{"a" : "aaa", "b": "a", "c": "ccc"}' @* '$[$.b, "c"]';
      ?column?
     ----------
      "aaa"
      "ccc"
     (2 rows)

   * Support of UNIX epoch time in .datetime() item method:

     =# SELECT jsonb '1519649957.37' @* '$.datetime()';
                 ?column?
     --------------------------------
      "2018-02-26T12:59:17.37+00:00"
     (1 row)

--
Nikita Glukhov
Postgres Professional:http://www.postgrespro.com
The Russian Postgres Company


Attachment

Re: jsonpath

From
Oleg Bartunov
Date:
On Mon, Feb 26, 2018 at 6:34 PM, Nikita Glukhov <n.gluhov@postgrespro.ru> wrote:
> Attached 10th version of the jsonpath patches.
>
> 1. Fixed error handling in arithmetic operators.
>
>    Now run-time errors in arithmetic operators are catched (added
>    PG_TRY/PG_CATCH around operator's functions calls) and converted into
>    Unknown values in predicates as it is required by the standard:
>
>    =# SELECT jsonb '[1,0,2]' @* '$[*] ? (1 / @ > 0)';
>     ?column?
>    ----------
>     1
>     2
>    (2 rows)
>
> 2. Fixed grammar for parenthesized expressions.
>
> 3. Refactored GIN support for jsonpath operators.
>
> 4. Added one more operator json[b] @# jsonpath returning singleton json[b]
> with
>    automatic conditional wrapping of sequences with more than one element
> into
>    arrays:
>
>    =# SELECT jsonb '[1,2,3,4,5]' @# '$[*] ? (@ > 2)';
>     ?column?
>    -----------
>     [3, 4, 5]
>    (1 row)
>
>    =# SELECT jsonb '[1,2,3,4,5]' @# '$[*] ? (@ > 4)';
>     ?column?
>    ----------
>     5
>    (1 row)
>
>    =# SELECT jsonb '[1,2,3,4,5]' @# '$[*] ? (@ > 5)';
>     ?column?
>    ----------
>     (null)
>    (1 row)
>
>    Existing set-returning operator json[b] @* jsonpath is also very userful
> but
>    can't be used in functional indices like new operator @#.
>
>    Note that conditional wrapping of @# differs from the wrapping in
>    JSON_QUERY(... WITH [ARRAY] WRAPPER), where only singleton objects and
>    arrays are not wrapped.  Unconditional wrapping can be emulated with our
>    array construction feature (see below).
>
> 5. Changed time zone behavior in .datetime() item method.
>
>    In the previous version of the patch timestamptz SQL/JSON items were
>    serialized into JSON string items using session time zone.  This behavior
>    did not allow jsonpath operators to be marked as immutable, and therefore
>    they could not be used in functional indices.  Also, when the time zone
> was
>    not specified in the input string, but TZM or TZH format fields were
> present
>    in the format string, session time zone was used as a default for
>    timestamptz items.
>
>    To make jsonpath operators immutable we decided to save input time zone
> for
>    timestamptz items and disallow automatic time zone assignment.  Also
>    additional parameter was added to .datetime() for default time zone
>    specification:
>
>    =# SET timezone = '+03';
>    SET
>
>    =# SELECT jsonb '"10-03-2017 12:34:56"' @*
>                    '$.datetime("DD-MM-YYYY HH24:MI:SS TZH")';
>    ERROR:  Invalid argument for SQL/JSON datetime function
>
>    =# SELECT jsonb '"10-03-2017 12:34:56"' @*
>                    '$.datetime("DD-MM-YYYY HH24:MI:SS TZH", "+05")';
>              ?column?
>    -----------------------------
>     "2017-03-10T12:34:56+05:00"
>    (1 row)
>
>    =# SELECT jsonb '"10-03-2017 12:34:56 +05"' @*
>                    '$.datetime("DD-MM-YYYY HH24:MI:SS TZH")';
>              ?column?
>    -----------------------------
>     "2017-03-10T12:34:56+05:00"
>    (1 row)
>
>    Please note that our .datetime() behavior is not standard now: by the
>    standard, input and format strings must match exactly, i.e. they both
> should
>    not contain trailing unmatched elements, so automatic time zone
> assignment
>    is impossible.  But it too restrictive for PostgreSQL users, so we
> decided
>    to preserve usual PostgreSQL behavior here:
>
>    =# SELECT jsonb '"10-03-2017"' @* '$.datetime("DD-MM-YYYY HH24:MI:SS")';
>           ?column?
>    -----------------------
>     "2017-03-10T00:00:00"
>    (1 row)

I think someday we should consider adding support for sql standard
conforming datetime. Since it
breaks postgres behaviour we will need 'standard_conforming_datetime' guc.


>
>
>    Also PostgreSQL is able to automatically recognize format of the input
>    string for the specified datetime type, but we can only bring this
> behavior
>    into jsonpath by introducing separate item methods .date(), .time(),
>    .timetz(), .timestamp() and .timestamptz().   Also we can use here our
>    unfinished feature that gives us ability to work with PostresSQL types in
>    jsonpath using cast operator :: (see sqljson_ext branch in our github
> repo):
>
>    =# SELECT jsonb '"10/03/2017 12:34"' @* '$::timestamptz';
>              ?column?
>    -----------------------------
>     "2017-03-10T12:34:00+03:00"
>    (1 row)

Another note.
We decided to preserve TZ in JSON_QUERY function and follow standard
Postgres behaviour  in JSON_VALUE,
since JSON_QUERY returns JSON object and JSON_VALUE returns SQL value.

SELECT JSON_QUERY(jsonb '"2018-02-21 17:01:23 +05"',
'$.datetime("YYYY-MM-DD HH24:MI:SS TZH")');
         json_query
-----------------------------
 "2018-02-21T17:01:23+05:00"
(1 row)

show timezone;
 TimeZone
----------
 W-SU
(1 row)

SELECT JSON_VALUE(jsonb '"2018-02-21 17:01:23 +05"',
'$.datetime("YYYY-MM-DD HH24:MI:SS TZH")');
       json_value
------------------------
 2018-02-21 15:01:23+03
(1 row)
>
>
>
> A brief description of the extra jsonpath syntax features contained in the
> patch #7:
>
>   * Sequence construction by joining path expressions with comma:
>
>     =# SELECT jsonb '[1, 2, 3]' @* '$[*], 4, 5';
>      ?column?
>     ----------
>      1
>      2
>      3
>      4
>      5
>     (5 rows)
>
>   * Array construction by placing sequence into brackets (equivalent to
>     JSON_QUERY(... WITH UNCONDITIONAL WRAPPER)):
>
>     =# SELECT jsonb '[1, 2, 3]' @* '[$[*], 4, 5]';
>         ?column?
>     -----------------
>      [1, 2, 3, 4, 5]
>     (1 row)
>
>   * Object construction by placing sequences of key-value pairs into braces:
>
>     =# SELECT jsonb '{"a" : [1, 2, 3]}' @* '{a: [$.a[*], 4, 5], "b c":
> "dddd"}';
>                    ?column?
>     ---------------------------------------
>      {"a": [1, 2, 3, 4, 5], "b c": "dddd"}
>     (1 row)
>
>   * Object subscripting with string-valued expressions:
>
>     =# SELECT jsonb '{"a" : "aaa", "b": "a", "c": "ccc"}' @* '$[$.b, "c"]';
>      ?column?
>     ----------
>      "aaa"
>      "ccc"
>     (2 rows)
>
>   * Support of UNIX epoch time in .datetime() item method:
>
>     =# SELECT jsonb '1519649957.37' @* '$.datetime()';
>                 ?column?
>     --------------------------------
>      "2018-02-26T12:59:17.37+00:00"
>     (1 row)
>

Documentation in user-friendly format (it will be convered to xml, of
course) is available
https://github.com/obartunov/sqljsondoc/blob/master/README.jsonpath.md
We are permanently working on it.



>
> --
> Nikita Glukhov
> Postgres Professional:http://www.postgrespro.com
> The Russian Postgres Company
>


Re: jsonpath

From
Robert Haas
Date:
On Mon, Feb 26, 2018 at 10:34 AM, Nikita Glukhov
<n.gluhov@postgrespro.ru> wrote:
> Attached 10th version of the jsonpath patches.
>
> 1. Fixed error handling in arithmetic operators.
>
>    Now run-time errors in arithmetic operators are catched (added
>    PG_TRY/PG_CATCH around operator's functions calls) and converted into
>    Unknown values in predicates as it is required by the standard:

I think we really need to rename PG_TRY and PG_CATCH or rethink this
whole interface so that people stop thinking they can use it to
prevent errors from being thrown.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: jsonpath

From
Nikita Glukhov
Date:
On 28.02.2018 06:55, Robert Haas wrote:

> On Mon, Feb 26, 2018 at 10:34 AM, Nikita Glukhov
> <n.gluhov@postgrespro.ru> wrote:
>> Attached 10th version of the jsonpath patches.
>>
>> 1. Fixed error handling in arithmetic operators.
>>
>>     Now run-time errors in arithmetic operators are catched (added
>>     PG_TRY/PG_CATCH around operator's functions calls) and converted into
>>     Unknown values in predicates as it is required by the standard:
> I think we really need to rename PG_TRY and PG_CATCH or rethink this
> whole interface so that people stop thinking they can use it to
> prevent errors from being thrown.

I understand that it is unsafe to call arbitrary function inside PG_TRY without
rethrowing of caught errors in PG_CATCH, but in jsonpath only the following
numeric and datetime functions with known behavior are called inside PG_TRY
and only errors of category ERRCODE_DATA_EXCEPTION are caught:

numeric_add()
numeric_mul()
numeric_div()
numeric_mod()
numeric_float8()
float8in()
float8_numeric()
to_datetime()

SQL/JSON standard requires us to handle errors and then perform the specified
ON ERROR behavior.  In the next SQL/JSON patch I had to use subtransactions for
catching errors in JSON_VALUE, JSON_QUERY and JSON_EXISTS where an arbitrary
user-defined typecast function can be called.

-- 
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: jsonpath

From
Alexander Korotkov
Date:
On Fri, Mar 2, 2018 at 12:40 AM, Nikita Glukhov <n.gluhov@postgrespro.ru> wrote:
On 28.02.2018 06:55, Robert Haas wrote:

On Mon, Feb 26, 2018 at 10:34 AM, Nikita Glukhov
<n.gluhov@postgrespro.ru> wrote:
Attached 10th version of the jsonpath patches.

1. Fixed error handling in arithmetic operators.

    Now run-time errors in arithmetic operators are catched (added
    PG_TRY/PG_CATCH around operator's functions calls) and converted into
    Unknown values in predicates as it is required by the standard:
I think we really need to rename PG_TRY and PG_CATCH or rethink this
whole interface so that people stop thinking they can use it to
prevent errors from being thrown.

I understand that it is unsafe to call arbitrary function inside PG_TRY without
rethrowing of caught errors in PG_CATCH, but in jsonpath only the following
numeric and datetime functions with known behavior are called inside PG_TRY
and only errors of category ERRCODE_DATA_EXCEPTION are caught:

numeric_add()
numeric_mul()
numeric_div()
numeric_mod()
numeric_float8()
float8in()
float8_numeric()
to_datetime()

That seems like a quite limited list of functions.  What about reworking them
providing a way of calling them without risk of exception?  For example, we can
have numeric_add_internal() function which fill given data structure with
error information instead of throwing the error.  numeric_add() would be a
wrapper over numeric_add_internal(), which throws an error if corresponding
data structure is filled.  In jsonpath we can call numeric_add_internal() and
interpret errors in another way.  That seems to be better than use of PG_TRY
and PG_CATCH.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

Re: jsonpath

From
Tomas Vondra
Date:
Hi,

The patch no longer applies - it got broken by fd1a421fe66 which changed
columns in pg_proc. A rebase is needed.

Fixing it is pretty simle, so I've done that locally and tried to run
'make check' under valgrind. And I got a bunch of reports about
uninitialised values. Full report attached, but in general there seem to
be two types of failures:

  Conditional jump or move depends on uninitialised value(s)
     at 0x57BB47A: vfprintf (in /usr/lib64/libc-2.24.so)
     by 0x57E5478: vsnprintf (in /usr/lib64/libc-2.24.so)
     by 0xA926D3: pvsnprintf (psprintf.c:121)
     by 0x723E03: appendStringInfoVA (stringinfo.c:130)
     by 0x723D58: appendStringInfo (stringinfo.c:87)
     by 0x76BEFF: _outCoerceViaIO (outfuncs.c:1413)
     by 0x776F99: outNode (outfuncs.c:3978)
     by 0x76D7E7: _outJsonCoercion (outfuncs.c:1779)
     by 0x777CB9: outNode (outfuncs.c:4398)
     by 0x76D507: _outJsonExpr (outfuncs.c:1752)
     by 0x777CA1: outNode (outfuncs.c:4395)
     by 0x767000: _outList (outfuncs.c:187)
     by 0x776874: outNode (outfuncs.c:3753)
     by 0x76A4D2: _outTableFunc (outfuncs.c:1068)
     by 0x776D89: outNode (outfuncs.c:3912)
     by 0x7744FD: _outRangeTblEntry (outfuncs.c:3209)
     by 0x777959: outNode (outfuncs.c:4290)
     by 0x767000: _outList (outfuncs.c:187)
     by 0x776874: outNode (outfuncs.c:3753)
     by 0x773713: _outQuery (outfuncs.c:3049)
   Uninitialised value was created by a stack allocation
     at 0x5B0C19: base_yyparse (gram.c:26287)

This happens when _outCoerceViaIO tries to output 'location' field
(that's line 1413), so I guess it's not set/copied somewhere.

The second failure looks like this:

  Conditional jump or move depends on uninitialised value(s)
     at 0x49E58B: ginFillScanEntry (ginscan.c:72)
     by 0x49EB56: ginFillScanKey (ginscan.c:221)
     by 0x49EF72: ginNewScanKey (ginscan.c:369)
     by 0x4A3875: gingetbitmap (ginget.c:1807)
     by 0x4F620B: index_getbitmap (indexam.c:727)
     by 0x6EE342: MultiExecBitmapIndexScan (nodeBitmapIndexscan.c:104)
     by 0x6DA8F8: MultiExecProcNode (execProcnode.c:506)
     by 0x6EC53D: BitmapHeapNext (nodeBitmapHeapscan.c:118)
     by 0x6DC26D: ExecScanFetch (execScan.c:95)
     by 0x6DC308: ExecScan (execScan.c:162)
     by 0x6ED7E5: ExecBitmapHeapScan (nodeBitmapHeapscan.c:730)
     by 0x6DA80A: ExecProcNodeFirst (execProcnode.c:446)
     by 0x6E5961: ExecProcNode (executor.h:239)
     by 0x6E5E25: fetch_input_tuple (nodeAgg.c:406)
     by 0x6E8091: agg_retrieve_direct (nodeAgg.c:1736)
     by 0x6E7C84: ExecAgg (nodeAgg.c:1551)
     by 0x6DA80A: ExecProcNodeFirst (execProcnode.c:446)
     by 0x6D1361: ExecProcNode (executor.h:239)
     by 0x6D3BB7: ExecutePlan (execMain.c:1721)
     by 0x6D1917: standard_ExecutorRun (execMain.c:361)
   Uninitialised value was created by a heap allocation
     at 0xA64FDC: palloc (mcxt.c:858)
     by 0x938636: gin_extract_jsonpath_query (jsonb_gin.c:630)
     by 0x938AB6: gin_extract_jsonb_query (jsonb_gin.c:746)
     by 0xA340C0: FunctionCall7Coll (fmgr.c:1201)
     by 0x49EE7F: ginNewScanKey (ginscan.c:313)
     by 0x4A3875: gingetbitmap (ginget.c:1807)
     by 0x4F620B: index_getbitmap (indexam.c:727)
     by 0x6EE342: MultiExecBitmapIndexScan (nodeBitmapIndexscan.c:104)
     by 0x6DA8F8: MultiExecProcNode (execProcnode.c:506)
     by 0x6EC53D: BitmapHeapNext (nodeBitmapHeapscan.c:118)
     by 0x6DC26D: ExecScanFetch (execScan.c:95)
     by 0x6DC308: ExecScan (execScan.c:162)
     by 0x6ED7E5: ExecBitmapHeapScan (nodeBitmapHeapscan.c:730)
     by 0x6DA80A: ExecProcNodeFirst (execProcnode.c:446)
     by 0x6E5961: ExecProcNode (executor.h:239)
     by 0x6E5E25: fetch_input_tuple (nodeAgg.c:406)
     by 0x6E8091: agg_retrieve_direct (nodeAgg.c:1736)
     by 0x6E7C84: ExecAgg (nodeAgg.c:1551)
     by 0x6DA80A: ExecProcNodeFirst (execProcnode.c:446)
     by 0x6D1361: ExecProcNode (executor.h:239)

So the extra_data allocated in gin_extract_jsonpath_query() get to
ginFillScanEntry() uninitialised.


Both seem like a valid issues, I think.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

Re: jsonpath

From
Oleg Bartunov
Date:
Thanks, Tomas !

Will publish a new version really soon !

Regards,
Oleg

On Tue, Mar 6, 2018 at 12:29 AM, Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
> Hi,
>
> The patch no longer applies - it got broken by fd1a421fe66 which changed
> columns in pg_proc. A rebase is needed.
>
> Fixing it is pretty simle, so I've done that locally and tried to run
> 'make check' under valgrind. And I got a bunch of reports about
> uninitialised values. Full report attached, but in general there seem to
> be two types of failures:
>
>   Conditional jump or move depends on uninitialised value(s)
>      at 0x57BB47A: vfprintf (in /usr/lib64/libc-2.24.so)
>      by 0x57E5478: vsnprintf (in /usr/lib64/libc-2.24.so)
>      by 0xA926D3: pvsnprintf (psprintf.c:121)
>      by 0x723E03: appendStringInfoVA (stringinfo.c:130)
>      by 0x723D58: appendStringInfo (stringinfo.c:87)
>      by 0x76BEFF: _outCoerceViaIO (outfuncs.c:1413)
>      by 0x776F99: outNode (outfuncs.c:3978)
>      by 0x76D7E7: _outJsonCoercion (outfuncs.c:1779)
>      by 0x777CB9: outNode (outfuncs.c:4398)
>      by 0x76D507: _outJsonExpr (outfuncs.c:1752)
>      by 0x777CA1: outNode (outfuncs.c:4395)
>      by 0x767000: _outList (outfuncs.c:187)
>      by 0x776874: outNode (outfuncs.c:3753)
>      by 0x76A4D2: _outTableFunc (outfuncs.c:1068)
>      by 0x776D89: outNode (outfuncs.c:3912)
>      by 0x7744FD: _outRangeTblEntry (outfuncs.c:3209)
>      by 0x777959: outNode (outfuncs.c:4290)
>      by 0x767000: _outList (outfuncs.c:187)
>      by 0x776874: outNode (outfuncs.c:3753)
>      by 0x773713: _outQuery (outfuncs.c:3049)
>    Uninitialised value was created by a stack allocation
>      at 0x5B0C19: base_yyparse (gram.c:26287)
>
> This happens when _outCoerceViaIO tries to output 'location' field
> (that's line 1413), so I guess it's not set/copied somewhere.
>
> The second failure looks like this:
>
>   Conditional jump or move depends on uninitialised value(s)
>      at 0x49E58B: ginFillScanEntry (ginscan.c:72)
>      by 0x49EB56: ginFillScanKey (ginscan.c:221)
>      by 0x49EF72: ginNewScanKey (ginscan.c:369)
>      by 0x4A3875: gingetbitmap (ginget.c:1807)
>      by 0x4F620B: index_getbitmap (indexam.c:727)
>      by 0x6EE342: MultiExecBitmapIndexScan (nodeBitmapIndexscan.c:104)
>      by 0x6DA8F8: MultiExecProcNode (execProcnode.c:506)
>      by 0x6EC53D: BitmapHeapNext (nodeBitmapHeapscan.c:118)
>      by 0x6DC26D: ExecScanFetch (execScan.c:95)
>      by 0x6DC308: ExecScan (execScan.c:162)
>      by 0x6ED7E5: ExecBitmapHeapScan (nodeBitmapHeapscan.c:730)
>      by 0x6DA80A: ExecProcNodeFirst (execProcnode.c:446)
>      by 0x6E5961: ExecProcNode (executor.h:239)
>      by 0x6E5E25: fetch_input_tuple (nodeAgg.c:406)
>      by 0x6E8091: agg_retrieve_direct (nodeAgg.c:1736)
>      by 0x6E7C84: ExecAgg (nodeAgg.c:1551)
>      by 0x6DA80A: ExecProcNodeFirst (execProcnode.c:446)
>      by 0x6D1361: ExecProcNode (executor.h:239)
>      by 0x6D3BB7: ExecutePlan (execMain.c:1721)
>      by 0x6D1917: standard_ExecutorRun (execMain.c:361)
>    Uninitialised value was created by a heap allocation
>      at 0xA64FDC: palloc (mcxt.c:858)
>      by 0x938636: gin_extract_jsonpath_query (jsonb_gin.c:630)
>      by 0x938AB6: gin_extract_jsonb_query (jsonb_gin.c:746)
>      by 0xA340C0: FunctionCall7Coll (fmgr.c:1201)
>      by 0x49EE7F: ginNewScanKey (ginscan.c:313)
>      by 0x4A3875: gingetbitmap (ginget.c:1807)
>      by 0x4F620B: index_getbitmap (indexam.c:727)
>      by 0x6EE342: MultiExecBitmapIndexScan (nodeBitmapIndexscan.c:104)
>      by 0x6DA8F8: MultiExecProcNode (execProcnode.c:506)
>      by 0x6EC53D: BitmapHeapNext (nodeBitmapHeapscan.c:118)
>      by 0x6DC26D: ExecScanFetch (execScan.c:95)
>      by 0x6DC308: ExecScan (execScan.c:162)
>      by 0x6ED7E5: ExecBitmapHeapScan (nodeBitmapHeapscan.c:730)
>      by 0x6DA80A: ExecProcNodeFirst (execProcnode.c:446)
>      by 0x6E5961: ExecProcNode (executor.h:239)
>      by 0x6E5E25: fetch_input_tuple (nodeAgg.c:406)
>      by 0x6E8091: agg_retrieve_direct (nodeAgg.c:1736)
>      by 0x6E7C84: ExecAgg (nodeAgg.c:1551)
>      by 0x6DA80A: ExecProcNodeFirst (execProcnode.c:446)
>      by 0x6D1361: ExecProcNode (executor.h:239)
>
> So the extra_data allocated in gin_extract_jsonpath_query() get to
> ginFillScanEntry() uninitialised.
>
>
> Both seem like a valid issues, I think.
>
> regards
>
> --
> Tomas Vondra                  http://www.2ndQuadrant.com
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: jsonpath

From
Nikita Glukhov
Date:
On 06.03.2018 00:29, Tomas Vondra wrote:

> Hi,
>
> The patch no longer applies - it got broken by fd1a421fe66 which changed
> columns in pg_proc. A rebase is needed.
>
> Fixing it is pretty simle, so I've done that locally and tried to run
> 'make check' under valgrind. And I got a bunch of reports about
> uninitialised values.

Thank you very much for your discovery. It is my fault that I never ran
PostgreSQL under valgrind. Rebased and fixed patches are attached.

>   Full report attached, but in general there seem to
> be two types of failures:
>
>    Conditional jump or move depends on uninitialised value(s)
>       at 0x57BB47A: vfprintf (in /usr/lib64/libc-2.24.so)
>       by 0x57E5478: vsnprintf (in /usr/lib64/libc-2.24.so)
>       by 0xA926D3: pvsnprintf (psprintf.c:121)
>       by 0x723E03: appendStringInfoVA (stringinfo.c:130)
>       by 0x723D58: appendStringInfo (stringinfo.c:87)
>       by 0x76BEFF: _outCoerceViaIO (outfuncs.c:1413)
>       by 0x776F99: outNode (outfuncs.c:3978)
>       by 0x76D7E7: _outJsonCoercion (outfuncs.c:1779)
>       by 0x777CB9: outNode (outfuncs.c:4398)
>       by 0x76D507: _outJsonExpr (outfuncs.c:1752)
>       by 0x777CA1: outNode (outfuncs.c:4395)
>       by 0x767000: _outList (outfuncs.c:187)
>       by 0x776874: outNode (outfuncs.c:3753)
>       by 0x76A4D2: _outTableFunc (outfuncs.c:1068)
>       by 0x776D89: outNode (outfuncs.c:3912)
>       by 0x7744FD: _outRangeTblEntry (outfuncs.c:3209)
>       by 0x777959: outNode (outfuncs.c:4290)
>       by 0x767000: _outList (outfuncs.c:187)
>       by 0x776874: outNode (outfuncs.c:3753)
>       by 0x773713: _outQuery (outfuncs.c:3049)
>     Uninitialised value was created by a stack allocation
>       at 0x5B0C19: base_yyparse (gram.c:26287)
>
> This happens when _outCoerceViaIO tries to output 'location' field
> (that's line 1413), so I guess it's not set/copied somewhere.

Yes, JSON FORMAT location was not set in gram.y.

> The second failure looks like this:
>
>    Conditional jump or move depends on uninitialised value(s)
>       at 0x49E58B: ginFillScanEntry (ginscan.c:72)
>       by 0x49EB56: ginFillScanKey (ginscan.c:221)
>       by 0x49EF72: ginNewScanKey (ginscan.c:369)
>       by 0x4A3875: gingetbitmap (ginget.c:1807)
>       by 0x4F620B: index_getbitmap (indexam.c:727)
>       by 0x6EE342: MultiExecBitmapIndexScan (nodeBitmapIndexscan.c:104)
>       by 0x6DA8F8: MultiExecProcNode (execProcnode.c:506)
>       by 0x6EC53D: BitmapHeapNext (nodeBitmapHeapscan.c:118)
>       by 0x6DC26D: ExecScanFetch (execScan.c:95)
>       by 0x6DC308: ExecScan (execScan.c:162)
>       by 0x6ED7E5: ExecBitmapHeapScan (nodeBitmapHeapscan.c:730)
>       by 0x6DA80A: ExecProcNodeFirst (execProcnode.c:446)
>       by 0x6E5961: ExecProcNode (executor.h:239)
>       by 0x6E5E25: fetch_input_tuple (nodeAgg.c:406)
>       by 0x6E8091: agg_retrieve_direct (nodeAgg.c:1736)
>       by 0x6E7C84: ExecAgg (nodeAgg.c:1551)
>       by 0x6DA80A: ExecProcNodeFirst (execProcnode.c:446)
>       by 0x6D1361: ExecProcNode (executor.h:239)
>       by 0x6D3BB7: ExecutePlan (execMain.c:1721)
>       by 0x6D1917: standard_ExecutorRun (execMain.c:361)
>     Uninitialised value was created by a heap allocation
>       at 0xA64FDC: palloc (mcxt.c:858)
>       by 0x938636: gin_extract_jsonpath_query (jsonb_gin.c:630)
>       by 0x938AB6: gin_extract_jsonb_query (jsonb_gin.c:746)
>       by 0xA340C0: FunctionCall7Coll (fmgr.c:1201)
>       by 0x49EE7F: ginNewScanKey (ginscan.c:313)
>       by 0x4A3875: gingetbitmap (ginget.c:1807)
>       by 0x4F620B: index_getbitmap (indexam.c:727)
>       by 0x6EE342: MultiExecBitmapIndexScan (nodeBitmapIndexscan.c:104)
>       by 0x6DA8F8: MultiExecProcNode (execProcnode.c:506)
>       by 0x6EC53D: BitmapHeapNext (nodeBitmapHeapscan.c:118)
>       by 0x6DC26D: ExecScanFetch (execScan.c:95)
>       by 0x6DC308: ExecScan (execScan.c:162)
>       by 0x6ED7E5: ExecBitmapHeapScan (nodeBitmapHeapscan.c:730)
>       by 0x6DA80A: ExecProcNodeFirst (execProcnode.c:446)
>       by 0x6E5961: ExecProcNode (executor.h:239)
>       by 0x6E5E25: fetch_input_tuple (nodeAgg.c:406)
>       by 0x6E8091: agg_retrieve_direct (nodeAgg.c:1736)
>       by 0x6E7C84: ExecAgg (nodeAgg.c:1551)
>       by 0x6DA80A: ExecProcNodeFirst (execProcnode.c:446)
>       by 0x6D1361: ExecProcNode (executor.h:239)
>
> So the extra_data allocated in gin_extract_jsonpath_query() get to
> ginFillScanEntry() uninitialised.

Yes, only the first element of extra_data[] was initialized and used later in
gin_consistent_jsonb(), but ginFillScanEntry() wants all of them to be
initialized.

> Both seem like a valid issues, I think.
>
> regards

-- 
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Attachment

Re: jsonpath

From
Nikita Glukhov
Date:

On 02.03.2018 00:57, Alexander Korotkov wrote:

On Fri, Mar 2, 2018 at 12:40 AM, Nikita Glukhov <n.gluhov@postgrespro.ru> wrote:
On 28.02.2018 06:55, Robert Haas wrote:

On Mon, Feb 26, 2018 at 10:34 AM, Nikita Glukhov
<n.gluhov@postgrespro.ru> wrote:
Attached 10th version of the jsonpath patches.

1. Fixed error handling in arithmetic operators.

    Now run-time errors in arithmetic operators are catched (added
    PG_TRY/PG_CATCH around operator's functions calls) and converted into
    Unknown values in predicates as it is required by the standard:
I think we really need to rename PG_TRY and PG_CATCH or rethink this
whole interface so that people stop thinking they can use it to
prevent errors from being thrown.

I understand that it is unsafe to call arbitrary function inside PG_TRY without
rethrowing of caught errors in PG_CATCH, but in jsonpath only the following
numeric and datetime functions with known behavior are called inside PG_TRY
and only errors of category ERRCODE_DATA_EXCEPTION are caught:

numeric_add()
numeric_mul()
numeric_div()
numeric_mod()
numeric_float8()
float8in()
float8_numeric()
to_datetime()

That seems like a quite limited list of functions.  What about reworking them
providing a way of calling them without risk of exception?  For example, we can
have numeric_add_internal() function which fill given data structure with
error information instead of throwing the error.  numeric_add() would be a
wrapper over numeric_add_internal(), which throws an error if corresponding
data structure is filled.  In jsonpath we can call numeric_add_internal() and
interpret errors in another way.  That seems to be better than use of PG_TRY
and PG_CATCH.
Attached 12th version of jsonpath patches.

I added the 7th patch where the following functions were extracted for 
safe error handling in jsonpath:

numeric_add_internal()
numeric_sub_internal()
numeric_mul_internal()
numeric_div_internal()
numeric_mod_internal()
float8_numeric_internal()
numeric_float8_internal()
float8in_internal()

Errors are passed to caller with new ereport_safe() macro when
ErrorData **edata is not NULL:

+#define ereport_safe(edata, elevel, rest) \
+    do { \
+        if (edata) { \
+            errstart(elevel, __FILE__, __LINE__, PG_FUNCNAME_MACRO, TEXTDOMAIN); \
+            (rest); \
+            *(edata) = CopyErrorData(); \
+            FlushErrorState(); \
+        } else { \
+            ereport(elevel, rest); \
+        } \
+    } while (0)


But to_datetime() is still called in jsonpath inside PG_TRY/PG_CATCH block
because it needs too deep error propagation.

-- 
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment

Re: jsonpath

From
Nikita Glukhov
Date:
Attached 13th version of the jsonpath patches.


Syntax of .** accessor (our extension to standard) was changed to become more
similar to the syntax of the standard array accessor:

.**{2, 5} => .**{2 to 5}
.**{3,}   => .**{3 to last}

-- 
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Attachment

Re: jsonpath

From
Andrew Dunstan
Date:
On Fri, Mar 2, 2018 at 8:27 AM, Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:
> On Fri, Mar 2, 2018 at 12:40 AM, Nikita Glukhov <n.gluhov@postgrespro.ru>
> wrote:
>>
>> On 28.02.2018 06:55, Robert Haas wrote:
>>
>>> On Mon, Feb 26, 2018 at 10:34 AM, Nikita Glukhov
>>> <n.gluhov@postgrespro.ru> wrote:
>>>>
>>>> Attached 10th version of the jsonpath patches.
>>>>
>>>> 1. Fixed error handling in arithmetic operators.
>>>>
>>>>     Now run-time errors in arithmetic operators are catched (added
>>>>     PG_TRY/PG_CATCH around operator's functions calls) and converted
>>>> into
>>>>     Unknown values in predicates as it is required by the standard:
>>>
>>> I think we really need to rename PG_TRY and PG_CATCH or rethink this
>>> whole interface so that people stop thinking they can use it to
>>> prevent errors from being thrown.
>>
>>
>> I understand that it is unsafe to call arbitrary function inside PG_TRY
>> without
>> rethrowing of caught errors in PG_CATCH, but in jsonpath only the
>> following
>> numeric and datetime functions with known behavior are called inside
>> PG_TRY
>> and only errors of category ERRCODE_DATA_EXCEPTION are caught:
>>
>> numeric_add()
>> numeric_mul()
>> numeric_div()
>> numeric_mod()
>> numeric_float8()
>> float8in()
>> float8_numeric()
>> to_datetime()
>
>
> That seems like a quite limited list of functions.  What about reworking
> them
> providing a way of calling them without risk of exception?  For example, we
> can
> have numeric_add_internal() function which fill given data structure with
> error information instead of throwing the error.  numeric_add() would be a
> wrapper over numeric_add_internal(), which throws an error if corresponding
> data structure is filled.  In jsonpath we can call numeric_add_internal()
> and
> interpret errors in another way.  That seems to be better than use of PG_TRY
> and PG_CATCH.
>


I haven't seen a response to this email. Do we need one before
proceeding any further with jsonpath?

cheers

andrew


-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: jsonpath

From
Andrew Dunstan
Date:
On Tue, Mar 20, 2018 at 3:36 PM, Andrew Dunstan
<andrew.dunstan@2ndquadrant.com> wrote:
> On Fri, Mar 2, 2018 at 8:27 AM, Alexander Korotkov
> <a.korotkov@postgrespro.ru> wrote:
>> On Fri, Mar 2, 2018 at 12:40 AM, Nikita Glukhov <n.gluhov@postgrespro.ru>
>> wrote:
>>>
>>> On 28.02.2018 06:55, Robert Haas wrote:
>>>
>>>> On Mon, Feb 26, 2018 at 10:34 AM, Nikita Glukhov
>>>> <n.gluhov@postgrespro.ru> wrote:
>>>>>
>>>>> Attached 10th version of the jsonpath patches.
>>>>>
>>>>> 1. Fixed error handling in arithmetic operators.
>>>>>
>>>>>     Now run-time errors in arithmetic operators are catched (added
>>>>>     PG_TRY/PG_CATCH around operator's functions calls) and converted
>>>>> into
>>>>>     Unknown values in predicates as it is required by the standard:
>>>>
>>>> I think we really need to rename PG_TRY and PG_CATCH or rethink this
>>>> whole interface so that people stop thinking they can use it to
>>>> prevent errors from being thrown.
>>>
>>>
>>> I understand that it is unsafe to call arbitrary function inside PG_TRY
>>> without
>>> rethrowing of caught errors in PG_CATCH, but in jsonpath only the
>>> following
>>> numeric and datetime functions with known behavior are called inside
>>> PG_TRY
>>> and only errors of category ERRCODE_DATA_EXCEPTION are caught:
>>>
>>> numeric_add()
>>> numeric_mul()
>>> numeric_div()
>>> numeric_mod()
>>> numeric_float8()
>>> float8in()
>>> float8_numeric()
>>> to_datetime()
>>
>>
>> That seems like a quite limited list of functions.  What about reworking
>> them
>> providing a way of calling them without risk of exception?  For example, we
>> can
>> have numeric_add_internal() function which fill given data structure with
>> error information instead of throwing the error.  numeric_add() would be a
>> wrapper over numeric_add_internal(), which throws an error if corresponding
>> data structure is filled.  In jsonpath we can call numeric_add_internal()
>> and
>> interpret errors in another way.  That seems to be better than use of PG_TRY
>> and PG_CATCH.
>>
>
>
> I haven't seen a response to this email. Do we need one before
> proceeding any further with jsonpath?
>


Apologies. I see the reply now.

cheers

andrew


-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: jsonpath

From
Tom Lane
Date:
Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes:
>> That seems like a quite limited list of functions.  What about
>> reworking them providing a way of calling them without risk of
>> exception?

> I haven't seen a response to this email. Do we need one before
> proceeding any further with jsonpath?

I've not been following this thread in detail, but IMO any code anywhere
that thinks that no error can be thrown out of non-straight-line code is
broken beyond redemption.  What, for example, happens if we get ENOMEM
within one of the elog.c functions?

I did look through 0007-jsonpath-arithmetic-error-handling-v12.patch,
and I can't believe that's seriously proposed for commit.  It's making
some pretty fragile changes in error handling, and so far as I can
find there is not even one line of commentary as to what the new
design rules are supposed to be.  Even if it's completely bug-free
today (which I would bet against), how could we keep it so?

            regards, tom lane


Re: jsonpath

From
Nikita Glukhov
Date:
Attached 14th version of the patches:
  * refactored predicates, introduced 3-valued JsonPathBool type instead of
    JsonPathExecResult
  * refactored JsonPathExecResult: now it is typedefed to ErrorData *
  * fixed recursive wildcard accessor (.**) in strict mode: structural errors
    after .** are ignored now

On 20.03.2018 08:36, Tom Lane wrote:

> Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes:
>>> That seems like a quite limited list of functions.  What about
>>> reworking them providing a way of calling them without risk of
>>> exception?
>> I haven't seen a response to this email. Do we need one before
>> proceeding any further with jsonpath?
> I've not been following this thread in detail, but IMO any code anywhere
> that thinks that no error can be thrown out of non-straight-line code is
> broken beyond redemption.  What, for example, happens if we get ENOMEM
> within one of the elog.c functions?

Out of memory are intentionally is not caught here, ereport_safe() is primarily
intended only for handling of errors of ERRCODE_DATA_EXCEPTION category (see
explanation below).

> I did look through 0007-jsonpath-arithmetic-error-handling-v12.patch,
> and I can't believe that's seriously proposed for commit.  It's making
> some pretty fragile changes in error handling, and so far as I can
> find there is not even one line of commentary as to what the new
> design rules are supposed to be.  Even if it's completely bug-free
> today (which I would bet against), how could we keep it so?
>
>             regards, tom lane

I basically agree, this experimental patch is optional (see below).



I think I need to clarify the error handling in SQL/JSON one more time here.

SQL/JSON standard requires us to handle errors being thrown from the JSON path
engine and from SQL/JSON item => SQL type coercions, and then perform the
specified ON ERROR behavior.  Standard does not limit handled error categories,
but I think we are interesting here only in errors of category "data exception"
(ERRCODE_DATA_EXCEPTION) in which all possible SQL/JSON, arithmetic, datetime
errors fall.  It might be interesting that such error handling has appeared in
standard only with SQL/JSON introduction.

Example of arithmetic error handling in JSON_VALUE:

=# SELECT JSON_VALUE('0', '1 / $' RETURNING int); -- NULL ON ERROR by default
  json_value
------------
            
(1 row)

=# SELECT JSON_VALUE('0', '1 / $' RETURNING int ERROR ON ERROR);
ERROR:  division by zero

=# SELECT JSON_VALUE('0', '1 / $' RETURNING int DEFAULT -1 ON ERROR);
  json_value
------------
          -1
(1 row)

Error handling in SQL/JSON functions is implemented with PG_TRY/PG_CATCH block
with subtransaction (see ExecEvalJsonExpr() in SQL/JSON patch).  We had to use
subtransactions here because an arbitrary user-defined typecast function can
be called.


One could think that could only support ERROR ON ERROR behavior and have no
problems with PG_TRY/PG_CATCH, but errors should be handled inside jsonpath
predicates (boolean expression used in filters) regardless of ON ERROR behavior.
Any error in predicate operands is converted directly to special Unknown value
(analogue of SQL NULL).

In the following examples arithmetic errors and errors in conversion of string
into double are handled by comparison predicate itself, not by JSON_QUERY:

=# SELECT JSON_QUERY('[1.23, "4.56", 0, "foo"]',
                      '$[*] ? (1 / @.double() > 0)' WITH WRAPPER);
    json_query
----------------
  [1.23, "4.56"]
(1 row)

=# SELECT JSON_QUERY('[1.23, "4.56", 0, "foo"]',
                      '$[*] ? ((1 / @.double() > 0) is unknown)' WITH WRAPPER);
  json_query
------------
  [0, "foo"]
(1 row)


Implementation of handling of arithmetic and datetime errors in patch #4 uses
PG_TRY/PG_CATCH without surrounding substransactions, because only the limited
set of built-in functions with known behavior is called inside this block:

numeric_add()
numeric_mul()
numeric_div()
numeric_mod()
numeric_float8()
float8in()
float8_numeric()
to_datetime()


The 7th optional patch is attempt to completely get rid of PG_TRY/PG_CATCH in
jsonpath by the idea of Alexander Korotkov: additional parameter
"ErrorData **edata" is passed to internal variants of those functions for
saving error info into it instead of throwing it.  But I agree that this
solution is far from ideal and the assertion that some functions should not
throw exceptions of some category can easily be broken.  Moreover, I have done
it only for numeric functions, because to_datetime() function needs too deep
propagation of ErrorData passing.


If your see other variants of implementation of error handling in jsonpath,
please tell us about them.  Now we can offer only two variants described above,
but we understand that some future work is necessary to support such standard
features in PostgreSQL without having problems with PG_TRY/PG_CATCH.


-- 
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Attachment

Re: jsonpath

From
David Steele
Date:
On 3/27/18 11:13 AM, Nikita Glukhov wrote:
>
> Attached 14th version of the patches:

It appears this entry should be marked Needs Review, so I have done that
and moved it to the next CF.

Regards,
-- 
-David
david@pgmasters.net


Re: jsonpath

From
Nikita Glukhov
Date:
Attached 15th version of the patches.

There are changes only in jsonpath lexer:
  * added single-quoted strings
  * added escape sequences \v, \', \xXX, \u{...}

It's all required by ECMAScript standard (ECMA-262), from which SQL/JSON
standard adopts lexical rules.


Also, the documentation has been moved into a separate patch
(see https://www.postgresql.org/message-id/732208d3-56c3-25a4-8f08-3be1d54ad51b%40postgrespro.ru).

-- 
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Attachment

Re: jsonpath

From
Thomas Munro
Date:
On Thu, Jun 28, 2018 at 11:38 AM, Nikita Glukhov
<n.gluhov@postgrespro.ru> wrote:
> Attached 15th version of the patches.

Hi Nikita,

I wonder why the Windows build scripts are not finding and processing
your new .y and .l files:

https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.3672

Perhaps Mkvcbuild.pm needs to be told about them?  For example, it has this:

    $postgres->AddFiles('src/backend/parser', 'scan.l', 'gram.y');

PS If you need a way to test on Windows but don't have a local Windows
build system, this may be useful:

https://wiki.postgresql.org/wiki/Continuous_Integration

-- 
Thomas Munro
http://www.enterprisedb.com


Re: jsonpath

From
Nikita Glukhov
Date:
On 28.06.2018 05:39, Thomas Munro wrote:

> On Thu, Jun 28, 2018 at 11:38 AM, Nikita Glukhov
> <n.gluhov@postgrespro.ru> wrote:
>> Attached 15th version of the patches.
> Hi Nikita,
>
> I wonder why the Windows build scripts are not finding and processing
> your new .y and .l files:
>
> https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.3672
>
> Perhaps Mkvcbuild.pm needs to be told about them?  For example, it has this:
>
>      $postgres->AddFiles('src/backend/parser', 'scan.l', 'gram.y');
Yes, generation of jsonpath_gram.* was missing in Mkvcbuild.pm.

I have fixed it in the new 16th version of patches, and now it seems to 
work on
Windows (https://ci.appveyor.com/project/NikitaGlukhov/postgres/build/1.0.4)

> PS If you need a way to test on Windows but don't have a local Windows
> build system, this may be useful:
>
> https://wiki.postgresql.org/wiki/Continuous_Integration

Thank, this is really helpful.

-- 
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment

Re: jsonpath

From
Nikita Glukhov
Date:
Attached 17th version of the patches rebased onto the current master.

Nothing significant has changed.

-- 
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment

Re: jsonpath

From
Stas Kelvich
Date:

> On 23 Aug 2018, at 00:29, Nikita Glukhov <n.gluhov@postgrespro.ru> wrote:
>
> Attached 17th version of the patches rebased onto the current master.
>
> Nothing significant has changed.

Hey.

I’ve looked through presented implementation of jsonpath and have some remarks:


1. Current patch set adds functions named jsonpath_* that implementing subset of functionality
of corresponding json_* functions which doesn’t require changes to postgres SQL grammar.
If we are going to commit json_* functions it will be quite strange for end user: two sets of functions
with mostly identical behaviour, except that one of them allows customise error handling and pass variable via
special language constructions. I suggest only leave operators in this patch, but remove functions.


2. Handling of Unknown value differs from one that described in tech report: filter '(@.k == 1)’
doesn’t evaluates to Unknown. Consider following example:

select '42'::json @* '$ ? ( (@.k == 1) is unknown )';

As per my understanding of standard this should show 42. Seems that it was evaluated to False instead,
because boolean operators (! || &&) on filter expression with structural error behave like it is False.


3. .keyvalue() returns object with field named ‘key’ instead of ’name’ as per tech report. ‘key’ field seems to
be more consistent with function name, but i’m not sure it is worths of mismatch with standard. Also ‘id’ field
is omitted, making it harder to use something like GROUP BY afterwards.


4. Looks like jsonpath executor lacks some CHECK_FOR_INTERRUPTS() during hot paths. Backend with
following query is unresponsive to signals:

select count(*) from (
    select '[0,1,2,3,4,5,6,7,8,9]'::json @* longpath::jsonpath from (
        select '$[' || string_agg(subscripts, ',') ||']' as longpath from (
            select 'last,1' as subscripts from generate_series(1,1000000)
        ) subscripts_q
    ) jpath_q
) count_q;


5. Files generated by lex/bison should be listed in .gitignore files in corresponding directories.


6. My compiler complains about unused functions: JsonValueListConcat, JsonValueListClear.


7. Presented patch files structure is somewhat complicated with patches to patches. I've melded them
down to following patches:

0001: three first patches with preliminary datetime infrastructure
0002: Jsonpath engine and operators that is your previous 4+6+7
0003: Jsonpath extensions is your previous 8+9
0004: GIN support is your 5th path

Also this patches were formed with 'git format-patch', so one can apply all of them with 'git apply'
restoring each one as commit.


8. Also sending jsonpath_check.sql with queries which I used to check compliance with the standard.
That can be added to regression test, if you want.





-- 
Stas Kelvich
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company


Attachment

Re: jsonpath

From
Nikita Glukhov
Date:
Attached 18th version of the patches rebased onto the current master.


On 04.09.2018 17:15, Stas Kelvich wrote:
On 23 Aug 2018, at 00:29, Nikita Glukhov <n.gluhov@postgrespro.ru> wrote:

Attached 17th version of the patches rebased onto the current master.

Nothing significant has changed.
Hey.

I’ve looked through presented implementation of jsonpath and have some remarks:
Thank you for your review.

1. Current patch set adds functions named jsonpath_* that implementing subset of functionality
of corresponding json_* functions which doesn’t require changes to postgres SQL grammar.
If we are going to commit json_* functions it will be quite strange for end user: two sets of functions
with mostly identical behaviour, except that one of them allows customise error handling and pass variable via
special language constructions. I suggest only leave operators in this patch, but remove functions.
We can't simply remove these functions because they are the implementation 
of operators.

2. Handling of Unknown value differs from one that described in tech report: filter '(@.k == 1)’
doesn’t evaluates to Unknown. Consider following example:

select '42'::json @* '$ ? ( (@.k == 1) is unknown )';

As per my understanding of standard this should show 42. Seems that it was evaluated to False instead,
because boolean operators (! || &&) on filter expression with structural error behave like it is False.
Yes, this example should return 42, but only in the strict mode.  In the lax 
mode, which is default, structural errors are ignored, so '@.k' evaluates to
an empty sequence, and '@.k == 1' then returns False.

3. .keyvalue() returns object with field named ‘key’ instead of ’name’ 
as per tech report. ‘key’ field seems to be more consistent with function 
name, but i’m not sure it is worths of mismatch with standard. Also ‘id’
field is omitted, making it harder to use something like GROUP BY afterwards.
SQL/JSON standard uses "key" as a field name (9.39, page 716):vi) For all h, 1 <= h <= q, let OBJh be an SQL/JSON object with three members:  1) The first member has key "key" and bound value Ki .  2) The second member has key "value”"and bound value BVi.  3) The third member has key "id" and bound value IDj.
But "id" field was really missing in our implementation.  I have added it
using byte offset of the object in jsonb as its identifier.

4. Looks like jsonpath executor lacks some CHECK_FOR_INTERRUPTS() during hot paths. Backend with
following query is unresponsive to signals:

select count(*) from (   select '[0,1,2,3,4,5,6,7,8,9]'::json @* longpath::jsonpath from (       select '$[' || string_agg(subscripts, ',') ||']' as longpath from (           select 'last,1' as subscripts from generate_series(1,1000000)       ) subscripts_q   ) jpath_q
) count_q;
Fixed: added CHECK_FOR_INTERRUPTS() to jsonpath parser and executor.
5. Files generated by lex/bison should be listed in .gitignore files in corresponding directories.
Fixed.
6. My compiler complains about unused functions: JsonValueListConcat, JsonValueListClear.
Fixed: these functions used only in the next SQL/JSON patches were removed.
7. Presented patch files structure is somewhat complicated with patches to patches. I've melded them
down to following patches:

0001: three first patches with preliminary datetime infrastructure
0002: Jsonpath engine and operators that is your previous 4+6+7
0003: Jsonpath extensions is your previous 8+9
0004: GIN support is your 5th path

Also this patches were formed with 'git format-patch', so one can apply all of them with 'git apply'
restoring each one as commit.
New patch set is formed with 'git format-patch', but I can still supply patches
in the previous split form.
8. Also sending jsonpath_check.sql with queries which I used to check compliance with the standard.
That can be added to regression test, if you want.
I think these additional tests can be added, but I have not yet done it in this
version of the patches.


-- 
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company   
   
Attachment

Re: jsonpath

From
Michael Paquier
Date:
On Sat, Sep 08, 2018 at 02:21:27AM +0300, Nikita Glukhov wrote:
> Attached 18th version of the patches rebased onto the current master.

Nikita, this version fails to apply, as 0004 has conflicts with some
regression tests.  Could you rebase?  I am moving the patch to CF
2018-11, waiting for your input.
--
Michael

Attachment

Re: jsonpath

From
Tomas Vondra
Date:
Hi,

On 10/02/2018 04:33 AM, Michael Paquier wrote:
> On Sat, Sep 08, 2018 at 02:21:27AM +0300, Nikita Glukhov wrote:
>> Attached 18th version of the patches rebased onto the current master.
> 
> Nikita, this version fails to apply, as 0004 has conflicts with some
> regression tests.  Could you rebase?  I am moving the patch to CF
> 2018-11, waiting for your input.
> --
> Michael
> 

As Michael mentioned, the patch does not apply anymore, so it would be
good to provide a rebased version for CF 2018-11. I've managed to do
that, as the issues are due to minor bitrot, so that I can do some quick
review of the current version.

I haven't done much testing, pretty much just compiling, running the
usual regression tests and reading through the patches. I plan to do
more testing once the rebased patch is submitted.

Now, a couple of comments based on eye-balling the diffs.


1) There are no docs, which makes it pretty much non-committable for
now. I know there is [1] and it was a good intro for the review, but we
need something as a part of our sgml docs.


2) 0001 says this:

    *typmod = -1; /* TODO implement FF1, ..., FF9 */

but I have no idea what FF1 or FF9 are. I guess it needs to be
implemented, or explained better.


3) The makefile rule for scan.o does this:

+# Latest flex causes warnings in this file.
+ifeq ($(GCC),yes)
+scan.o: CFLAGS += -Wno-error
+endif

That seems a bit ugly, and we should probably try to make it work with
the latest flex, instead of hiding the warnings. I don't think we have
any such ad-hoc rules in other Makefiles. If we really need it, can't we
make it part of configure, and/or restrict it depending on flex version?


4) There probably should be .gitignore rule for jsonpath_gram.h, just
like for other generated header files.


5) jbvType says jbvDatetime is "virtual type" but does not explain what
it is. IMHO that deserves a comment or something.


6) I see the JsonPath definition says this about header:

    /* just version, other bits are reservedfor future use */

but the very next thing it does is defining two pieces stored in the
header - version AND "lax" mode flag. Which makes the comment invalid
(also, note the missing space after "reserved").

FWIW, I'd use JSONPATH_STRICT instead of JSONPATH_LAX. The rest of the
codebase works with "strict" flags passed around, and it's easy to
forget to negate the flag somewhere (at least that's my experience).


7) I see src/include/utils/jsonpath_json.h adds support for plain json
by undefining various jsonb macros and redirecting them to the json
variants. I find that rather suspicious - imagine you're investigating
something in code using those jsonb macros, and wondering why it ends up
calling the json stuff. I'd be pretty confused ...

Also, some of the redefinitions are not really needed for example
JsonbWrapItemInArray and JsonbWrapItemsInArray are not used anywhere
(and neither are the json variants).


8) I see 0002 defines IsAJsonbScalar like this:

  #define IsAJsonbScalar(jsonbval)  (((jsonbval)->type >= jbvNull && \
                                     (jsonbval)->type <= jbvBool) || \
                                     (jsonbval)->type == jbvDatetime)

while 0004 does this

  #define jspIsScalar(type) ((type) >= jpiNull && (type) <= jpiBool)

I know those are for different data types (jsonb vs. jsonpath), but I
suppose jspIsScalar should include the datetime too.

FWIW JsonPathItemType would deserve better documentation what the
various items mean (a comment for each line would be enough). I've been
wondering if "jpiDouble" means scalar double value until I realized
there's a ".double()" function for paths.


9) It's generally a good idea to make the individual pieces committable
separately, but that means e.g. the regression tests have to pass after
each patch. At the moment that does not seem to be the case for 0002,
see the attached file. I'm running with -DRANDOMIZE_ALLOCATED_MEMORY,
not sure if that's related.

regards


[1] https://github.com/obartunov/sqljsondoc/blob/master/README.jsonpath.md

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Attachment

Re: jsonpath

From
Oleg Bartunov
Date:
On Mon, Oct 29, 2018 at 2:20 AM Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
>
> Hi,
>
> On 10/02/2018 04:33 AM, Michael Paquier wrote:
> > On Sat, Sep 08, 2018 at 02:21:27AM +0300, Nikita Glukhov wrote:
> >> Attached 18th version of the patches rebased onto the current master.
> >
> > Nikita, this version fails to apply, as 0004 has conflicts with some
> > regression tests.  Could you rebase?  I am moving the patch to CF
> > 2018-11, waiting for your input.
> > --
> > Michael
> >
>
> As Michael mentioned, the patch does not apply anymore, so it would be
> good to provide a rebased version for CF 2018-11. I've managed to do
> that, as the issues are due to minor bitrot, so that I can do some quick
> review of the current version.
>
> I haven't done much testing, pretty much just compiling, running the
> usual regression tests and reading through the patches. I plan to do
> more testing once the rebased patch is submitted.
>
> Now, a couple of comments based on eye-balling the diffs.
>
>
> 1) There are no docs, which makes it pretty much non-committable for
> now. I know there is [1] and it was a good intro for the review, but we
> need something as a part of our sgml docs.

SQL/JSON documentation discussed in separate topic
https://www.postgresql.org/message-id/flat/732208d3-56c3-25a4-8f08-3be1d54ad51b%40postgrespro.ru


>
>
> 2) 0001 says this:
>
>     *typmod = -1; /* TODO implement FF1, ..., FF9 */
>
> but I have no idea what FF1 or FF9 are. I guess it needs to be
> implemented, or explained better.
>
>
> 3) The makefile rule for scan.o does this:
>
> +# Latest flex causes warnings in this file.
> +ifeq ($(GCC),yes)
> +scan.o: CFLAGS += -Wno-error
> +endif
>
> That seems a bit ugly, and we should probably try to make it work with
> the latest flex, instead of hiding the warnings. I don't think we have
> any such ad-hoc rules in other Makefiles. If we really need it, can't we
> make it part of configure, and/or restrict it depending on flex version?
>
>
> 4) There probably should be .gitignore rule for jsonpath_gram.h, just
> like for other generated header files.
>
>
> 5) jbvType says jbvDatetime is "virtual type" but does not explain what
> it is. IMHO that deserves a comment or something.
>
>
> 6) I see the JsonPath definition says this about header:
>
>     /* just version, other bits are reservedfor future use */
>
> but the very next thing it does is defining two pieces stored in the
> header - version AND "lax" mode flag. Which makes the comment invalid
> (also, note the missing space after "reserved").
>
> FWIW, I'd use JSONPATH_STRICT instead of JSONPATH_LAX. The rest of the
> codebase works with "strict" flags passed around, and it's easy to
> forget to negate the flag somewhere (at least that's my experience).
>
>
> 7) I see src/include/utils/jsonpath_json.h adds support for plain json
> by undefining various jsonb macros and redirecting them to the json
> variants. I find that rather suspicious - imagine you're investigating
> something in code using those jsonb macros, and wondering why it ends up
> calling the json stuff. I'd be pretty confused ...
>
> Also, some of the redefinitions are not really needed for example
> JsonbWrapItemInArray and JsonbWrapItemsInArray are not used anywhere
> (and neither are the json variants).
>
>
> 8) I see 0002 defines IsAJsonbScalar like this:
>
>   #define IsAJsonbScalar(jsonbval)  (((jsonbval)->type >= jbvNull && \
>                                      (jsonbval)->type <= jbvBool) || \
>                                      (jsonbval)->type == jbvDatetime)
>
> while 0004 does this
>
>   #define jspIsScalar(type) ((type) >= jpiNull && (type) <= jpiBool)
>
> I know those are for different data types (jsonb vs. jsonpath), but I
> suppose jspIsScalar should include the datetime too.
>
> FWIW JsonPathItemType would deserve better documentation what the
> various items mean (a comment for each line would be enough). I've been
> wondering if "jpiDouble" means scalar double value until I realized
> there's a ".double()" function for paths.
>
>
> 9) It's generally a good idea to make the individual pieces committable
> separately, but that means e.g. the regression tests have to pass after
> each patch. At the moment that does not seem to be the case for 0002,
> see the attached file. I'm running with -DRANDOMIZE_ALLOCATED_MEMORY,
> not sure if that's related.
>
> regards
>
>
> [1] https://github.com/obartunov/sqljsondoc/blob/master/README.jsonpath.md
>
> --
> Tomas Vondra                  http://www.2ndQuadrant.com
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>


-- 
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: jsonpath

From
Nikita Glukhov
Date:
On 29.10.2018 2:20, Tomas Vondra wrote:
On 10/02/2018 04:33 AM, Michael Paquier wrote:
On Sat, Sep 08, 2018 at 02:21:27AM +0300, Nikita Glukhov wrote:
Attached 18th version of the patches rebased onto the current master.
Nikita, this version fails to apply, as 0004 has conflicts with some
regression tests.  Could you rebase?  I am moving the patch to CF
2018-11, waiting for your input.
--
Michael

As Michael mentioned, the patch does not apply anymore, so it would be
good to provide a rebased version for CF 2018-11. I've managed to do
that, as the issues are due to minor bitrot, so that I can do some quick
review of the current version.
Sorry that I failed to provide rebased version earlier. 

Attached 19th version of the patches rebased onto the current master.
I haven't done much testing, pretty much just compiling, running the
usual regression tests and reading through the patches. I plan to do
more testing once the rebased patch is submitted.

Now, a couple of comments based on eye-balling the diffs.
Thank you for your review.
1) There are no docs, which makes it pretty much non-committable for
now. I know there is [1] and it was a good intro for the review, but we
need something as a part of our sgml docs.
I think that jsonpath part of documentation can be extracted from [2] and
added to the patch set.

2) 0001 says this:
   *typmod = -1; /* TODO implement FF1, ..., FF9 */

but I have no idea what FF1 or FF9 are. I guess it needs to be
implemented, or explained better.
FF1-FF9 are standard datetime template fields used for specifying of fractional
seconds.  FF3/FF6 are analogues of our MS/US.  I decided simply to implement 
this feature (see patch 0001, I also can supply it in the separate patch).

But FF7-FF9 are questionable since the maximal supported precision is only 6.
They are optional by the standard:
 95) Specifications for Feature F555, “Enhanced seconds precision”:   d) Subclause 9.44, “Datetime templates”:     i) Without Feature F555, “Enhanced seconds precision”,        a <datetime template fraction> shall not be FF7, FF8 or FF9.

So I decided to allow FF7-FF9 only on the output in to_char().

3) The makefile rule for scan.o does this:

+# Latest flex causes warnings in this file.
+ifeq ($(GCC),yes)
+scan.o: CFLAGS += -Wno-error
+endif

That seems a bit ugly, and we should probably try to make it work with
the latest flex, instead of hiding the warnings. I don't think we have
any such ad-hoc rules in other Makefiles. If we really need it, can't we
make it part of configure, and/or restrict it depending on flex version?
These lines seem to belong to the earliest versions of our jsonpath
implementation. There is no scan.o file now at all, there is only 
jsonpath_scan.o, so I simply removed these lines.

4) There probably should be .gitignore rule for jsonpath_gram.h, just
like for other generated header files.
I see 3 rules in /src/backend/utils/adt/.gitignore:
/jsonpath_gram.h
/jsonpath_gram.c
/jsonpath_scan.c
5) jbvType says jbvDatetime is "virtual type" but does not explain what
it is. IMHO that deserves a comment or something.
"Virtual type" means here that it is used only for in-memory processing and 
converted into JSON string when outputted to jsonb.  Corresponding comment 
was added.

6) I see the JsonPath definition says this about header:
   /* just version, other bits are reservedfor future use */

but the very next thing it does is defining two pieces stored in the
header - version AND "lax" mode flag. Which makes the comment invalid
(also, note the missing space after "reserved").
Fixed.
FWIW, I'd use JSONPATH_STRICT instead of JSONPATH_LAX. The rest of the
codebase works with "strict" flags passed around, and it's easy to
forget to negate the flag somewhere (at least that's my experience).
Jsonpath lax/strict mode flag is used only in executeJsonPath() where it is 
saved  in "laxMode" field.  New "strict" flag passed to datetime functions 
is unrelated to jsonpath.

7) I see src/include/utils/jsonpath_json.h adds support for plain json
by undefining various jsonb macros and redirecting them to the json
variants. I find that rather suspicious - imagine you're investigating
something in code using those jsonb macros, and wondering why it ends up
calling the json stuff. I'd be pretty confused ...
I agree, this is rather simple but doubtful solution.  That's why json support
was in a separate patch until the 18th version of the patches.

But if we do not want to compile jsonpath.c twice with different definitions,
then we need some kind of run-time wrapping over json strings and jsonb
containers, which seems a bit harder to implement.

To simplify debugging I can also suggest to explicitly preprocess jsonpath.c
into jsonpath_json.c using redefinitions from jsonpath_json.h before its
compilation.
Also, some of the redefinitions are not really needed for example
JsonbWrapItemInArray and JsonbWrapItemsInArray are not used anywhere
(and neither are the json variants).
These definitions will be used in the next patches, so I removed them 
for now from this patch.
8) I see 0002 defines IsAJsonbScalar like this:
 #define IsAJsonbScalar(jsonbval)  (((jsonbval)->type >= jbvNull && \                                    (jsonbval)->type <= jbvBool) || \                                    (jsonbval)->type == jbvDatetime)

while 0004 does this
 #define jspIsScalar(type) ((type) >= jpiNull && (type) <= jpiBool)

I know those are for different data types (jsonb vs. jsonpath), but I
suppose jspIsScalar should include the datetime too.
jpiDatetime does not mean the same thing as jbvDatetime.
jpiDatetime is a representation of ".datetime()" item method.
jbvDatetime is a representation of datetime SQL/JSON items.

Also datetime SQL/JSON items can be represented in JSON path only by strings
with ".datetime()" method applied, they do not have their own literal:
'"2018-01-10".datetime()'   
FWIW JsonPathItemType would deserve better documentation what the
various items mean (a comment for each line would be enough). I've been
wondering if "jpiDouble" means scalar double value until I realized
there's a ".double()" function for paths.
I added per-item comments.

9) It's generally a good idea to make the individual pieces committable
separately, but that means e.g. the regression tests have to pass after
each patch. At the moment that does not seem to be the case for 0002,
see the attached file. I'm running with -DRANDOMIZE_ALLOCATED_MEMORY,
not sure if that's related.
This should definitely be a bug in json support, but I can't reproduce 
it simply by defining -DRANDOMIZE_ALLOCATED_MEMORY.  Could you provide 
a stack trace at least?


[1] https://github.com/obartunov/sqljsondoc/blob/master/README.jsonpath.md
[2] https://github.com/postgrespro/sqljson/tree/sqljson_doc

-- 
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
   
Attachment

Re: jsonpath

From
Tomas Vondra
Date:
On 11/6/18 3:31 PM, Nikita Glukhov wrote:
> On 29.10.2018 2:20, Tomas Vondra wrote:>
 >
 > ...>
> Thank you for your review.
> 
>> 1) There are no docs, which makes it pretty much non-committable for
>> now. I know there is [1] and it was a good intro for the review, but we
>> need something as a part of our sgml docs.
> 
> I think that jsonpath part of documentation can be extracted from [2] and
> added to the patch set.
> 

Yes, please let's do that. The patch could not get into RFC without 
docs, so let's deal with it now.

>> 2) 0001 says this:
>>
>>      *typmod = -1; /* TODO implement FF1, ..., FF9 */
>>
>> but I have no idea what FF1 or FF9 are. I guess it needs to be
>> implemented, or explained better.
> 
> FF1-FF9 are standard datetime template fields used for specifying of fractional
> seconds.  FF3/FF6 are analogues of our MS/US.  I decided simply to implement
> this feature (see patch 0001, I also can supply it in the separate patch).
> 

Understood. Thanks for the explanation.

> But FF7-FF9 are questionable since the maximal supported precision is only 6.
> They are optional by the standard:
> 
>    95) Specifications for Feature F555, “Enhanced seconds precision”:
>      d) Subclause 9.44, “Datetime templates”:
>        i) Without Feature F555, “Enhanced seconds precision”,
>           a <datetime template fraction> shall not be FF7, FF8 or FF9.
> 
> So I decided to allow FF7-FF9 only on the output in to_char().
> 

Hmmmm, isn't that against the standard then? I mean, if our precision is 
only 6, that probably means we don't have the F555 feature, which means 
FF7-9 should not be available.

It also seems like a bit surprising behavior that we only allow those 
fields in to_char() and not in other places.

>> 4) There probably should be .gitignore rule for jsonpath_gram.h, just
>> like for other generated header files.
> 
> I see 3 rules in /src/backend/utils/adt/.gitignore:
> 
> /jsonpath_gram.h
> /jsonpath_gram.c
> /jsonpath_scan.c
> 

But there's a symlink in src/include/utils/jsonpath_gram.h and that's 
not in .gitignore.

>> FWIW, I'd use JSONPATH_STRICT instead of JSONPATH_LAX. The rest of the
>> codebase works with "strict" flags passed around, and it's easy to
>> forget to negate the flag somewhere (at least that's my experience).
> 
> Jsonpath lax/strict mode flag is used only in executeJsonPath() where it is
> saved  in "laxMode" field.  New "strict" flag passed to datetime functions
> is unrelated to jsonpath.
> 

OK.

>> 7) I see src/include/utils/jsonpath_json.h adds support for plain json
>> by undefining various jsonb macros and redirecting them to the json
>> variants. I find that rather suspicious - imagine you're investigating
>> something in code using those jsonb macros, and wondering why it ends up
>> calling the json stuff. I'd be pretty confused ...
> 
> I agree, this is rather simple but doubtful solution.  That's why json support
> was in a separate patch until the 18th version of the patches.
> 
> But if we do not want to compile jsonpath.c twice with different definitions,
> then we need some kind of run-time wrapping over json strings and jsonb
> containers, which seems a bit harder to implement.
> 
> To simplify debugging I can also suggest to explicitly preprocess jsonpath.c
> into jsonpath_json.c using redefinitions from jsonpath_json.h before its
> compilation.
> 

Not sure what's the right solution. But I agree the current approach 
probably is not it.

>> 
>> 9) It's generally a good idea to make the individual pieces committable
>> separately, but that means e.g. the regression tests have to pass after
>> each patch. At the moment that does not seem to be the case for 0002,
>> see the attached file. I'm running with -DRANDOMIZE_ALLOCATED_MEMORY,
>> not sure if that's related.
> 
> This should definitely be a bug in json support, but I can't reproduce
> it simply by defining -DRANDOMIZE_ALLOCATED_MEMORY.  Could you provide
> a stack trace at least?
> 
I'll try.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: jsonpath

From
Tomas Vondra
Date:

On 11/6/18 4:48 PM, Tomas Vondra wrote:
> On 11/6/18 3:31 PM, Nikita Glukhov wrote:
>> On 29.10.2018 2:20, Tomas Vondra wrote:>
>>
>> ...
>>>
>>> 9) It's generally a good idea to make the individual pieces committable
>>> separately, but that means e.g. the regression tests have to pass after
>>> each patch. At the moment that does not seem to be the case for 0002,
>>> see the attached file. I'm running with -DRANDOMIZE_ALLOCATED_MEMORY,
>>> not sure if that's related.
>>
>> This should definitely be a bug in json support, but I can't reproduce
>> it simply by defining -DRANDOMIZE_ALLOCATED_MEMORY.  Could you provide
>> a stack trace at least?
>>
> I'll try.
> 

Not sure why you can't reproduce the failures, it's perfectly
reproducible for me. For the record, I'm doing this:

./configure --prefix=/home/user/pg-jsonpath --enable-debug
--enable-cassert CFLAGS="-O0 -DRANDOMIZE_ALLOCATED_MEMORY" && make -s
clean && make -s -j4 && make check

After sticking Assert(false) to JsonEncodeJsonbValue (to the default
case), I get a failure like this:

  select json '{}' @* 'lax $[0]';
! WARNING:  unknown jsonb value type: 20938064
! server closed the connection unexpectedly
!     This probably means the server terminated abnormally
!     before or while processing the request.
! connection to server was lost

The backtrace is attached. My guess is JsonValueListGetList in
jsonb_jsonpath_query only does shallow copy instead of copying the
pieces into funcctx->multi_call_memory_ctx, so it gets corrupted on
subsequent calls.

I also attach valgrind report, but I suppose the reported issues are a
consequence of the same bug.

regard

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

Re: jsonpath

From
Tomas Vondra
Date:
BTW is the v19 really just a rebase of the preceding version?

I'm asking because v18 was adding two types into pg_type.dat, namely
jsonpath (6050) and _jsonpath (6051), while v19 only adds jsonpath
(6050). I've noticed because v18 was missing a comma between the two
entries, which I'd say is a bug - interestingly enough, both types were
created correctly, so it seems to be resilient to such typos.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: jsonpath

From
Tom Lane
Date:
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
> BTW is the v19 really just a rebase of the preceding version?
> I'm asking because v18 was adding two types into pg_type.dat, namely
> jsonpath (6050) and _jsonpath (6051), while v19 only adds jsonpath
> (6050).

I haven't looked at this patch, but manual generation of array type entries
is obsolete since 3dc820c43e427371b66d217f2bd5481fc9ef2e2d.  There should
still be a mention of the OID to use for the array type though ...

            regards, tom lane


Re: jsonpath

From
Tomas Vondra
Date:


On 11/8/18 6:12 AM, Tom Lane wrote:
> Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
>> BTW is the v19 really just a rebase of the preceding version?
>> I'm asking because v18 was adding two types into pg_type.dat, namely
>> jsonpath (6050) and _jsonpath (6051), while v19 only adds jsonpath
>> (6050).
> 
> I haven't looked at this patch, but manual generation of array type entries
> is obsolete since 3dc820c43e427371b66d217f2bd5481fc9ef2e2d.  There should
> still be a mention of the OID to use for the array type though ...
> 

Ah, I missed this improvement somehow. (Yes, the array type OID is still
mentioned as part of the type definition.)

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: jsonpath

From
Nikita Glukhov
Date:
Attached 20th version the jsonpath patches.


On 08.11.2018 4:52, Tomas Vondra wrote:
On 11/6/18 4:48 PM, Tomas Vondra wrote:
On 11/6/18 3:31 PM, Nikita Glukhov wrote:
On 29.10.2018 2:20, Tomas Vondra wrote:>
9) It's generally a good idea to make the individual pieces committable
separately, but that means e.g. the regression tests have to pass after
each patch. At the moment that does not seem to be the case for 0002,
see the attached file. I'm running with -DRANDOMIZE_ALLOCATED_MEMORY,
not sure if that's related.
This should definitely be a bug in json support, but I can't reproduce
it simply by defining -DRANDOMIZE_ALLOCATED_MEMORY.  Could you provide
a stack trace at least?
I'll try.
Not sure why you can't reproduce the failures, it's perfectly
reproducible for me. For the record, I'm doing this:

./configure --prefix=/home/user/pg-jsonpath --enable-debug
--enable-cassert CFLAGS="-O0 -DRANDOMIZE_ALLOCATED_MEMORY" && make -s
clean && make -s -j4 && make check

After sticking Assert(false) to JsonEncodeJsonbValue (to the default
case), I get a failure like this:
 select json '{}' @* 'lax $[0]';
! WARNING:  unknown jsonb value type: 20938064
! server closed the connection unexpectedly
! 	This probably means the server terminated abnormally
! 	before or while processing the request.
! connection to server was lost

The backtrace is attached. My guess is JsonValueListGetList in
jsonb_jsonpath_query only does shallow copy instead of copying the
pieces into funcctx->multi_call_memory_ctx, so it gets corrupted on
subsequent calls.

I also attach valgrind report, but I suppose the reported issues are a
consequence of the same bug.
Than you for your help, I finally have managed to reproduce this bug.

The problem was really related to copying values but it was not located in the
jsonb_jsonpath_query() where the whole value list should already be allocated in 
funcctx->multi_call_memory_ctx (executeJsonPath() is called in that context and
also input json/jsonb is copied into it).  Stack-allocated root JsonbValue was
not copied to the heap due to wrong copy condition in recursiveExecuteNoUnwrap()
(jpiIndexArray case):

@@ -1754,1 +1754,1 @@ recursiveExecuteNoUnwrap(JsonPathExecContext *cxt, ...
-   res = recursiveExecuteNext(cxt, jsp, &elem, v, found, !binary);
+   res = recursiveExecuteNext(cxt, jsp, &elem, v, found, singleton || !binary);

I have refactored a bit this place extracting explicit 'copy' flag.

Also I found a similar problem in makePassingVars() where Numerics were not
copied into 'multi_call_memory_ctx' from the input json object. I decided to use
datumCopy() inside makePassingVars() instead of copying the whole input json in
jsonb_jsonpath_query() using PG_GETARG_JSONB_P_COPY().  But I think that
processing of the input json with variables should ideally be bone lazily,
just don't know if this refactoring is worth it because 3rd argument of
json[b]_jsonpath_xxx() functions is used only for testing of variables passing,
SQL/JSON functions use the different mechanism.

On 06.11.2018 18:48, Tomas Vondra wrote:
But FF7-FF9 are questionable since the maximal supported precision is only 6.
They are optional by the standard:
   95) Specifications for Feature F555, “Enhanced seconds precision”:
     d) Subclause 9.44, “Datetime templates”:
       i) Without Feature F555, “Enhanced seconds precision”,
          a <datetime template fraction> shall not be FF7, FF8 or FF9.
So I decided to allow FF7-FF9 only on the output in to_char().
Hmmmm, isn't that against the standard then? I mean, if our precision is 
only 6, that probably means we don't have the F555 feature, which means 
FF7-9 should not be available. 
It also seems like a bit surprising behavior that we only allow those 
fields in to_char() and not in other places.
Ok, now error is thrown if FF7-FF9 are found in the format string.


On 06.11.2018 18:48, Tomas Vondra wrote:
4) There probably should be .gitignore rule for jsonpath_gram.h, just
like for other generated header files.
I see 3 rules in /src/backend/utils/adt/.gitignore:

/jsonpath_gram.h
/jsonpath_gram.c
/jsonpath_scan.c
But there's a symlink in src/include/utils/jsonpath_gram.h and that's 
not in .gitignore.
Added jsonpath_gram.h to /src/include/utils/.gitignore.
-- 
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Attachment

Re: jsonpath

From
Tomas Vondra
Date:
Hi,

I've done another round of reviews on v20, assuming the patch is almost
ready to commit, but unfortunately I ran into a bunch of issues that
need to be resolved. None of this is a huge issue, but it's also above
the threshold of what could be tweaked by a committer IMHO.

(Which brings the question who plans to commit this. The patch does not
have a committer in the CF app, but I see both Teodor and Alexander are
listed as it's authors, so I'd expect it to be one of those. Or I might
do that, of course.)


0001
----

1) to_timestamp() does this:

  do_to_timestamp(date_txt, VARDATA(fmt), VARSIZE_ANY_EXHDR(fmt), false,
                  &tm, &fsec, &fprec, NULL);

Shouldn't it really do VARDATA_ANY() instead of VARDATA()? It's what the
function did before (well, it called text_to_cstring, but that does
VARDATA_ANY). The same thing applies to to_date(), BTW.

I also find it a bit inconvenient that we expand the fmt like this in
all do_to_timestamp() calls, although it's just to_datetime() that needs
to do it this way. I realize we can't change to_datetime() because it's
external API, but maybe we should make it construct a varlena and pass
it to do_to_timestamp().


2) We define both DCH_FF# and DCH_ff#, but we never ever use the
lower-case version. Heck, it's not mentioned even in DCH_keywords, which
does this:

  ...
  {"FF1", 3, DCH_FF1, false, FROM_CHAR_DATE_NONE},  /* F */
  ...
  {"ff1", 3, DCH_FF1, false, FROM_CHAR_DATE_NONE},  /* F */
  ...

Compare that to DCH_DAY, DCH_Day and DCH_day, mapped to "DAY", "Day" and
"day". Also, the comment for "ff1" is wrong, it should be "f" I guess.

And of course, there are regression tests for FF# variants, but not the
lower-case ones.


3) Of course, these new formatting patterns should be added to SGML
docs, for example to "Template Patterns for Date/Time Formatting" in
func.sgml (and maybe to other places, I haven't looked for them very
thoroughly).


4) The last block in DCH_from_char() does this

    while (*s == ' ')
        s++;

but I suppose it should use isspace() just like the code immediately
before it.


5) I might be missing something, but why is there the "D" at the end of
the return flags from DCH_from_char?

    /* Return flags for DCH_from_char() */
    #define DCH_DATED    0x01
    #define DCH_TIMED    0x02
    #define DCH_ZONED    0x04


0002
----

1) There are some unnecessary changes to to_datetime signature (date_txt
renamed to vs. datetxt), which is mostly minor but unnecessary churn.


2) There are some extra changes to to_datetime (extra parameters, etc.).
I wonder why those are not included in 0001, as part of the supporting
datetime infrastructure.


3) I'm not sure whether the _safe functions are a good or bad idea, but
at the very least the comments should be updated to explain what it does
(as the API has changed, obviously).


4) the json.c changes are under-documented, e.g. there are no comments
for lex_peek_value, JsonEncodeDateTime doesn't say what tzp is for and
whether it needs to be specified all the time, half of the functions at
the end don't have comments (some of them are really simple, but then
other simple functions do have comments).

I don't know what the right balance here is (I'm certainly not asking
for bogus comments just to have comments) and I agree that the existing
code is not exactly abundant in comments. But perhaps having at least
some would be nice.

The same thing applies to jsonpath.c and jsonpath_exec.c, I think. There
are pretty much no comments whatsoever (at least at the function level,
explaining what the functions do). It would be good to have a file-level
comment too, explaining how jsonpath works, probably.


5) I see uniqueifyJsonbObject now does this:

    if (!object->val.object.uniquified)
        return;

That seems somewhat strange, considering the function says it'll
uniqueify objects, but then exits when noticing the passed object is not
uniquified?


6) Why do we make make_result inline? (numeric.c) If this needs to be
marked with "inline" then perhaps all the _internal functions should be
marked like that too? I have my doubts about the need for this.


7) The other places add _safe to functions that don't throw errors
directly and instead update edata. Why are set_var_from_str, div_var,
mod_var excluded from this convention?


8) I wonder if the changes in numeric can have negative impact on
performance. Has anyone done any performance tests of this part?


0003
----

1) json_gin.c should probably add comments briefly explaining what
JsonPathNode, GinEntries, ExtractedPathEntry, ExtractedJsonPath and
JsonPathExtractionContext are for.


2) I find it a bit suspicious that there are no asserts in json_gin.c
(well, there are 3 in the existing code, but nothing in the new code,
and the patch essentially doubles the amount of code here).


No comments for 0004 at this point.


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: jsonpath

From
Alexander Korotkov
Date:
Hi!

On Sat, Nov 24, 2018 at 11:03 PM Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
> I've done another round of reviews on v20, assuming the patch is almost
> ready to commit, but unfortunately I ran into a bunch of issues that
> need to be resolved. None of this is a huge issue, but it's also above
> the threshold of what could be tweaked by a committer IMHO.
>
> (Which brings the question who plans to commit this. The patch does not
> have a committer in the CF app, but I see both Teodor and Alexander are
> listed as it's authors, so I'd expect it to be one of those. Or I might
> do that, of course.)

Thanks a lot for your efforts on this patch!  I was planning to work
on this patch during this commitfest (and probably commit).  But I
didn't manage to do this due to family circumstances (I got my baby
born and hardly could allocate time to work in November).  But I'm
still planning to commit this patch.  So, I'm going to set myself as
committer in commitfest application.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: jsonpath

From
Dmitry Dolgov
Date:
> On Wed, Nov 28, 2018 at 9:26 PM Alexander Korotkov <a.korotkov@postgrespro.ru> wrote:
>
> On Sat, Nov 24, 2018 at 11:03 PM Tomas Vondra
> <tomas.vondra@2ndquadrant.com> wrote:
> > I've done another round of reviews on v20, assuming the patch is almost
> > ready to commit, but unfortunately I ran into a bunch of issues that
> > need to be resolved. None of this is a huge issue, but it's also above
> > the threshold of what could be tweaked by a committer IMHO.
> >
> > (Which brings the question who plans to commit this. The patch does not
> > have a committer in the CF app, but I see both Teodor and Alexander are
> > listed as it's authors, so I'd expect it to be one of those. Or I might
> > do that, of course.)
>
> Thanks a lot for your efforts on this patch!  I was planning to work
> on this patch during this commitfest (and probably commit).  But I
> didn't manage to do this due to family circumstances (I got my baby
> born and hardly could allocate time to work in November).  But I'm
> still planning to commit this patch.  So, I'm going to set myself as
> committer in commitfest application.

Great news, thank you! Just for the information, cfbot doesn't like the patch
on windows:

src/backend/utils/adt/jsonpath_exec.c(1539): error C2146: syntax error
: missing ')' before identifier 'id'
[C:\projects\postgresql\postgres.vcxproj]
src/backend/utils/adt/jsonpath_exec.c(1539): error C2081: 'int32_t' :
name in formal parameter list illegal
[C:\projects\postgresql\postgres.vcxproj]
src/backend/utils/adt/jsonpath_exec.c(1539): error C2061: syntax error
: identifier 'id' [C:\projects\postgresql\postgres.vcxproj]
src/backend/utils/adt/jsonpath_exec.c(1539): error C2059: syntax error
: ';' [C:\projects\postgresql\postgres.vcxproj]
src/backend/utils/adt/jsonpath_exec.c(1539): error C2059: syntax error
: ')' [C:\projects\postgresql\postgres.vcxproj]
src/backend/utils/adt/jsonpath_exec.c(1540): error C2449: found '{' at
file scope (missing function header?)
[C:\projects\postgresql\postgres.vcxproj]
src/backend/utils/adt/jsonpath_exec.c(1548): error C2059: syntax error
: '}' [C:\projects\postgresql\postgres.vcxproj]
c:\projects\postgresql\src\backend\utils\adt\jsonpath_exec.c(1539):
error C2146: syntax error : missing ')' before identifier 'id'
(src/backend/utils/adt/jsonpath_json.c)
[C:\projects\postgresql\postgres.vcxproj]
c:\projects\postgresql\src\backend\utils\adt\jsonpath_exec.c(1539):
error C2081: 'int32_t' : name in formal parameter list illegal
(src/backend/utils/adt/jsonpath_json.c)
[C:\projects\postgresql\postgres.vcxproj]
c:\projects\postgresql\src\backend\utils\adt\jsonpath_exec.c(1539):
error C2061: syntax error : identifier 'id'
(src/backend/utils/adt/jsonpath_json.c)
[C:\projects\postgresql\postgres.vcxproj]
c:\projects\postgresql\src\backend\utils\adt\jsonpath_exec.c(1539):
error C2059: syntax error : ';'
(src/backend/utils/adt/jsonpath_json.c)
[C:\projects\postgresql\postgres.vcxproj]
c:\projects\postgresql\src\backend\utils\adt\jsonpath_exec.c(1539):
error C2059: syntax error : ')'
(src/backend/utils/adt/jsonpath_json.c)
[C:\projects\postgresql\postgres.vcxproj]
c:\projects\postgresql\src\backend\utils\adt\jsonpath_exec.c(1540):
error C2449: found '{' at file scope (missing function header?)
(src/backend/utils/adt/jsonpath_json.c)
[C:\projects\postgresql\postgres.vcxproj]
c:\projects\postgresql\src\backend\utils\adt\jsonpath_exec.c(1548):
error C2059: syntax error : '}'
(src/backend/utils/adt/jsonpath_json.c)
[C:\projects\postgresql\postgres.vcxproj]
10 Warning(s)
14 Error(s)


Re: jsonpath

From
Nikita Glukhov
Date:
Attached 21-st version of the patches rebased onto the current master.

Added new patch 0004 with documentation written by Liudmila Mantrova, and
optional patch 0003 that was separated from 0002.


On 24.11.2018 23:03, Tomas Vondra wrote:

Hi,

I've done another round of reviews on v20, assuming the patch is almost
ready to commit, but unfortunately I ran into a bunch of issues that
need to be resolved. None of this is a huge issue, but it's also above
the threshold of what could be tweaked by a committer IMHO.
Thank your for your review.
(Which brings the question who plans to commit this. The patch does not
have a committer in the CF app, but I see both Teodor and Alexander are
listed as it's authors, so I'd expect it to be one of those. Or I might
do that, of course.)
0001
----

1) to_timestamp() does this:
 do_to_timestamp(date_txt, VARDATA(fmt), VARSIZE_ANY_EXHDR(fmt), false,                 &tm, &fsec, &fprec, NULL);

Shouldn't it really do VARDATA_ANY() instead of VARDATA()? It's what the
function did before (well, it called text_to_cstring, but that does
VARDATA_ANY). The same thing applies to to_date(), BTW.

I also find it a bit inconvenient that we expand the fmt like this in
all do_to_timestamp() calls, although it's just to_datetime() that needs
to do it this way. I realize we can't change to_datetime() because it's
external API, but maybe we should make it construct a varlena and pass
it to do_to_timestamp().
Of course, there must be VARDATA_ANY() instead of of VARDATA().  But I decided
to construct varlena and pass it to do_to_timestamp() and to to_datetime().

2) We define both DCH_FF# and DCH_ff#, but we never ever use the
lower-case version. Heck, it's not mentioned even in DCH_keywords, which
does this:
 ... {"FF1", 3, DCH_FF1, false, FROM_CHAR_DATE_NONE},  /* F */ ... {"ff1", 3, DCH_FF1, false, FROM_CHAR_DATE_NONE},  /* F */ ...

Compare that to DCH_DAY, DCH_Day and DCH_day, mapped to "DAY", "Day" and
"day". 
Yes, "ff#" are mapped to DCH_FF# like "mi" is mapped DCH_MI.

"Day", "day" are not mapped to DCH_DAY because they determine letter case in the
output, but "ff1" and "FF#" output contains only digits.

Also, the comment for "ff1" is wrong, it should be "f" I guess.
Fixed.
And of course, there are regression tests for FF# variants, but not the
lower-case ones.

Tests were added

3) Of course, these new formatting patterns should be added to SGML
docs, for example to "Template Patterns for Date/Time Formatting" in
func.sgml (and maybe to other places, I haven't looked for them very
thoroughly).

Fixed.

4) The last block in DCH_from_char() does this
   while (*s == ' ')       s++;

but I suppose it should use isspace() just like the code immediately
before it.

Fixed.

5) I might be missing something, but why is there the "D" at the end of
the return flags from DCH_from_char?
   /* Return flags for DCH_from_char() */   #define DCH_DATED    0x01   #define DCH_TIMED    0x02   #define DCH_ZONED    0x04
These terms "dated", "timed", and "zoned" are taken from the standard:

9.39.10) If <JSON datetime template> JDT is specified, ... : a) If JDT contains <datetime template year>, ... then JDT is dated. b) If JDT contains <datetime template 12-hour>, ... then JDT is timed. c) If JDT contains <datetime template time zone hour>, ... then JDT is zoned.


0002
----

1) There are some unnecessary changes to to_datetime signature (date_txt
renamed to vs. datetxt), which is mostly minor but unnecessary churn.

2) There are some extra changes to to_datetime (extra parameters, etc.).
I wonder why those are not included in 0001, as part of the supporting
datetime infrastructure.
Extra changes to to_datetime() were moved into patch 0001.

3) I'm not sure whether the _safe functions are a good or bad idea, but
at the very least the comments should be updated to explain what it does
(as the API has changed, obviously).
 
This functions were introduced for elimination of PG_TRY/PG_CATCH in jsonpath
code.  But this error handling approach has a lot of issues (see below), so I
decided separate again them into optional patch 0004.

4) the json.c changes are under-documented, e.g. there are no comments
for lex_peek_value, JsonEncodeDateTime doesn't say what tzp is for and
whether it needs to be specified all the time, half of the functions at
the end don't have comments (some of them are really simple, but then
other simple functions do have comments).

I don't know what the right balance here is (I'm certainly not asking
for bogus comments just to have comments) and I agree that the existing
code is not exactly abundant in comments. But perhaps having at least
some would be nice.
Some new comments were added to json.c.
The same thing applies to jsonpath.c and jsonpath_exec.c, I think. There
are pretty much no comments whatsoever (at least at the function level,
explaining what the functions do). It would be good to have a file-level
comment too, explaining how jsonpath works, probably.

I hope to add detailed comments for jsonpath in the next version of the patches.

5) I see uniqueifyJsonbObject now does this:
if (!object->val.object.uniquified)	return;

That seems somewhat strange, considering the function says it'll
uniqueify objects, but then exits when noticing the passed object is not
uniquified?

'uniquified' field was rename to 'uniquify'.

6) Why do we make make_result inline? (numeric.c) If this needs to be
marked with "inline" then perhaps all the _internal functions should be
marked like that too? I have my doubts about the need for this.


7) The other places add _safe to functions that don't throw errors
directly and instead update edata. Why are set_var_from_str, div_var,
mod_var excluded from this convention?
I added "_safe" postfix only to functions having the both variants: safe (error
info is placed into *edata) and unsafe (errors are always thrown).

One-line make_result() is called from many unsafe places, so I decided to leave
it as it was.  It can also be defined as a simple macro.

New "_internal" functions are called from jsonpath_exec.c, so they can't be
"static inline", but "extern inline" seems to be non-portable.

8) I wonder if the changes in numeric can have negative impact on
performance. Has anyone done any performance tests of this part?
I've tried to measure this impact by evaluation of expression with dozens of '+'
or even by adding a loop into numeric_add(), but failed to get any measurable
difference.

0003
----

1) jsonb_gin.c should probably add comments briefly explaining what
JsonPathNode, GinEntries, ExtractedPathEntry, ExtractedJsonPath and
JsonPathExtractionContext are for.
A lot of comments were added. Also major refactoring of jsonb_gin.c was done.

2) I find it a bit suspicious that there are no asserts in json_gin.c
(well, there are 3 in the existing code, but nothing in the new code,
and the patch essentially doubles the amount of code here).
I have managed to find only 4 similar places suitable for asserts.

No comments for 0004 at this point.

regards

--
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Attachment

Re: jsonpath

From
Alexander Korotkov
Date:
On Tue, Dec 4, 2018 at 2:23 AM Nikita Glukhov <n.gluhov@postgrespro.ru> wrote:
> 2) We define both DCH_FF# and DCH_ff#, but we never ever use the
> lower-case version. Heck, it's not mentioned even in DCH_keywords, which
> does this:
>
>   ...
>   {"FF1", 3, DCH_FF1, false, FROM_CHAR_DATE_NONE},  /* F */
>   ...
>   {"ff1", 3, DCH_FF1, false, FROM_CHAR_DATE_NONE},  /* F */
>   ...
>
> Compare that to DCH_DAY, DCH_Day and DCH_day, mapped to "DAY", "Day" and
> "day".
>
> Yes, "ff#" are mapped to DCH_FF# like "mi" is mapped DCH_MI.
>
> "Day", "day" are not mapped to DCH_DAY because they determine letter case in the
> output, but "ff1" and "FF#" output contains only digits.

Right, DCH_poz is also offset in DCH_keywords array.  So, if array has
an entry for "ff1" then enum should have a DCH_ff1 member in the same
position.

I got some other questions regarding this patchset.

1) Why do we parse FF7-FF9 if we're not supporting them anyway?
Without defining FF7-FF9 we can also don't throw errors for them
everywhere.  That would save us some code lines.

2) + DCH_to_char_fsec("%01d", in->fsec / INT64CONST(100000));
Why do we use INT64CONST() here and in the similar places assuming
that fsec is only uint32?

3) wrapItem() is unused in
0002-Jsonpath-engine-and-operators-v21.patch, but used in
0006-Jsonpath-syntax-extensions-v21.patch.  Please, move it to
0006-Jsonpath-syntax-extensions-v21.patch?

4) I also got these couple of warning during compilation.

jsonpath_exec.c:1485:1: warning: unused function
'recursiveExecuteNested' [-Wunused-function]
recursiveExecuteNested(JsonPathExecContext *cxt, JsonPathItem *jsp,
^
1 warning generated.
jsonpath_scan.l:444:6: warning: implicit declaration of function
'jsonpath_yyparse' is invalid in C99 [-Wimplicit-function-declaration]
        if (jsonpath_yyparse((void*)&parseresult) != 0)
            ^
1 warning generated.

Perhaps recursiveExecuteNested() is unsed in this patchset.  It's
probably used by some subsequent SQL/JSON-related patchset.  So,
please, move it there.

5) I think each usage of PG_TRY()/PG_CATCH() deserves comment
describing why it's safe to use without subtransaction.  In fact we
should just state that no function called inside performs data
modification.

------
Alexander Korotkov

Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: jsonpath

From
Tomas Vondra
Date:
On 1/5/19 1:11 AM, Alexander Korotkov wrote:
> On Tue, Dec 4, 2018 at 2:23 AM Nikita Glukhov <n.gluhov@postgrespro.ru> wrote:
>> 2) We define both DCH_FF# and DCH_ff#, but we never ever use the
>> lower-case version. Heck, it's not mentioned even in DCH_keywords, which
>> does this:
>>
>>   ...
>>   {"FF1", 3, DCH_FF1, false, FROM_CHAR_DATE_NONE},  /* F */
>>   ...
>>   {"ff1", 3, DCH_FF1, false, FROM_CHAR_DATE_NONE},  /* F */
>>   ...
>>
>> Compare that to DCH_DAY, DCH_Day and DCH_day, mapped to "DAY", "Day" and
>> "day".
>>
>> Yes, "ff#" are mapped to DCH_FF# like "mi" is mapped DCH_MI.
>>
>> "Day", "day" are not mapped to DCH_DAY because they determine letter case in the
>> output, but "ff1" and "FF#" output contains only digits.
> 
> Right, DCH_poz is also offset in DCH_keywords array.  So, if array has
> an entry for "ff1" then enum should have a DCH_ff1 member in the same
> position.
> 

I guess my question is why we define DCH_ff# at all, when it's not
mapped in DCH_keywords? ISTM we could simply leave them out of all the
arrays, no? Of course, this is not specific to this patch, as it applies
to pre-existing items (like DCH_mi).

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: jsonpath

From
Alexander Korotkov
Date:
On Sat, Jan 5, 2019 at 5:21 PM Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
> On 1/5/19 1:11 AM, Alexander Korotkov wrote:
> > On Tue, Dec 4, 2018 at 2:23 AM Nikita Glukhov <n.gluhov@postgrespro.ru> wrote:
> >> 2) We define both DCH_FF# and DCH_ff#, but we never ever use the
> >> lower-case version. Heck, it's not mentioned even in DCH_keywords, which
> >> does this:
> >>
> >>   ...
> >>   {"FF1", 3, DCH_FF1, false, FROM_CHAR_DATE_NONE},  /* F */
> >>   ...
> >>   {"ff1", 3, DCH_FF1, false, FROM_CHAR_DATE_NONE},  /* F */
> >>   ...
> >>
> >> Compare that to DCH_DAY, DCH_Day and DCH_day, mapped to "DAY", "Day" and
> >> "day".
> >>
> >> Yes, "ff#" are mapped to DCH_FF# like "mi" is mapped DCH_MI.
> >>
> >> "Day", "day" are not mapped to DCH_DAY because they determine letter case in the
> >> output, but "ff1" and "FF#" output contains only digits.
> >
> > Right, DCH_poz is also offset in DCH_keywords array.  So, if array has
> > an entry for "ff1" then enum should have a DCH_ff1 member in the same
> > position.
> >
>
> I guess my question is why we define DCH_ff# at all, when it's not
> mapped in DCH_keywords? ISTM we could simply leave them out of all the
> arrays, no?

I think we still need separate array entries for "FF1" and "ff1".  At
least, as long as we don't allow "fF1" and "Ff1".  In order to get rid
of DCH_ff#, I think we should decouple DCH_keywords from array
indexes.

> Of course, this is not specific to this patch, as it applies
> to pre-existing items (like DCH_mi).

Right, that should be a subject of separate patch.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: jsonpath

From
Nikita Glukhov
Date:
Attached 23rd version of the patches.


On 05.01.2019 3:11, Alexander Korotkov wrote:
> On Tue, Dec 4, 2018 at 2:23 AM Nikita Glukhov <n.gluhov@postgrespro.ru> wrote:
>> 2) We define both DCH_FF# and DCH_ff#, but we never ever use the
>> lower-case version. Heck, it's not mentioned even in DCH_keywords, which
>> does this:
>>
>>    ...
>>    {"FF1", 3, DCH_FF1, false, FROM_CHAR_DATE_NONE},  /* F */
>>    ...
>>    {"ff1", 3, DCH_FF1, false, FROM_CHAR_DATE_NONE},  /* F */
>>    ...
>>
>> Compare that to DCH_DAY, DCH_Day and DCH_day, mapped to "DAY", "Day" and
>> "day".
>>
>> Yes, "ff#" are mapped to DCH_FF# like "mi" is mapped DCH_MI.
>>
>> "Day", "day" are not mapped to DCH_DAY because they determine letter case in the
>> output, but "ff1" and "FF#" output contains only digits.
> Right, DCH_poz is also offset in DCH_keywords array.  So, if array has
> an entry for "ff1" then enum should have a DCH_ff1 member in the same
> position.
>
> I got some other questions regarding this patchset.
>
> 1) Why do we parse FF7-FF9 if we're not supporting them anyway?
> Without defining FF7-FF9 we can also don't throw errors for them
> everywhere.  That would save us some code lines.
FF7-FF9 were removed.
> 2) + DCH_to_char_fsec("%01d", in->fsec / INT64CONST(100000));
> Why do we use INT64CONST() here and in the similar places assuming
> that fsec is only uint32?
Fixed.

> 3) wrapItem() is unused in
> 0002-Jsonpath-engine-and-operators-v21.patch, but used in
> 0006-Jsonpath-syntax-extensions-v21.patch.  Please, move it to
> 0006-Jsonpath-syntax-extensions-v21.patch?
wraptItem() was moved into patch #6.

> 4) I also got these couple of warning during compilation.
>
> jsonpath_exec.c:1485:1: warning: unused function
> 'recursiveExecuteNested' [-Wunused-function]
> recursiveExecuteNested(JsonPathExecContext *cxt, JsonPathItem *jsp,
> ^
> 1 warning generated.
> jsonpath_scan.l:444:6: warning: implicit declaration of function
> 'jsonpath_yyparse' is invalid in C99 [-Wimplicit-function-declaration]
>          if (jsonpath_yyparse((void*)&parseresult) != 0)
>              ^
> 1 warning generated.
>
> Perhaps recursiveExecuteNested() is unsed in this patchset.  It's
> probably used by some subsequent SQL/JSON-related patchset.  So,
> please, move it there.

recursiveExecuteNested() was removed from this patch set.

Prototype for jsonpath_yyparse() should be in the Bison-generated file
src/include/adt/jsonpath_gram.h.

> 5) I think each usage of PG_TRY()/PG_CATCH() deserves comment
> describing why it's safe to use without subtransaction.  In fact we
> should just state that no function called inside performs data
> modification.
Comments were added.

-- 
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment

Re: jsonpath

From
Alexander Korotkov
Date:
Nikita, thank you!

I noticed another thing.  3-arguments version of functions
jsonpath_exists(), jsonpath_predicate(), jsonpath_query(),
jsonpath_query_wrapped() are:

1) Not documented
2) Not used in operator definitions
3) Not used in subsequent patches

So, it looks like we can just remove then.  But I think we're very
likely can commit jsonpath patch to PostgreSQL 12, but I'm not sure
about other patches.  So, having those functions exposed to user can
be extremely useful until we have separate nodes for SQL/JSON.  So, I
vote to document these functions.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: jsonpath

From
Tomas Vondra
Date:
On 1/18/19 6:58 PM, Alexander Korotkov wrote:
> Nikita, thank you!
> 
> I noticed another thing.  3-arguments version of functions
> jsonpath_exists(), jsonpath_predicate(), jsonpath_query(),
> jsonpath_query_wrapped() are:
> 
> 1) Not documented
> 2) Not used in operator definitions
> 3) Not used in subsequent patches
> 
> So, it looks like we can just remove then.  But I think we're very
> likely can commit jsonpath patch to PostgreSQL 12, but I'm not sure
> about other patches.  So, having those functions exposed to user can
> be extremely useful until we have separate nodes for SQL/JSON.  So, I
> vote to document these functions.
> 

That seems a bit strange. If those functions are meant to be used by
other patches (which ones?) then why should not we make them part of
those future patches?

But it seems more like those functions are actually meant to be used by
users in the first place, in cases when we need to provide a third
parameter (which operators can't do). In that case yes - we should have
them documented properly, but also tested. Which is not the case
currently, because the regression tests only use the operators.

Two more comments:

1) I'm wondering why we even need separate functions for the different
numbers of arguments at the C level, as both simply call to the same
function anyway with a PG_NARGS() condition in it. Can't we ditch those
wrappers and reference that target function directly?

2) I once again ran into the jsonb->json business, which essentially
means that when you do this:

    select json '{"a": { "b" : 10}}' @? '$.a.b';

it ends up calling jsonb_jsonpath_exists(), which then does this:

    Jsonb *jb = PG_GETARG_JSONB_P(0);

I and am not sure why/how it seems to work, but I find it confusing
because the stack still looks like this:

#0  jsonb_jsonpath_exists (fcinfo=0x162f800) at jsonpath_exec.c:2857
#1  0x000000000096d721 in json_jsonpath_exists2 (fcinfo=0x162f800) at
jsonpath_exec.c:2882
#2  0x00000000006c790a in ExecInterpExpr (state=0x162f300,
econtext=0x162ee18, isnull=0x7ffcea4c3857) at execExprInterp.c:648
...

and of course, the fcinfo->flinfo still points to the json-specific
function, which say the first parameter type is 'json'.

(gdb) print *fcinfo->flinfo
$23 = {fn_addr = 0x96d709 <json_jsonpath_exists2>, fn_oid = 6043,
fn_nargs = 2, fn_strict = true, fn_retset = false, fn_stats = 2 '\002',
fn_extra = 0x0, fn_mcxt = 0x162e990, fn_expr = 0x15db378}


test=# select proname, prosrc, proargtypes from pg_proc where oid = 6043;
     proname     |        prosrc         | proargtypes
-----------------+-----------------------+-------------
 jsonpath_exists | json_jsonpath_exists2 | 114 6050
(1 row)

test=# select typname from pg_type where oid = 114;
 typname
---------
 json
(1 row)


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: jsonpath

From
Alexander Korotkov
Date:
On Sat, Jan 19, 2019 at 1:32 AM Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
> On 1/18/19 6:58 PM, Alexander Korotkov wrote:
> > So, it looks like we can just remove then.  But I think we're very
> > likely can commit jsonpath patch to PostgreSQL 12, but I'm not sure
> > about other patches.  So, having those functions exposed to user can
> > be extremely useful until we have separate nodes for SQL/JSON.  So, I
> > vote to document these functions.
>
> That seems a bit strange. If those functions are meant to be used by
> other patches (which ones?) then why should not we make them part of
> those future patches?

No, these functions aren't used by other patches (it was my original
wrong idea).  Other patches provides SQL expressions making these
functions not that necessary.  But I think we should keep these
functions in jsonpath patch.

> But it seems more like those functions are actually meant to be used by
> users in the first place, in cases when we need to provide a third
> parameter (which operators can't do). In that case yes - we should have
> them documented properly, but also tested. Which is not the case
> currently, because the regression tests only use the operators.

+1
Thank you for noticing.  Tests should be provided as well.

> Two more comments:
>
> 1) I'm wondering why we even need separate functions for the different
> numbers of arguments at the C level, as both simply call to the same
> function anyway with a PG_NARGS() condition in it. Can't we ditch those
> wrappers and reference that target function directly?

That was surprising for me too.  Technically, it's OK to do this.  And
we do this for extensions.  But for in-core functions we have
following sanity check.

-- Considering only built-in procs (prolang = 12), look for multiple uses
-- of the same internal function (ie, matching prosrc fields).  It's OK to
-- have several entries with different pronames for the same internal function,
-- but conflicts in the number of arguments and other critical items should
-- be complained of.  (We don't check data types here; see next query.)
-- Note: ignore aggregate functions here, since they all point to the same
-- dummy built-in function.

SELECT p1.oid, p1.proname, p2.oid, p2.proname
FROM pg_proc AS p1, pg_proc AS p2
WHERE p1.oid < p2.oid AND
    p1.prosrc = p2.prosrc AND
    p1.prolang = 12 AND p2.prolang = 12 AND
    (p1.prokind != 'a' OR p2.prokind != 'a') AND
    (p1.prolang != p2.prolang OR
     p1.prokind != p2.prokind OR
     p1.prosecdef != p2.prosecdef OR
     p1.proleakproof != p2.proleakproof OR
     p1.proisstrict != p2.proisstrict OR
     p1.proretset != p2.proretset OR
     p1.provolatile != p2.provolatile OR
     p1.pronargs != p2.pronargs);

And we already have some code written especially to make this check happy.

/* This is separate to keep the opr_sanity regression test from complaining */
Datum
regexp_split_to_table_no_flags(PG_FUNCTION_ARGS)
{
    return regexp_split_to_table(fcinfo);
}

Personally I'm not fan of this approach, and I would rather relax this
sanity check.  But that doesn't seem to be a subject of this patch.
For jsonpath, it's OK to just keep this tradition.

> 2) I once again ran into the jsonb->json business, which essentially
> means that when you do this:
>
>     select json '{"a": { "b" : 10}}' @? '$.a.b';
>
> it ends up calling jsonb_jsonpath_exists(), which then does this:
>
>     Jsonb *jb = PG_GETARG_JSONB_P(0);
>
> I and am not sure why/how it seems to work, but I find it confusing
> because the stack still looks like this:
>
> #0  jsonb_jsonpath_exists (fcinfo=0x162f800) at jsonpath_exec.c:2857
> #1  0x000000000096d721 in json_jsonpath_exists2 (fcinfo=0x162f800) at
> jsonpath_exec.c:2882
> #2  0x00000000006c790a in ExecInterpExpr (state=0x162f300,
> econtext=0x162ee18, isnull=0x7ffcea4c3857) at execExprInterp.c:648
> ...
>
> and of course, the fcinfo->flinfo still points to the json-specific
> function, which say the first parameter type is 'json'.
>
> (gdb) print *fcinfo->flinfo
> $23 = {fn_addr = 0x96d709 <json_jsonpath_exists2>, fn_oid = 6043,
> fn_nargs = 2, fn_strict = true, fn_retset = false, fn_stats = 2 '\002',
> fn_extra = 0x0, fn_mcxt = 0x162e990, fn_expr = 0x15db378}
>
>
> test=# select proname, prosrc, proargtypes from pg_proc where oid = 6043;
>      proname     |        prosrc         | proargtypes
> -----------------+-----------------------+-------------
>  jsonpath_exists | json_jsonpath_exists2 | 114 6050
> (1 row)
>
> test=# select typname from pg_type where oid = 114;
>  typname
> ---------
>  json
> (1 row)

It's tricky.  There is jsonpath_json.c, which includes jsonpath_exec.c
with set of macro definitions making that code work with json type.
I'm going to refactor that.  But general approach to include same
functions more than once seems OK for me.

Some more notes:

1) It seems that @* and @# are not going to be supported by any
indexes.  I think we should remove these operators and let users use
functions instead.
2) I propose to rename @~ operator to @@.  We already use @@ as
"satisfies" in multiple places, and I thinks this case fits too.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: jsonpath

From
Alexander Korotkov
Date:
On Sat, Jan 19, 2019 at 2:54 AM Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:
> 1) It seems that @* and @# are not going to be supported by any
> indexes.  I think we should remove these operators and let users use
> functions instead.
> 2) I propose to rename @~ operator to @@.  We already use @@ as
> "satisfies" in multiple places, and I thinks this case fits too.

3) How do we calculate the "id" property returned by keyvalue()
function?  It's not documented.  Even presence of "id" columns isn't
documented.  Standard stands that it's implementation-depended
indetifier of object holding key-value pair.  The way of its
calculation is also not clear from the code.  Why do we need constant
of 10000000000?

                id = jb->type != jbvBinary ? 0 :
                    (int64)((char *) jb->val.binary.data -
                            (char *) cxt->baseObject.jbc);
                id += (int64) cxt->baseObject.id * INT64CONST(10000000000);

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: jsonpath

From
Alexander Korotkov
Date:
On Sun, Jan 20, 2019 at 2:45 AM Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:
> 3) How do we calculate the "id" property returned by keyvalue()
> function?  It's not documented.  Even presence of "id" columns isn't
> documented.  Standard stands that it's implementation-depended
> indetifier of object holding key-value pair.  The way of its
> calculation is also not clear from the code.  Why do we need constant
> of 10000000000?
>
>                 id = jb->type != jbvBinary ? 0 :
>                     (int64)((char *) jb->val.binary.data -
>                             (char *) cxt->baseObject.jbc);
>                 id += (int64) cxt->baseObject.id * INT64CONST(10000000000);

I've revising patchset bringing it to commitable shape.  The
intermediate result is attached.

0001-Preliminary-datetime-infrastructure-v24.patch

* Commit message & minor cleanup

0002-Jsonpath-engine-and-docs-v24.patch

 * Support of json is removed.  Current implementation is tricky and
cumbersome.  We need to design a suitable abstraction layer for that.
It should be done by separate patch applying not only to jsonpath, but
also to other jsonb functions lacking of json analogues.
* jsonpath_exists(jsonb, jsonpath[, jsonb]), jsonpath_predicate(jsonb,
jsonpath[, jsonb]), jsonpath_query(jsonb, jsonpath[, jsonb]),
jsonpath_wrapped(jsonb, jsonpath[, jsonb]) are documented.
* @* and @# operators are removed, @~ operator is renamed to @@.
* Commit message & minor cleanup.

I'll continue revising this patchset.  Nikita, could you please write
tests for 3-argument versions of functions?  Also, please answer the
question regarding "id" property.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment

Re: jsonpath

From
Alexander Korotkov
Date:
On Sun, Jan 20, 2019 at 6:30 AM Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:
> I'll continue revising this patchset.  Nikita, could you please write
> tests for 3-argument versions of functions?  Also, please answer the
> question regarding "id" property.

I've some more notes regarding function set provided in jsonpath patch:
1) Function names don't follow the convention we have.  All our
functions dealing with jsonb have "jsonb_" prefix.  Exclusions have
"jsonb" in other part of name, for example, "to_jsonb(anyelement)".
We could name functions at SQL level in the same way they are named in
C.  So, they would be jsonb_jsonpath_exists() etc.  But it's probably
too long.  What about jsonb_path_exists() etc?
2) jsonpath_query_wrapped() name seems counter intuitive for me.  What
about jsonpath_query_array()?
3) jsonpath_query_wrapped() behavior looks tricky to me.  It returns
NULL for no results.  When result item is one, it is returned "as is".
When there are two or more result items, they are wrapped into array.
This representation of result seems extremely inconvenient for me.  It
becomes hard to solve even trivial tasks with that: for instance,
iterate all items found.  Without extra assumptions on what is going
to be returned it's also impossible for caller to distinguish single
array item found from multiple items found wrapped into array.  And
that seems very bad.  I know this behavior is inspired by SQL/JSON
standard.  But since these functions are anyway our extension, we can
implement them as we like.  So, I propose to make this function always
return array of items regardless on count of those items (even for 1
or 0 items).
4) If we change behavior of jsonpath_query_wrapped() as above, we can
also implement jsonpath_query_one() function, which would return first
result item or NULL for no items.
Any thoughts?

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: jsonpath

From
Nikita Glukhov
Date:

On 20.01.2019 2:45, Alexander Korotkov wrote:

3) How do we calculate the "id" property returned by keyvalue()
function?  It's not documented.  Even presence of "id" columns isn't
documented.  Standard stands that it's implementation-depended
indetifier of object holding key-value pair.  The way of its
calculation is also not clear from the code.  Why do we need constant
of 10000000000?
               id = jb->type != jbvBinary ? 0 :                   (int64)((char *) jb->val.binary.data -                           (char *) cxt->baseObject.jbc);               id += (int64) cxt->baseObject.id * INT64CONST(10000000000);
I decided to construct object id from the two parts: base object id and its
binary offset in its base object's jsonb:

object_id = 10000000000 * base_object_id + object_offset_in_base_object


10000000000 (10^10) -- is a first round decimal number greater than 2^32
(maximal offset in jsonb).  Decimal multiplier is used here to improve the
readability of identifiers.


Base object is usually a root object of the path: context item '$' or path
variable '$var', literals can't produce objects for now.  But if the path
contains generated objects (.keyvalue() itself, for example), then they become
base object for the subsequent .keyvalue().  See example:

'$.a.b.keyvalue().value.keyvalue()' :- base for the first .keyvalue() is '$'- base for the second .keyvalue() is '$.a.b.keyvalue()'


Id of '$' is 0.
Id of '$var' is its ordinal (positive) number in the list of variables.
Ids for generated objects are assigned using global counter
'JsonPathExecContext.generatedObjectId' starting from 'number_of_vars + 1'.


Corresponding comments will be added in the upcoming version of the patches.

-- 
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Re: jsonpath

From
Oleg Bartunov
Date:
On Mon, Jan 21, 2019 at 1:40 AM Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:
>
> On Sun, Jan 20, 2019 at 6:30 AM Alexander Korotkov
> <a.korotkov@postgrespro.ru> wrote:
> > I'll continue revising this patchset.  Nikita, could you please write
> > tests for 3-argument versions of functions?  Also, please answer the
> > question regarding "id" property.
>
> I've some more notes regarding function set provided in jsonpath patch:
> 1) Function names don't follow the convention we have.  All our
> functions dealing with jsonb have "jsonb_" prefix.  Exclusions have
> "jsonb" in other part of name, for example, "to_jsonb(anyelement)".
> We could name functions at SQL level in the same way they are named in
> C.  So, they would be jsonb_jsonpath_exists() etc.  But it's probably
> too long.  What about jsonb_path_exists() etc?

jsonpath_exists is similar to xpath_exists.
Actually, we could use jsonb_exists(jsonb, jsonpath, json), but then
we should specify the type of the second argument.

> 2) jsonpath_query_wrapped() name seems counter intuitive for me.  What
> about jsonpath_query_array()?

The question is should we try to provide a functional interface for
all options of
JSON_QUERY clause ?  The same is for  other SQL/JSON clauses.
Currently,  patch contains very limited subset of JSON_QUERY
functionality,  mostly for jsonpath testing.

> 3) jsonpath_query_wrapped() behavior looks tricky to me.  It returns
> NULL for no results.  When result item is one, it is returned "as is".
> When there are two or more result items, they are wrapped into array.
> This representation of result seems extremely inconvenient for me.  It
> becomes hard to solve even trivial tasks with that: for instance,
> iterate all items found.  Without extra assumptions on what is going
> to be returned it's also impossible for caller to distinguish single
> array item found from multiple items found wrapped into array.  And
> that seems very bad.  I know this behavior is inspired by SQL/JSON
> standard.  But since these functions are anyway our extension, we can
> implement them as we like.  So, I propose to make this function always
> return array of items regardless on count of those items (even for 1
> or 0 items).

Fair enough, but if we agree, that we provide an exact functionality of
SQL clauses, then better to follow the standard to avoid problems.


> 4) If we change behavior of jsonpath_query_wrapped() as above, we can
> also implement jsonpath_query_one() function, which would return first
> result item or NULL for no items.
> Any thoughts?

I think, that we should postpone this functional interface, which could be
added later.

>
> ------
> Alexander Korotkov
> Postgres Professional: http://www.postgrespro.com
> The Russian Postgres Company



--
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: jsonpath

From
Alexander Korotkov
Date:
On Mon, Jan 21, 2019 at 6:05 PM Oleg Bartunov <obartunov@postgrespro.ru> wrote:
> On Mon, Jan 21, 2019 at 1:40 AM Alexander Korotkov
> <a.korotkov@postgrespro.ru> wrote:
> >
> > On Sun, Jan 20, 2019 at 6:30 AM Alexander Korotkov
> > <a.korotkov@postgrespro.ru> wrote:
> > > I'll continue revising this patchset.  Nikita, could you please write
> > > tests for 3-argument versions of functions?  Also, please answer the
> > > question regarding "id" property.
> >
> > I've some more notes regarding function set provided in jsonpath patch:
> > 1) Function names don't follow the convention we have.  All our
> > functions dealing with jsonb have "jsonb_" prefix.  Exclusions have
> > "jsonb" in other part of name, for example, "to_jsonb(anyelement)".
> > We could name functions at SQL level in the same way they are named in
> > C.  So, they would be jsonb_jsonpath_exists() etc.  But it's probably
> > too long.  What about jsonb_path_exists() etc?
>
> jsonpath_exists is similar to xpath_exists.

That's true.  The question is whether it's more important to follow
json[b] naming convention or xml/xpath naming convention?  I guess
json[b] naming convention is more important in our case.

> Actually, we could use jsonb_exists(jsonb, jsonpath, json), but then
> we should specify the type of the second argument.

Yes, but AFAICS the key point of json[b]_ prefixes is to evade
function overloading.  So, I'm -1 for use jsonb_ prefix and have
function overload because of that.

> > 2) jsonpath_query_wrapped() name seems counter intuitive for me.  What
> > about jsonpath_query_array()?
>
> The question is should we try to provide a functional interface for
> all options of
> JSON_QUERY clause ?  The same is for  other SQL/JSON clauses.
> Currently,  patch contains very limited subset of JSON_QUERY
> functionality,  mostly for jsonpath testing.

Actually, my point is following.  We have jsonpath patch close to the
committable shape.  And we have patch for SQL/JSON clauses including
JSON_QUERY, which is huge, complex and didn't receive any serious
review yet.  So, we'll did our best on that patch during this release
cycle, but I can't guarantee it will get in PostgreSQL 12.  Thus, my
idea is to make jsonpath patch self contained by providing brief and
convenient procedural interface.  This procedural interface is anyway
not a part of standard.  It *might* be inspired by standard clauses,
but might be not.  I think we should try to make this procedural
interface as good and convenient by itself.  It's our extension, and
it wouldn't make us more or less standard conforming.

> > 3) jsonpath_query_wrapped() behavior looks tricky to me.  It returns
> > NULL for no results.  When result item is one, it is returned "as is".
> > When there are two or more result items, they are wrapped into array.
> > This representation of result seems extremely inconvenient for me.  It
> > becomes hard to solve even trivial tasks with that: for instance,
> > iterate all items found.  Without extra assumptions on what is going
> > to be returned it's also impossible for caller to distinguish single
> > array item found from multiple items found wrapped into array.  And
> > that seems very bad.  I know this behavior is inspired by SQL/JSON
> > standard.  But since these functions are anyway our extension, we can
> > implement them as we like.  So, I propose to make this function always
> > return array of items regardless on count of those items (even for 1
> > or 0 items).
>
> Fair enough, but if we agree, that we provide an exact functionality of
> SQL clauses, then better to follow the standard to avoid problems.

No, I see this as our extension.  And I don't see problems in being
different from standard clauses, because it's different anyway.  For
me, in this case it's better to evade problems of users.  And current
behavior of this function seems like just single big pain :)

> > 4) If we change behavior of jsonpath_query_wrapped() as above, we can
> > also implement jsonpath_query_one() function, which would return first
> > result item or NULL for no items.
> > Any thoughts?
>
> I think, that we should postpone this functional interface, which could be
> added later.

The reason we typically postpone things is that they are hard to bring
to committable shape.  jsonpath_query_one() doesn't cost us any real
development.  So, I don't see point in postponing that if consider
that as good part of procedural jsonpath interface.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: jsonpath

From
Alexander Korotkov
Date:
The next revision is attached.

Nikita made code and documentation improvements, renamed functions
from "jsonpath_" prefix to "jsonb_path_" prefix.  He also renamed
jsonpath_predicate() to jsonb_path_match() (that looks better for me
too).

I've further renamed jsonb_query_wrapped() to jsonb_query_array(), and
changed that behavior to always wrap result into array.  Also, I've
introduced new function jsonb_query_first(), which returns first item
from result.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment

Re: jsonpath

From
Oleg Bartunov
Date:
On Tue, Jan 22, 2019 at 8:21 AM Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:
>
> The next revision is attached.
>
> Nikita made code and documentation improvements, renamed functions
> from "jsonpath_" prefix to "jsonb_path_" prefix.  He also renamed
> jsonpath_predicate() to jsonb_path_match() (that looks better for me
> too).
>
> I've further renamed jsonb_query_wrapped() to jsonb_query_array(), and
> changed that behavior to always wrap result into array.

agree with new names

so it mimic the behaviour of JSON_QUERY with UNCONDITIONAL WRAPPER option

 Also, I've
> introduced new function jsonb_query_first(), which returns first item
> from result.
>
> ------
> Alexander Korotkov
> Postgres Professional: http://www.postgrespro.com
> The Russian Postgres Company



-- 
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: jsonpath

From
Nikita Glukhov
Date:
Attached 26th version of the patches:
* Documentation:   - Fixed some mistakes   - Removed mention of syntax extensions not present in this patch   - Documented '.datetime(format, default_tz)'* Removed accidental syntax extension allowing to write '@.key' without '@'

-- 
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Attachment

Re: jsonpath

From
Alexander Korotkov
Date:
On Wed, Jan 23, 2019 at 3:40 AM Nikita Glukhov <n.gluhov@postgrespro.ru> wrote:
> Attached 26th version of the patches:
>
>  * Documentation:
>     - Fixed some mistakes
>     - Removed mention of syntax extensions not present in this patch
>     - Documented '.datetime(format, default_tz)'
>  * Removed accidental syntax extension allowing to write '@.key' without '@'

Thank you!

I've made few more changes to patchset:
 * Change back interface of JsonbExtractScalar().  I decide it would
be better to keep it.
 * Run pgindent
 * Improve documentation of .keyvalue() function.

Finally, I'm going to commit this if no objections.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment

Re: jsonpath

From
Alexander Korotkov
Date:
On Tue, Jan 22, 2019 at 12:13 PM Oleg Bartunov <obartunov@postgrespro.ru> wrote:
>
> On Tue, Jan 22, 2019 at 8:21 AM Alexander Korotkov
> <a.korotkov@postgrespro.ru> wrote:
> >
> > The next revision is attached.
> >
> > Nikita made code and documentation improvements, renamed functions
> > from "jsonpath_" prefix to "jsonb_path_" prefix.  He also renamed
> > jsonpath_predicate() to jsonb_path_match() (that looks better for me
> > too).
> >
> > I've further renamed jsonb_query_wrapped() to jsonb_query_array(), and
> > changed that behavior to always wrap result into array.
>
> agree with new names
>
> so it mimic the behaviour of JSON_QUERY with UNCONDITIONAL WRAPPER option

Good, thank you for feedback!

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: jsonpath

From
Nikita Glukhov
Date:
Attached 28th version of the patches.

On 23.01.2019 8:01, Alexander Korotkov wrote:

On Wed, Jan 23, 2019 at 3:40 AM Nikita Glukhov <n.gluhov@postgrespro.ru> wrote:
Attached 26th version of the patches:
* Documentation:   - Fixed some mistakes   - Removed mention of syntax extensions not present in this patch   - Documented '.datetime(format, default_tz)'* Removed accidental syntax extension allowing to write '@.key' without '@'
Thank you!

I've made few more changes to patchset:* Change back interface of JsonbExtractScalar().  I decide it would
be better to keep it.* Run pgindent
I have fixed indentation by adding new typedefs to 
src/tools/pgindent/typedefs.list.

Also I have renamed LikeRegexContext => JsonLikeRegexContext.

 * Improve documentation of .keyvalue() function.

Finally, I'm going to commit this if no objections.
Thank you!

--
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment

Re: jsonpath

From
Alexander Korotkov
Date:
On Wed, Jan 23, 2019 at 3:24 PM Nikita Glukhov <n.gluhov@postgrespro.ru> wrote:
> Attached 28th version of the patches.
>
> On 23.01.2019 8:01, Alexander Korotkov wrote:
>
> On Wed, Jan 23, 2019 at 3:40 AM Nikita Glukhov <n.gluhov@postgrespro.ru> wrote:
>
> Attached 26th version of the patches:
>
>  * Documentation:
>     - Fixed some mistakes
>     - Removed mention of syntax extensions not present in this patch
>     - Documented '.datetime(format, default_tz)'
>  * Removed accidental syntax extension allowing to write '@.key' without '@'
>
> Thank you!
>
> I've made few more changes to patchset:
>  * Change back interface of JsonbExtractScalar().  I decide it would
> be better to keep it.
>  * Run pgindent
>
> I have fixed indentation by adding new typedefs to
> src/tools/pgindent/typedefs.list.

Good catch, thank you!

> Also I have renamed LikeRegexContext => JsonLikeRegexContext.
>  * Improve documentation of .keyvalue() function.

Ok!

I've also fix bug with possible reference to already freed value of
jsonpath in jsonb_path_execute(), pointed by you.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment

Re: jsonpath

From
Alexander Korotkov
Date:
On Wed, Jan 23, 2019 at 8:01 AM Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:
> Finally, I'm going to commit this if no objections.

BTW, I decided to postpone commit for few days.  Nikita and me are
still working on better error messages.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: jsonpath

From
Alexander Korotkov
Date:
On Sat, Jan 26, 2019 at 4:27 AM Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:
>
> On Wed, Jan 23, 2019 at 8:01 AM Alexander Korotkov
> <a.korotkov@postgrespro.ru> wrote:
> > Finally, I'm going to commit this if no objections.
>
> BTW, I decided to postpone commit for few days.  Nikita and me are
> still working on better error messages.

Updated patchset is attached.  This patchset includes:

* Improved error handling by Nikita, revised by me,
* Code beautification.

So, I'm going to commit this again.  This time seriously :)

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment

Re: jsonpath

From
Alexander Korotkov
Date:
On Sun, Jan 27, 2019 at 1:50 PM Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:
> On Sat, Jan 26, 2019 at 4:27 AM Alexander Korotkov
> <a.korotkov@postgrespro.ru> wrote:
> >
> > On Wed, Jan 23, 2019 at 8:01 AM Alexander Korotkov
> > <a.korotkov@postgrespro.ru> wrote:
> > > Finally, I'm going to commit this if no objections.
> >
> > BTW, I decided to postpone commit for few days.  Nikita and me are
> > still working on better error messages.
>
> Updated patchset is attached.  This patchset includes:
>
> * Improved error handling by Nikita, revised by me,
> * Code beautification.
>
> So, I'm going to commit this again.  This time seriously :)

I'm really sorry for confusing people, but I've one more revision.
This is my first time attempting to commit such a large patch.

Major changes are following:
 * We find it ridiculous to save ErrorData for possible latter throw.
Now, we either throw an error immediately or return jperError.  That
also allows to get rid of unwanted changes in elog.c/elog.h.
 * I decided to change behavior of jsonb_path_match() to throw as less
errors as possible.  The reason is that it's used to implement
potentially (patch is pending) indexable operator.  Index scan is not
always capable to throw as many errors and sequential scan.  So, it's
better to not introduce extra possible index scan and sequential scan
results divergence.

So, this is version I'm going to commit unless Nikita has corrections
or anybody else objects.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment

Re: jsonpath

From
Tomas Vondra
Date:

On 1/28/19 1:22 AM, Alexander Korotkov wrote:
> On Sun, Jan 27, 2019 at 1:50 PM Alexander Korotkov
> <a.korotkov@postgrespro.ru> wrote:
>> On Sat, Jan 26, 2019 at 4:27 AM Alexander Korotkov
>> <a.korotkov@postgrespro.ru> wrote:
>>>
>>> On Wed, Jan 23, 2019 at 8:01 AM Alexander Korotkov
>>> <a.korotkov@postgrespro.ru> wrote:
>>>> Finally, I'm going to commit this if no objections.
>>>
>>> BTW, I decided to postpone commit for few days.  Nikita and me are
>>> still working on better error messages.
>>
>> Updated patchset is attached.  This patchset includes:
>>
>> * Improved error handling by Nikita, revised by me,
>> * Code beautification.
>>
>> So, I'm going to commit this again.  This time seriously :)
> 
> I'm really sorry for confusing people, but I've one more revision.
> This is my first time attempting to commit such a large patch.
> 

I'm not sure what you're apologizing for, it simply shows how careful
you are when polishing such a large patch.

> Major changes are following:
>  * We find it ridiculous to save ErrorData for possible latter throw.
> Now, we either throw an error immediately or return jperError.  That
> also allows to get rid of unwanted changes in elog.c/elog.h.

OK.

>  * I decided to change behavior of jsonb_path_match() to throw as less
> errors as possible.  The reason is that it's used to implement
> potentially (patch is pending) indexable operator.  Index scan is not
> always capable to throw as many errors and sequential scan.  So, it's
> better to not introduce extra possible index scan and sequential scan
> results divergence.
> 

Hmmm, can you elaborate a bit more? Which errors were thrown before and
are not thrown with the current patch version?

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: jsonpath

From
Nikita Glukhov
Date:

On 28.01.2019 16:50, Tomas Vondra wrote:

On 1/28/19 1:22 AM, Alexander Korotkov wrote:
 * I decided to change behavior of jsonb_path_match() to throw as less
errors as possible.  The reason is that it's used to implement
potentially (patch is pending) indexable operator.  Index scan is not
always capable to throw as many errors and sequential scan.  So, it's
better to not introduce extra possible index scan and sequential scan
results divergence.
Hmmm, can you elaborate a bit more? Which errors were thrown before and
are not thrown with the current patch version?
In the previous version of the patch jsonb_path_match() threw error when
jsonpath did not return a singleton value, but in the last version in such cases
NULL is returned.  This problem arises because we cannot guarantee at compile
time that jsonpath expression used in jsonb_path_match() is a predicate.
Predicates by standard can return only True, False, and Unknown (errors occurred
during execution of their operands are transformed into Unknown values), so 
predicates cannot throw errors, and there are no problems with errors. 

GIN does not attempt to search non-predicate expressions, so there may be no 
problem even we throw "not a singleton" error.


Here I want to remind that ability to use predicates in the root of jsonpath
expression is an our extension to standard that was created specially for the
operator @@.  By standard predicates are allowed only in filters.  Without this
extension we are still able to rewrite @@ using @?:
jsonb @@ 'predicate'        is equivalent to
jsonb @? '$ ? (predicate)'
but such @? expression is a bit slower to execute and a bit verbose to write.

If we introduced special type 'jsonpath_predicate', then we could solve the
problem by checking the type of jsonpath expression at compile-time.


Another problem with error handling is that jsonb_path_query*() functions
always throw SQL/JSON errors and there is no easy and effective way to emulate
NULL ON ERROR behavior, which is used by default in SQL/JSON functions.  So I
think it's worth trying to add some kind of flag 'throwErrors' to
jsonb_path_query*() functions.

-- 
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Re: jsonpath

From
Alexander Korotkov
Date:
On Tue, Jan 29, 2019 at 2:17 AM Nikita Glukhov <n.gluhov@postgrespro.ru> wrote:
> In the previous version of the patch jsonb_path_match() threw error when
> jsonpath did not return a singleton value, but in the last version in such cases
> NULL is returned.  This problem arises because we cannot guarantee at compile
> time that jsonpath expression used in jsonb_path_match() is a predicate.
> Predicates by standard can return only True, False, and Unknown (errors occurred
> during execution of their operands are transformed into Unknown values), so
> predicates cannot throw errors, and there are no problems with errors.

Attached patchset provides description of errors suppressed.  It also
clarifies how jsonb_path_match() works.

> GIN does not attempt to search non-predicate expressions, so there may be no
> problem even we throw "not a singleton" error.

Yes, I don't insist on that.  If majority of us wants to bring "not a
singleton" error back, I don't object to it.

> Here I want to remind that ability to use predicates in the root of jsonpath
> expression is an our extension to standard that was created specially for the
> operator @@.  By standard predicates are allowed only in filters.  Without this
> extension we are still able to rewrite @@ using @?:
> jsonb @@ 'predicate'        is equivalent to
> jsonb @? '$ ? (predicate)'
> but such @? expression is a bit slower to execute and a bit verbose to write.
>
> If we introduced special type 'jsonpath_predicate', then we could solve the
> problem by checking the type of jsonpath expression at compile-time.

For me it seems that separate datatype for this kind of problem is overkill.

> Another problem with error handling is that jsonb_path_query*() functions
> always throw SQL/JSON errors and there is no easy and effective way to emulate
> NULL ON ERROR behavior, which is used by default in SQL/JSON functions.  So I
> think it's worth trying to add some kind of flag 'throwErrors' to
> jsonb_path_query*() functions.

Good idea, but let's commit basic jsonpath implementation first.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment

Re: jsonpath

From
Andres Freund
Date:
Hi,

On 2019-01-29 04:00:19 +0300, Alexander Korotkov wrote:
> +    /*
> +     * It is safe to use here PG_TRY/PG_CATCH without subtransaction because
> +     * no function called inside performs data modification.
> +     */
> +    PG_TRY();
> +    {
> +        res = DirectFunctionCall2(func, ldatum, rdatum);
> +    }
> +    PG_CATCH();
> +    {
> +        int            errcode = geterrcode();
> +
> +        if (jspThrowErrors(cxt) ||
> +            ERRCODE_TO_CATEGORY(errcode) != ERRCODE_DATA_EXCEPTION)
> +            PG_RE_THROW();
> +
> +        MemoryContextSwitchTo(mcxt);
> +        FlushErrorState();
> +
> +        return jperError;
> +    }
> +    PG_END_TRY();

FWIW, I still think this is a terrible idea and shouldn't be merged this
way. The likelihood of introducing subtle bugs seems way too high - even
if it's possibly not buggy today, who says that it's not going to be in
the future?

Greetings,

Andres Freund


Re: jsonpath

From
Alexander Korotkov
Date:
On Tue, Jan 29, 2019 at 4:03 AM Andres Freund <andres@anarazel.de> wrote:
> On 2019-01-29 04:00:19 +0300, Alexander Korotkov wrote:
> > +     /*
> > +      * It is safe to use here PG_TRY/PG_CATCH without subtransaction because
> > +      * no function called inside performs data modification.
> > +      */
> > +     PG_TRY();
> > +     {
> > +             res = DirectFunctionCall2(func, ldatum, rdatum);
> > +     }
> > +     PG_CATCH();
> > +     {
> > +             int                     errcode = geterrcode();
> > +
> > +             if (jspThrowErrors(cxt) ||
> > +                     ERRCODE_TO_CATEGORY(errcode) != ERRCODE_DATA_EXCEPTION)
> > +                     PG_RE_THROW();
> > +
> > +             MemoryContextSwitchTo(mcxt);
> > +             FlushErrorState();
> > +
> > +             return jperError;
> > +     }
> > +     PG_END_TRY();
>
> FWIW, I still think this is a terrible idea and shouldn't be merged this
> way. The likelihood of introducing subtle bugs seems way too high - even
> if it's possibly not buggy today, who says that it's not going to be in
> the future?

I'm probably not yet understanding all the risks this code have.  So far I see:
1) One of functions called here performs database modification, while
it wasn't suppose to.  So, it becomes not safe to skip subtransaction.
2) ERRCODE_DATA_EXCEPTION was thrown for unexpected reason.  So, it
might appear that ERRCODE_DATA_EXCEPTION is not safe to ignore.
Could you complete this list?

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: jsonpath

From
Alexander Korotkov
Date:
On Tue, Jan 29, 2019 at 4:03 AM Andres Freund <andres@anarazel.de> wrote:
> On 2019-01-29 04:00:19 +0300, Alexander Korotkov wrote:
> > +     /*
> > +      * It is safe to use here PG_TRY/PG_CATCH without subtransaction because
> > +      * no function called inside performs data modification.
> > +      */
> > +     PG_TRY();
> > +     {
> > +             res = DirectFunctionCall2(func, ldatum, rdatum);
> > +     }
> > +     PG_CATCH();
> > +     {
> > +             int                     errcode = geterrcode();
> > +
> > +             if (jspThrowErrors(cxt) ||
> > +                     ERRCODE_TO_CATEGORY(errcode) != ERRCODE_DATA_EXCEPTION)
> > +                     PG_RE_THROW();
> > +
> > +             MemoryContextSwitchTo(mcxt);
> > +             FlushErrorState();
> > +
> > +             return jperError;
> > +     }
> > +     PG_END_TRY();

BTW, this code also looks... useless.  I can't see whole numeric.c
throwing ERRCODE_DATA_EXCEPTION.   Nikita, what do we mean here?

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: jsonpath

From
Andres Freund
Date:
On 2019-01-29 04:17:33 +0300, Alexander Korotkov wrote:
> On Tue, Jan 29, 2019 at 4:03 AM Andres Freund <andres@anarazel.de> wrote:
> > On 2019-01-29 04:00:19 +0300, Alexander Korotkov wrote:
> > > +     /*
> > > +      * It is safe to use here PG_TRY/PG_CATCH without subtransaction because
> > > +      * no function called inside performs data modification.
> > > +      */
> > > +     PG_TRY();
> > > +     {
> > > +             res = DirectFunctionCall2(func, ldatum, rdatum);
> > > +     }
> > > +     PG_CATCH();
> > > +     {
> > > +             int                     errcode = geterrcode();
> > > +
> > > +             if (jspThrowErrors(cxt) ||
> > > +                     ERRCODE_TO_CATEGORY(errcode) != ERRCODE_DATA_EXCEPTION)
> > > +                     PG_RE_THROW();
> > > +
> > > +             MemoryContextSwitchTo(mcxt);
> > > +             FlushErrorState();
> > > +
> > > +             return jperError;
> > > +     }
> > > +     PG_END_TRY();
> >
> > FWIW, I still think this is a terrible idea and shouldn't be merged this
> > way. The likelihood of introducing subtle bugs seems way too high - even
> > if it's possibly not buggy today, who says that it's not going to be in
> > the future?
> 
> I'm probably not yet understanding all the risks this code have.  So far I see:

I find these *more* than sufficient to not go to the PG_TRY/CATCH
approach.


> 1) One of functions called here performs database modification, while
> it wasn't suppose to.  So, it becomes not safe to skip subtransaction.

It's not just data modifications. Even just modifying some memory
structures that'd normally be invalidated by an xact abort's
invalidation processing isn't safe.


> 2) ERRCODE_DATA_EXCEPTION was thrown for unexpected reason.  So, it
> might appear that ERRCODE_DATA_EXCEPTION is not safe to ignore.

It'd e.g. not surprise me very much if some OOM would end up translating
to ERRCODE_DATA_EXCEPTION, because some library function returned an
error due to ENOMEM.


> Could you complete this list?

3) The expression changed the current expression context, GUCs or any
   other such global variable. Without a proper subtrans reset this
   state isn't reverted.
4) The function acquires an LWLOCK, buffer reference, anything resowner
   owned. Skipping subtrans reset, that's not released in that
   moment. That's going to lead to potential hard deadlocks.
99) sigsetjmp is actually pretty expensive.

Greetings,

Andres Freund


Re: jsonpath

From
Tom Lane
Date:
Alexander Korotkov <a.korotkov@postgrespro.ru> writes:
> On Tue, Jan 29, 2019 at 4:03 AM Andres Freund <andres@anarazel.de> wrote:
>> FWIW, I still think this is a terrible idea and shouldn't be merged this
>> way. The likelihood of introducing subtle bugs seems way too high - even
>> if it's possibly not buggy today, who says that it's not going to be in
>> the future?

> I'm probably not yet understanding all the risks this code have.  So far I see:
> 1) One of functions called here performs database modification, while
> it wasn't suppose to.  So, it becomes not safe to skip subtransaction.
> 2) ERRCODE_DATA_EXCEPTION was thrown for unexpected reason.  So, it
> might appear that ERRCODE_DATA_EXCEPTION is not safe to ignore.
> Could you complete this list?

Sure: every errcode we have is unsafe to treat this way.

The backend coding rule from day one has been that a thrown error requires
(sub)transaction cleanup to be done to make sure that things are back in a
good state.  You can *not* just decide that it's okay to ignore that,
especially not when invoking code outside the immediate area of what
you're doing.

As a counterexample, for any specific errcode you might claim is safe,
a plpgsql function might do something that requires cleanup (which is
not equal to "does data modification") and then throw that errcode.

You might argue that this code will only ever be used to call certain
functions in numeric.c and you've vetted every line of code that those
functions can reach ... but even to say that is to expose how fragile
the argument is.  Every time anybody touched numeric.c, or any code
reachable from there, we'd have to wonder whether we just introduced
a failure hazard for jsonpath.  That's not maintainable.  It's even
less maintainable if anybody decides they'd like to insert extension
hooks into any of that code.  I rather imagine that this could be
broken today by an ErrorContextCallback function, for example.
(That is, even today, "vetting every line" has to include vetting every
errcontext subroutine in the system, along with everything it can call.
Plus equivalent code in external PLs and so on.)

We've rejected code like this every time it was submitted to date,
and I don't see a reason to change that policy for jsonpath.

            regards, tom lane


Re: jsonpath

From
Alexander Korotkov
Date:
On Tue, Jan 29, 2019 at 4:30 AM Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:
> On Tue, Jan 29, 2019 at 4:03 AM Andres Freund <andres@anarazel.de> wrote:
> > On 2019-01-29 04:00:19 +0300, Alexander Korotkov wrote:
> > > +     /*
> > > +      * It is safe to use here PG_TRY/PG_CATCH without subtransaction because
> > > +      * no function called inside performs data modification.
> > > +      */
> > > +     PG_TRY();
> > > +     {
> > > +             res = DirectFunctionCall2(func, ldatum, rdatum);
> > > +     }
> > > +     PG_CATCH();
> > > +     {
> > > +             int                     errcode = geterrcode();
> > > +
> > > +             if (jspThrowErrors(cxt) ||
> > > +                     ERRCODE_TO_CATEGORY(errcode) != ERRCODE_DATA_EXCEPTION)
> > > +                     PG_RE_THROW();
> > > +
> > > +             MemoryContextSwitchTo(mcxt);
> > > +             FlushErrorState();
> > > +
> > > +             return jperError;
> > > +     }
> > > +     PG_END_TRY();
>
> BTW, this code also looks... useless.  I can't see whole numeric.c
> throwing ERRCODE_DATA_EXCEPTION.   Nikita, what do we mean here?

Oh, sorry.  I've missed we have ERRCODE_TO_CATEGORY() here.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: jsonpath

From
Alexander Korotkov
Date:
On Tue, Jan 29, 2019 at 4:44 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Alexander Korotkov <a.korotkov@postgrespro.ru> writes:
> > On Tue, Jan 29, 2019 at 4:03 AM Andres Freund <andres@anarazel.de> wrote:
> >> FWIW, I still think this is a terrible idea and shouldn't be merged this
> >> way. The likelihood of introducing subtle bugs seems way too high - even
> >> if it's possibly not buggy today, who says that it's not going to be in
> >> the future?
>
> > I'm probably not yet understanding all the risks this code have.  So far I see:
> > 1) One of functions called here performs database modification, while
> > it wasn't suppose to.  So, it becomes not safe to skip subtransaction.
> > 2) ERRCODE_DATA_EXCEPTION was thrown for unexpected reason.  So, it
> > might appear that ERRCODE_DATA_EXCEPTION is not safe to ignore.
> > Could you complete this list?
>
> Sure: every errcode we have is unsafe to treat this way.
>
> The backend coding rule from day one has been that a thrown error requires
> (sub)transaction cleanup to be done to make sure that things are back in a
> good state.  You can *not* just decide that it's okay to ignore that,
> especially not when invoking code outside the immediate area of what
> you're doing.
>
> As a counterexample, for any specific errcode you might claim is safe,
> a plpgsql function might do something that requires cleanup (which is
> not equal to "does data modification") and then throw that errcode.
>
> You might argue that this code will only ever be used to call certain
> functions in numeric.c and you've vetted every line of code that those
> functions can reach ... but even to say that is to expose how fragile
> the argument is.  Every time anybody touched numeric.c, or any code
> reachable from there, we'd have to wonder whether we just introduced
> a failure hazard for jsonpath.  That's not maintainable.  It's even
> less maintainable if anybody decides they'd like to insert extension
> hooks into any of that code.  I rather imagine that this could be
> broken today by an ErrorContextCallback function, for example.
> (That is, even today, "vetting every line" has to include vetting every
> errcontext subroutine in the system, along with everything it can call.
> Plus equivalent code in external PLs and so on.)
>
> We've rejected code like this every time it was submitted to date,
> and I don't see a reason to change that policy for jsonpath.

Tom, Andres thank you for the explanation.  More than clear for now.
Will search for workaround.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: jsonpath

From
Tom Lane
Date:
Alexander Korotkov <a.korotkov@postgrespro.ru> writes:
> +             if (jspThrowErrors(cxt) ||
> +                     ERRCODE_TO_CATEGORY(errcode) != ERRCODE_DATA_EXCEPTION)
> +                     PG_RE_THROW();

> BTW, this code also looks... useless.  I can't see whole numeric.c
> throwing ERRCODE_DATA_EXCEPTION.   Nikita, what do we mean here?

I think what this code is claiming is that any errcode in the 22xxx
category is safe to trap.  Presumably what it's expecting to see is
ERRCODE_DIVISION_BY_ZERO (22012) or another error that numeric.c
could throw ... but 22xxx covers a heck of a lot of ground.

            regards, tom lane


Re: jsonpath

From
Andres Freund
Date:
Hi,

On 2019-01-28 17:31:15 -0800, Andres Freund wrote:
> On 2019-01-29 04:17:33 +0300, Alexander Korotkov wrote:
> > I'm probably not yet understanding all the risks this code have.  So far I see:
> 
> I find these *more* than sufficient to not go to the PG_TRY/CATCH
> approach.
> 
> 
> > 1) One of functions called here performs database modification, while
> > it wasn't suppose to.  So, it becomes not safe to skip subtransaction.
> 
> It's not just data modifications. Even just modifying some memory
> structures that'd normally be invalidated by an xact abort's
> invalidation processing isn't safe.
> 
> 
> > 2) ERRCODE_DATA_EXCEPTION was thrown for unexpected reason.  So, it
> > might appear that ERRCODE_DATA_EXCEPTION is not safe to ignore.
> 
> It'd e.g. not surprise me very much if some OOM would end up translating
> to ERRCODE_DATA_EXCEPTION, because some library function returned an
> error due to ENOMEM.
> 
> 
> > Could you complete this list?
> 
> 3) The expression changed the current expression context, GUCs or any
>    other such global variable. Without a proper subtrans reset this
>    state isn't reverted.
> 4) The function acquires an LWLOCK, buffer reference, anything resowner
>    owned. Skipping subtrans reset, that's not released in that
>    moment. That's going to lead to potential hard deadlocks.
> 99) sigsetjmp is actually pretty expensive.

And even if you, to address Tom's point about plpgsql, had a category
that could only be thrown by core code, and addressed 3) and 4) by
carefully and continuously auditing code, you'd still have the issue of

5) you'd likely leak memory at potentially prodiguous rate...

Greetings,

Andres Freund


Re: jsonpath

From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> And even if you, to address Tom's point about plpgsql, had a category
> that could only be thrown by core code,

Actually, that wasn't quite what my point was there.  Even if the error
that is thrown is perfectly safe and was thrown from a known spot in the
core code, elog.c will call ErrorContextCallback functions on the way out,
*before* longjmp'ing back to your code.  So, if PL foo executes a SQL
command that includes jsonpath operations, it's possible for PL foo's
error context reporting code to break whatever assumptions you thought
were safe to make.

Admittedly, a wise PL designer would refrain from assuming too much about
what they can get away with doing in an error context callback.  But
that very same caution should persuade us not to assume too much about
what actually has happened during error processing.

            regards, tom lane


Re: jsonpath

From
Andres Freund
Date:
Hi,

On 2019-01-29 04:00:19 +0300, Alexander Korotkov wrote:
> +/*
> + * Make datetime type from 'date_txt' which is formated at argument 'fmt'.
> + * Actual datatype (returned in 'typid', 'typmod') is determined by
> + * presence of date/time/zone components in the format string.
> + */
> +Datum
> +to_datetime(text *date_txt, text *fmt, char *tzname, bool strict, Oid *typid,
> +            int32 *typmod, int *tz)

Given other to_<type> functions being SQL callable, I'm not a fan of the
naming here.

> +{
> +    struct pg_tm tm;
> +    fsec_t        fsec;
> +    int            fprec = 0;
> +    int            flags;
> +
> +    do_to_timestamp(date_txt, fmt, strict, &tm, &fsec, &fprec, &flags);
> +
> +    *typmod = fprec ? fprec : -1;    /* fractional part precision */
> +    *tz = 0;
> +
> +    if (flags & DCH_DATED)
> +    {
> +        if (flags & DCH_TIMED)
> +        {
> +            if (flags & DCH_ZONED)
> +            {
> +                TimestampTz result;
> +
> +                if (tm.tm_zone)
> +                    tzname = (char *) tm.tm_zone;
> +
> +                if (tzname)
> +                {
> +                    int            dterr = DecodeTimezone(tzname, tz);
> +
> +                    if (dterr)
> +                        DateTimeParseError(dterr, tzname, "timestamptz");


Do we really need 6/7 indentation levels in new code?




> +jsonpath_scan.c: FLEXFLAGS = -CF -p -p

Why -CF, and why is -p repeated?


> -                JsonEncodeDateTime(buf, val, DATEOID);
> +                JsonEncodeDateTime(buf, val, DATEOID, NULL);

ISTM API changes like this ought to be done in a separate patch, to ease
review.


>  }
>  
> +
>  /*
>   * Compare two jbvString JsonbValue values, a and b.
>   *

Random WS change.


> +/*****************************INPUT/OUTPUT*********************************/

Why do we need this much code to serialize data that we initially got in
serialized manner? Couldn't we just keep the original around?


> +Datum
> +jsonpath_in(PG_FUNCTION_ARGS)
> +{

No binary input support? I'd suggest adding that, but keeping the
representation the same. Otherwise just adds unnecesary complexity for
driver authors wishing to use the binar protocol.



> +/********************Support functions for JsonPath**************************/
> +
> +/*
> + * Support macroses to read stored values
> + */

s/macroses/macros/


> +++ b/src/backend/utils/adt/jsonpath_exec.c

Gotta say, I'm unenthusiastic about yet another execution engine in some
PG subsystem.


> @@ -0,0 +1,2776 @@
> +/*-------------------------------------------------------------------------
> + *
> + * jsonpath_exec.c
> + *     Routines for SQL/JSON path execution.
> + *
> + * Copyright (c) 2019, PostgreSQL Global Development Group
> + *
> + * IDENTIFICATION
> + *    src/backend/utils/adt/jsonpath_exec.c
> + *
> + *-------------------------------------------------------------------------
> + */

Somewhere there needs to be higher level documentation explaining how
this stuff is supposed to work on a code level.

> +
> +/* strict/lax flags is decomposed into four [un]wrap/error flags */
> +#define jspStrictAbsenseOfErrors(cxt)    (!(cxt)->laxMode)
> +#define jspAutoUnwrap(cxt)                ((cxt)->laxMode)
> +#define jspAutoWrap(cxt)                ((cxt)->laxMode)
> +#define jspIgnoreStructuralErrors(cxt)    ((cxt)->ignoreStructuralErrors)
> +#define jspThrowErrors(cxt)                ((cxt)->throwErrors)

What's the point?


> +#define ThrowJsonPathError(code, detail) \
> +    ereport(ERROR, \
> +             (errcode(ERRCODE_ ## code), \
> +              errmsg(ERRMSG_ ## code), \
> +              errdetail detail))
> +
> +#define ThrowJsonPathErrorHint(code, detail, hint) \
> +    ereport(ERROR, \
> +             (errcode(ERRCODE_ ## code), \
> +              errmsg(ERRMSG_ ## code), \
> +              errdetail detail, \
> +              errhint hint))

These seem ill-advised, just making it harder to understand the code,
and grepping for it, without actually meaningfully simplifying anything.


> +/*
> + * Find value of jsonpath variable in a list of passing params
> + */

What is that comment supposed to mean?


> +/*
> + * Get the type name of a SQL/JSON item.
> + */
> +static const char *
> +JsonbTypeName(JsonbValue *jb)
> +{

Wasn't there pretty similar infrastructure elsewhere?



> +/*
> + * Cross-type comparison of two datetime SQL/JSON items.  If items are
> + * uncomparable, 'error' flag is set.
> + */

Sounds like it'd not raise an error, but it does in a bunch of scenarios.


> @@ -206,6 +206,22 @@ Section: Class 22 - Data Exception
>  2200N    E    ERRCODE_INVALID_XML_CONTENT                                    invalid_xml_content
>  2200S    E    ERRCODE_INVALID_XML_COMMENT                                    invalid_xml_comment
>  2200T    E    ERRCODE_INVALID_XML_PROCESSING_INSTRUCTION                     invalid_xml_processing_instruction
> +22030    E    ERRCODE_DUPLICATE_JSON_OBJECT_KEY_VALUE                        duplicate_json_object_key_value
> +22031    E    ERRCODE_INVALID_ARGUMENT_FOR_JSON_DATETIME_FUNCTION
invalid_argument_for_json_datetime_function
> +22032    E    ERRCODE_INVALID_JSON_TEXT                                      invalid_json_text
> +22033    E    ERRCODE_INVALID_JSON_SUBSCRIPT                                 invalid_json_subscript
> +22034    E    ERRCODE_MORE_THAN_ONE_JSON_ITEM                                more_than_one_json_item
> +22035    E    ERRCODE_NO_JSON_ITEM                                           no_json_item
> +22036    E    ERRCODE_NON_NUMERIC_JSON_ITEM                                  non_numeric_json_item
> +22037    E    ERRCODE_NON_UNIQUE_KEYS_IN_JSON_OBJECT                         non_unique_keys_in_json_object
> +22038    E    ERRCODE_SINGLETON_JSON_ITEM_REQUIRED                           singleton_json_item_required
> +22039    E    ERRCODE_JSON_ARRAY_NOT_FOUND                                   json_array_not_found
> +2203A    E    ERRCODE_JSON_MEMBER_NOT_FOUND                                  json_member_not_found
> +2203B    E    ERRCODE_JSON_NUMBER_NOT_FOUND                                  json_number_not_found
> +2203C    E    ERRCODE_JSON_OBJECT_NOT_FOUND                                  object_not_found
> +2203F    E    ERRCODE_JSON_SCALAR_REQUIRED                                   json_scalar_required
> +2203D    E    ERRCODE_TOO_MANY_JSON_ARRAY_ELEMENTS                           too_many_json_array_elements
> +2203E    E    ERRCODE_TOO_MANY_JSON_OBJECT_MEMBERS
>  too_many_json_object_members

Are all of these invented by us?


Greetings,

Andres Freund


Re: jsonpath

From
Alexander Korotkov
Date:
Hi!

Thank you for your review!  Let me answer some points of your review
while others will be answered later.

On Wed, Jan 30, 2019 at 5:28 AM Andres Freund <andres@anarazel.de> wrote:
> On 2019-01-29 04:00:19 +0300, Alexander Korotkov wrote:
> > +/*****************************INPUT/OUTPUT*********************************/
>
> Why do we need this much code to serialize data that we initially got in
> serialized manner? Couldn't we just keep the original around?

As I get, you concern related to fact that we have jsonpath in text
form (serialized) and convert it to the binary form (also serialized).
Yes, we should do so.  Otherwise, we would have to parse jsonpath for
each evaluation.  Basically, for the same reason we have binary
representation for jsonb.

> > +++ b/src/backend/utils/adt/jsonpath_exec.c
>
> Gotta say, I'm unenthusiastic about yet another execution engine in some
> PG subsystem.

Better ideas?  I can imagine we can evade introduction of jsonpath
datatype and turn jsonpath into executor nodes.  But I hardly can
imagine you would be more enthusiastic about that :)

> > @@ -206,6 +206,22 @@ Section: Class 22 - Data Exception
> >  2200N    E    ERRCODE_INVALID_XML_CONTENT                                    invalid_xml_content
> >  2200S    E    ERRCODE_INVALID_XML_COMMENT                                    invalid_xml_comment
> >  2200T    E    ERRCODE_INVALID_XML_PROCESSING_INSTRUCTION                     invalid_xml_processing_instruction
> > +22030    E    ERRCODE_DUPLICATE_JSON_OBJECT_KEY_VALUE                        duplicate_json_object_key_value
> > +22031    E    ERRCODE_INVALID_ARGUMENT_FOR_JSON_DATETIME_FUNCTION
invalid_argument_for_json_datetime_function
> > +22032    E    ERRCODE_INVALID_JSON_TEXT                                      invalid_json_text
> > +22033    E    ERRCODE_INVALID_JSON_SUBSCRIPT                                 invalid_json_subscript
> > +22034    E    ERRCODE_MORE_THAN_ONE_JSON_ITEM                                more_than_one_json_item
> > +22035    E    ERRCODE_NO_JSON_ITEM                                           no_json_item
> > +22036    E    ERRCODE_NON_NUMERIC_JSON_ITEM                                  non_numeric_json_item
> > +22037    E    ERRCODE_NON_UNIQUE_KEYS_IN_JSON_OBJECT                         non_unique_keys_in_json_object
> > +22038    E    ERRCODE_SINGLETON_JSON_ITEM_REQUIRED                           singleton_json_item_required
> > +22039    E    ERRCODE_JSON_ARRAY_NOT_FOUND                                   json_array_not_found
> > +2203A    E    ERRCODE_JSON_MEMBER_NOT_FOUND                                  json_member_not_found
> > +2203B    E    ERRCODE_JSON_NUMBER_NOT_FOUND                                  json_number_not_found
> > +2203C    E    ERRCODE_JSON_OBJECT_NOT_FOUND                                  object_not_found
> > +2203F    E    ERRCODE_JSON_SCALAR_REQUIRED                                   json_scalar_required
> > +2203D    E    ERRCODE_TOO_MANY_JSON_ARRAY_ELEMENTS                           too_many_json_array_elements
> > +2203E    E    ERRCODE_TOO_MANY_JSON_OBJECT_MEMBERS
> >  too_many_json_object_members
>
> Are all of these invented by us?

None of them are invented by us.  All of them are part of SQL 2016 standard.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: jsonpath

From
Andres Freund
Date:
Hi,

On 2019-01-30 07:34:00 +0300, Alexander Korotkov wrote:
> Thank you for your review!  Let me answer some points of your review
> while others will be answered later.
> 
> On Wed, Jan 30, 2019 at 5:28 AM Andres Freund <andres@anarazel.de> wrote:
> > On 2019-01-29 04:00:19 +0300, Alexander Korotkov wrote:
> > > +/*****************************INPUT/OUTPUT*********************************/
> >
> > Why do we need this much code to serialize data that we initially got in
> > serialized manner? Couldn't we just keep the original around?
> 
> As I get, you concern related to fact that we have jsonpath in text
> form (serialized) and convert it to the binary form (also serialized).
> Yes, we should do so.  Otherwise, we would have to parse jsonpath for
> each evaluation.  Basically, for the same reason we have binary
> representation for jsonb.

But why can't we keep the text around, instead of having all of that
code for converting back?


> > > +++ b/src/backend/utils/adt/jsonpath_exec.c
> >
> > Gotta say, I'm unenthusiastic about yet another execution engine in some
> > PG subsystem.
> 
> Better ideas?  I can imagine we can evade introduction of jsonpath
> datatype and turn jsonpath into executor nodes.  But I hardly can
> imagine you would be more enthusiastic about that :)

Not executor nodes, I think it could be possible to put it into the
expression evaluation codepaths, but it's probably too different to fit
well (would get you JIT compilation of the whole thing tho).

> > > @@ -206,6 +206,22 @@ Section: Class 22 - Data Exception
> > >  2200N    E    ERRCODE_INVALID_XML_CONTENT                                    invalid_xml_content
> > >  2200S    E    ERRCODE_INVALID_XML_COMMENT                                    invalid_xml_comment
> > >  2200T    E    ERRCODE_INVALID_XML_PROCESSING_INSTRUCTION                     invalid_xml_processing_instruction
> > > +22030    E    ERRCODE_DUPLICATE_JSON_OBJECT_KEY_VALUE                        duplicate_json_object_key_value
> > > +22031    E    ERRCODE_INVALID_ARGUMENT_FOR_JSON_DATETIME_FUNCTION
invalid_argument_for_json_datetime_function
> > > +22032    E    ERRCODE_INVALID_JSON_TEXT                                      invalid_json_text
> > > +22033    E    ERRCODE_INVALID_JSON_SUBSCRIPT                                 invalid_json_subscript
> > > +22034    E    ERRCODE_MORE_THAN_ONE_JSON_ITEM                                more_than_one_json_item
> > > +22035    E    ERRCODE_NO_JSON_ITEM                                           no_json_item
> > > +22036    E    ERRCODE_NON_NUMERIC_JSON_ITEM                                  non_numeric_json_item
> > > +22037    E    ERRCODE_NON_UNIQUE_KEYS_IN_JSON_OBJECT                         non_unique_keys_in_json_object
> > > +22038    E    ERRCODE_SINGLETON_JSON_ITEM_REQUIRED                           singleton_json_item_required
> > > +22039    E    ERRCODE_JSON_ARRAY_NOT_FOUND                                   json_array_not_found
> > > +2203A    E    ERRCODE_JSON_MEMBER_NOT_FOUND                                  json_member_not_found
> > > +2203B    E    ERRCODE_JSON_NUMBER_NOT_FOUND                                  json_number_not_found
> > > +2203C    E    ERRCODE_JSON_OBJECT_NOT_FOUND                                  object_not_found
> > > +2203F    E    ERRCODE_JSON_SCALAR_REQUIRED                                   json_scalar_required
> > > +2203D    E    ERRCODE_TOO_MANY_JSON_ARRAY_ELEMENTS                           too_many_json_array_elements
> > > +2203E    E    ERRCODE_TOO_MANY_JSON_OBJECT_MEMBERS
> > >  too_many_json_object_members
> >
> > Are all of these invented by us?
> 
> None of them are invented by us.  All of them are part of SQL 2016 standard.

Cool.

Greetings,

Andres Freund


Re: jsonpath

From
Alexander Korotkov
Date:
On Wed, Jan 30, 2019 at 9:36 AM Andres Freund <andres@anarazel.de> wrote:
> On 2019-01-30 07:34:00 +0300, Alexander Korotkov wrote:
> > Thank you for your review!  Let me answer some points of your review
> > while others will be answered later.
> >
> > On Wed, Jan 30, 2019 at 5:28 AM Andres Freund <andres@anarazel.de> wrote:
> > > On 2019-01-29 04:00:19 +0300, Alexander Korotkov wrote:
> > > > +/*****************************INPUT/OUTPUT*********************************/
> > >
> > > Why do we need this much code to serialize data that we initially got in
> > > serialized manner? Couldn't we just keep the original around?
> >
> > As I get, you concern related to fact that we have jsonpath in text
> > form (serialized) and convert it to the binary form (also serialized).
> > Yes, we should do so.  Otherwise, we would have to parse jsonpath for
> > each evaluation.  Basically, for the same reason we have binary
> > representation for jsonb.
>
> But why can't we keep the text around, instead of having all of that
> code for converting back?

Yeah, that's possible.  But now converting back to string is one of ways to test that jsonpath parsing is correct.  If we remove conversion back to string, this possibility would be also removed.  Also, we would loose way to normalize jsonpath, which is probably not that important.  As well it's generally ugly redundancy.  So, I can't say I like idea to save few hundreds lines of codes for this price.

> > > > +++ b/src/backend/utils/adt/jsonpath_exec.c
> > >
> > > Gotta say, I'm unenthusiastic about yet another execution engine in some
> > > PG subsystem.
> >
> > Better ideas?  I can imagine we can evade introduction of jsonpath
> > datatype and turn jsonpath into executor nodes.  But I hardly can
> > imagine you would be more enthusiastic about that :)
>
> Not executor nodes, I think it could be possible to put it into the
> expression evaluation codepaths, but it's probably too different to fit
> well (would get you JIT compilation of the whole thing tho).

Consider given example. We need to check some predicate for each JSON item, where predicate is given by expression and set of items is produced by another expression.  In order to fit this into expression evaluation, we probably need some kind of lamda functions there.  It seems unlikely for me that we want to implement that.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Re: jsonpath

From
Michael Paquier
Date:
On Tue, Jan 29, 2019 at 04:51:46AM +0300, Alexander Korotkov wrote:
> Oh, sorry.  I've missed we have ERRCODE_TO_CATEGORY() here.

Note: patch set moved to next CF, still waiting on author.
--
Michael

Attachment

Re: jsonpath

From
Alexander Korotkov
Date:
Hi!

Attached is revised version of jsonpath.  Assuming that jsonpath have
problem places, I decided to propose partial implementation.
Following functionality was cut from jsonpath:
1) Support of datetime datatypes.  Besides error suppression, this
functionality have troubles with timezones.  According to standard we
should support timezones in jsonpath expressions.  But that would
prevent jsonpath functions from being immutable, that in turn limits
their usage in expression indexes.
2) Suppression of numeric errors.  I will post it as a separate patch.
Pushing this even this partial implementation of jsonpath to PG 12 is
still very useful.  Also it will simplify a lot pushing other parts of
SQL/JSON to future releases.

Besides this changes, there is some refactoring and code beautification.

On Wed, Jan 30, 2019 at 5:28 AM Andres Freund <andres@anarazel.de> wrote:
> On 2019-01-29 04:00:19 +0300, Alexander Korotkov wrote:
> > +/*
> > + * Make datetime type from 'date_txt' which is formated at argument 'fmt'.
> > + * Actual datatype (returned in 'typid', 'typmod') is determined by
> > + * presence of date/time/zone components in the format string.
> > + */
> > +Datum
> > +to_datetime(text *date_txt, text *fmt, char *tzname, bool strict, Oid *typid,
> > +                     int32 *typmod, int *tz)
>
> Given other to_<type> functions being SQL callable, I'm not a fan of the
> naming here.

I've removed that for now.

> > +{
> > +     struct pg_tm tm;
> > +     fsec_t          fsec;
> > +     int                     fprec = 0;
> > +     int                     flags;
> > +
> > +     do_to_timestamp(date_txt, fmt, strict, &tm, &fsec, &fprec, &flags);
> > +
> > +     *typmod = fprec ? fprec : -1;   /* fractional part precision */
> > +     *tz = 0;
> > +
> > +     if (flags & DCH_DATED)
> > +     {
> > +             if (flags & DCH_TIMED)
> > +             {
> > +                     if (flags & DCH_ZONED)
> > +                     {
> > +                             TimestampTz result;
> > +
> > +                             if (tm.tm_zone)
> > +                                     tzname = (char *) tm.tm_zone;
> > +
> > +                             if (tzname)
> > +                             {
> > +                                     int                     dterr = DecodeTimezone(tzname, tz);
> > +
> > +                                     if (dterr)
> > +                                             DateTimeParseError(dterr, tzname, "timestamptz");
>
>
> Do we really need 6/7 indentation levels in new code?

Also, removed that.

> > +jsonpath_scan.c: FLEXFLAGS = -CF -p -p
>
> Why -CF, and why is -p repeated?

BTW, for our SQL grammar we have

> scan.c: FLEXFLAGS = -CF -p -p

Is it kind of default?

> > -                             JsonEncodeDateTime(buf, val, DATEOID);
> > +                             JsonEncodeDateTime(buf, val, DATEOID, NULL);
>
> ISTM API changes like this ought to be done in a separate patch, to ease
> review.

Also, removed that for now.

> >  }
> >
> > +
> >  /*
> >   * Compare two jbvString JsonbValue values, a and b.
> >   *
>
> Random WS change.

Reverted

> No binary input support? I'd suggest adding that, but keeping the
> representation the same. Otherwise just adds unnecesary complexity for
> driver authors wishing to use the binar protocol.

Binary support is added.  I decided to make it in jsonb manner.
Format version 1 is text format, but it's possible we would have other
versions in future.

> > +/********************Support functions for JsonPath**************************/
> > +
> > +/*
> > + * Support macroses to read stored values
> > + */
>
> s/macroses/macros/

Fixed.

> > @@ -0,0 +1,2776 @@
> > +/*-------------------------------------------------------------------------
> > + *
> > + * jsonpath_exec.c
> > + *    Routines for SQL/JSON path execution.
> > + *
> > + * Copyright (c) 2019, PostgreSQL Global Development Group
> > + *
> > + * IDENTIFICATION
> > + *   src/backend/utils/adt/jsonpath_exec.c
> > + *
> > + *-------------------------------------------------------------------------
> > + */
>
> Somewhere there needs to be higher level documentation explaining how
> this stuff is supposed to work on a code level.

High level comments are added to jsonpath.c (about jsonpath binary
representation) jsonpath_exec.c (about jsonpath execution).

> > +
> > +/* strict/lax flags is decomposed into four [un]wrap/error flags */
> > +#define jspStrictAbsenseOfErrors(cxt)        (!(cxt)->laxMode)
> > +#define jspAutoUnwrap(cxt)                           ((cxt)->laxMode)
> > +#define jspAutoWrap(cxt)                             ((cxt)->laxMode)
> > +#define jspIgnoreStructuralErrors(cxt)       ((cxt)->ignoreStructuralErrors)
> > +#define jspThrowErrors(cxt)                          ((cxt)->throwErrors)
>
> What's the point?

These are convenience macros, which improves code readability.  For
instance, instead of direct checking for laxMode, we check for
particular aspects of its behavior.  For code reader, it becomes
easier to understand why do we behave one way or another.

> > +#define ThrowJsonPathError(code, detail) \
> > +     ereport(ERROR, \
> > +                      (errcode(ERRCODE_ ## code), \
> > +                       errmsg(ERRMSG_ ## code), \
> > +                       errdetail detail))
> > +
> > +#define ThrowJsonPathErrorHint(code, detail, hint) \
> > +     ereport(ERROR, \
> > +                      (errcode(ERRCODE_ ## code), \
> > +                       errmsg(ERRMSG_ ## code), \
> > +                       errdetail detail, \
> > +                       errhint hint))
>
> These seem ill-advised, just making it harder to understand the code,
> and grepping for it, without actually meaningfully simplifying anything.

Removed.  Instead, I've introduced RETURN_ERROR() macro, which either
returns jperError or issues ereport given in the argument.

> > +/*
> > + * Find value of jsonpath variable in a list of passing params
> > + */
>
> What is that comment supposed to mean?

Comment is rephrased.

> > +/*
> > + * Get the type name of a SQL/JSON item.
> > + */
> > +static const char *
> > +JsonbTypeName(JsonbValue *jb)
> > +{
>
> Wasn't there pretty similar infrastructure elsewhere?

Yes, but it wasn't exposed.  Moved to jsonb.c

 > > +/*
> > + * Cross-type comparison of two datetime SQL/JSON items.  If items are
> > + * uncomparable, 'error' flag is set.
> > + */
>
> Sounds like it'd not raise an error, but it does in a bunch of scenarios.

Removed as well.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment

Re: jsonpath

From
Tomas Vondra
Date:
Hi,

On 2/24/19 10:03 AM, Alexander Korotkov wrote:
> Hi!
> 
> Attached is revised version of jsonpath.  Assuming that jsonpath have
> problem places, I decided to propose partial implementation.
> Following functionality was cut from jsonpath:
> 1) Support of datetime datatypes.  Besides error suppression, this
> functionality have troubles with timezones.  According to standard we
> should support timezones in jsonpath expressions.  But that would
> prevent jsonpath functions from being immutable, that in turn limits
> their usage in expression indexes.

Assuming we get the patch committed without the datetime stuff now, what
does that mean for the future? Does that mean we'll be unable to extend
it to support datetime in the future, or what? If we define the jsonpath
functions as immutable now, people may create indexes - which means we
won't be able to downgrade it to stable later.

So, what's the plan here? The only thing I can think of is having two
sets of functions - an immutable one, prohibiting datetime expressions,
and stable that can't be used for indexes etc.

> 2) Suppression of numeric errors.  I will post it as a separate patch.
> Pushing this even this partial implementation of jsonpath to PG 12 is
> still very useful.  Also it will simplify a lot pushing other parts of
> SQL/JSON to future releases.
> 

+1 to push at least partial (but still useful) subset, instead of just
bumping the patch to PG13


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: jsonpath

From
Alexander Korotkov
Date:
On Sun, Feb 24, 2019 at 2:44 PM Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
> On 2/24/19 10:03 AM, Alexander Korotkov wrote:
> > Attached is revised version of jsonpath.  Assuming that jsonpath have
> > problem places, I decided to propose partial implementation.
> > Following functionality was cut from jsonpath:
> > 1) Support of datetime datatypes.  Besides error suppression, this
> > functionality have troubles with timezones.  According to standard we
> > should support timezones in jsonpath expressions.  But that would
> > prevent jsonpath functions from being immutable, that in turn limits
> > their usage in expression indexes.
>
> Assuming we get the patch committed without the datetime stuff now, what
> does that mean for the future? Does that mean we'll be unable to extend
> it to support datetime in the future, or what? If we define the jsonpath
> functions as immutable now, people may create indexes - which means we
> won't be able to downgrade it to stable later.
>
> So, what's the plan here? The only thing I can think of is having two
> sets of functions - an immutable one, prohibiting datetime expressions,
> and stable that can't be used for indexes etc.

Reasonable question.  As I understand, not datetime support itself
making jsonpath functions not immutable, but implicit cast happening
during comparison timestamp vs. timestamptz (and time vs. timetz).
So, in future immutable functions can have limited support of
datetime, where comparison of non-tz vs. tz types is restricted.  And
stable versions of functions (for instance, with '_tz' prefix) with
full datetime support.

> > 2) Suppression of numeric errors.  I will post it as a separate patch.
> > Pushing this even this partial implementation of jsonpath to PG 12 is
> > still very useful.  Also it will simplify a lot pushing other parts of
> > SQL/JSON to future releases.
> >
>
> +1 to push at least partial (but still useful) subset, instead of just
> bumping the patch to PG13

Thank you for support!

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: jsonpath

From
Nikita Glukhov
Date:
Attached 34th version of the patches.

1. Partial jsonpath support:  - Fixed copying of jsonb with vars jsonb_path_query() into SRF context  - Fixed error message for jsonpath vars  - Fixed file-level comment in jsonpath.c

2. Suppression of numeric errors:  Now error handling is done without PG_TRY/PG_CATCH using a bunch of internal  numeric functions with 'bool *error' flag.

3. Datetime support:  Problems with timzeones still exist.  Comparison of tz types with  non-tz types is simply disallowed.  Default integer timezone offset (not  abbreviation) can be specified with the second .datetime() argument.  Error handling also is done using internal functions.

4. Json type support:  Json support was completely refactored since v23: double compilation with  function redefinitions was replaced with passing 'isJsonb' flag to low-level  json/jsonb access functions.  Also major refactoring with introduction of struct JsonItem was made.  JsonItem is used in executor instead of raw JsonbValue.  This helps to avoid  extending of JsonbValue for datetime types and also other numeric types   (integers and floats) required by standard.

5. GIN support:  Nothing was changed since v23.


Patch 1 is what we are going to commit in PG12.
Patches 2 and 3 add code that was removed in the previous v33 version.


On 24.02.2019 15:34, Alexander Korotkov wrote:

On Sun, Feb 24, 2019 at 2:44 PM Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
On 2/24/19 10:03 AM, Alexander Korotkov wrote:
Attached is revised version of jsonpath.  Assuming that jsonpath have
problem places, I decided to propose partial implementation.
Following functionality was cut from jsonpath:
1) Support of datetime datatypes.  Besides error suppression, this
functionality have troubles with timezones.  According to standard we
should support timezones in jsonpath expressions.  But that would
prevent jsonpath functions from being immutable, that in turn limits
their usage in expression indexes.
Assuming we get the patch committed without the datetime stuff now, what
does that mean for the future? Does that mean we'll be unable to extend
it to support datetime in the future, or what? If we define the jsonpath
functions as immutable now, people may create indexes - which means we
won't be able to downgrade it to stable later.

So, what's the plan here? The only thing I can think of is having two
sets of functions - an immutable one, prohibiting datetime expressions,
and stable that can't be used for indexes etc.
Reasonable question.  As I understand, not datetime support itself
making jsonpath functions not immutable, but implicit cast happening
during comparison timestamp vs. timestamptz (and time vs. timetz).
So, in future immutable functions can have limited support of
datetime, where comparison of non-tz vs. tz types is restricted.  And
stable versions of functions (for instance, with '_tz' prefix) with
full datetime support.
I can also offset to explicitly pass timezone info into jsonpath function using
the special user dataype encapsulating struct pg_tz.

But simple integer timezone offset can be passed now using jsonpath variables
(standard says only about integer timezone offsets; also it requires presence
of timezone offset it in the input string if the format string contain timezone 
components):

=# SELECT jsonb_path_query(    '"28-02-2019 12:34"',    '$.datetime("DD-MM-YYYY HH24:MI TZH", $tz)',    jsonb_build_object('tz', EXTRACT(TIMEZONE FROM now()))  );     jsonb_path_query       
-----------------------------"2019-02-28T12:34:00+03:00"
(1 row)

2) Suppression of numeric errors.  I will post it as a separate patch.
Pushing this even this partial implementation of jsonpath to PG 12 is
still very useful.  Also it will simplify a lot pushing other parts of
SQL/JSON to future releases.
+1 to push at least partial (but still useful) subset, instead of just
bumping the patch to PG13
See patch #2.
Thank you for support!
--
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Attachment

Re: jsonpath

From
Andres Freund
Date:
On 2019-03-01 03:36:49 +0300, Nikita Glukhov wrote:
> Patch 1 is what we are going to commit in PG12.

I think it's too early to make that determination. I think there's a
good chance, but that this needs more independent review.


Re: jsonpath

From
Alexander Korotkov
Date:
On Fri, Mar 1, 2019 at 3:36 AM Nikita Glukhov <n.gluhov@postgrespro.ru> wrote:
> I can also offset to explicitly pass timezone info into jsonpath function using
> the special user dataype encapsulating struct pg_tz.

More interesting question is what would be the source of timezone.  If
even you encapsulate timezone in a separate datatype, the expression
will be still just stable assuming timezone is generated by stable
subexpression.  What we actually need is immutable timezone.  Day once
timezone is updated, you create new timezone version, while old
version is immutable.  Then if jsonpath has given particular *timezone
version*, it might remain immutable.  But that requires significant
rework of our timezone infrastructure.

> But simple integer timezone offset can be passed now using jsonpath variables
> (standard says only about integer timezone offsets; also it requires presence
> of timezone offset it in the input string if the format string contain timezone
> components):
>
> =# SELECT jsonb_path_query(
>      '"28-02-2019 12:34"',
>      '$.datetime("DD-MM-YYYY HH24:MI TZH", $tz)',
>      jsonb_build_object('tz', EXTRACT(TIMEZONE FROM now()))
>    );
>
>       jsonb_path_query
> -----------------------------
>  "2019-02-28T12:34:00+03:00"
> (1 row)

Standard specifies fixed offset to be given for *particular datetime*.
For instance, if json contains offset in separate attribute or
whatever, then it's OK to use such two-arguments .datetime() method.
But that seems quite narrow use case.

Standard doesn't mean you get fixed offset extracted from "now()" and
apply it to random datetimes in your json collection.  That would work
correctly for real timezones only when they are fixed offsets, but
there are almost none of them!  So, that's just plain wrong, we never
should encourage users to do something like this.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: jsonpath

From
Alexander Korotkov
Date:
Hi!

On Fri, Mar 1, 2019 at 3:36 AM Nikita Glukhov <n.gluhov@postgrespro.ru> wrote:
>
> Attached 34th version of the patches.
>
> 1. Partial jsonpath support:
>    - Fixed copying of jsonb with vars jsonb_path_query() into SRF context
>    - Fixed error message for jsonpath vars
>    - Fixed file-level comment in jsonpath.c
>
> 2. Suppression of numeric errors:
>    Now error handling is done without PG_TRY/PG_CATCH using a bunch of internal
>    numeric functions with 'bool *error' flag.

Revised patches 1 and 2 are attached.  Changes are following

 * Small refactoring, comments adjustment and function renaming.  In
particular, I've removed "recursive" prefix from function names,
because it actually not that informative assuming header comment
explains that the whole jsonpath executor is recursive.  Also, I made
"Unwrap" suffix more clear.  Not it's distinguished what is unwrapped
target (UnwrapTarget) or result (UnwrapResult).  Also, now it's clear
that function doesn't always unwraps but has an option to do so
(OptUnwrap).
 * Some more cases are covered by regression tests.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment

Re: jsonpath

From
Pavel Stehule
Date:


so 2. 3. 2019 v 6:15 odesílatel Alexander Korotkov <a.korotkov@postgrespro.ru> napsal:
Hi!

On Fri, Mar 1, 2019 at 3:36 AM Nikita Glukhov <n.gluhov@postgrespro.ru> wrote:
>
> Attached 34th version of the patches.
>
> 1. Partial jsonpath support:
>    - Fixed copying of jsonb with vars jsonb_path_query() into SRF context
>    - Fixed error message for jsonpath vars
>    - Fixed file-level comment in jsonpath.c
>
> 2. Suppression of numeric errors:
>    Now error handling is done without PG_TRY/PG_CATCH using a bunch of internal
>    numeric functions with 'bool *error' flag.

Revised patches 1 and 2 are attached.  Changes are following

 * Small refactoring, comments adjustment and function renaming.  In
particular, I've removed "recursive" prefix from function names,
because it actually not that informative assuming header comment
explains that the whole jsonpath executor is recursive.  Also, I made
"Unwrap" suffix more clear.  Not it's distinguished what is unwrapped
target (UnwrapTarget) or result (UnwrapResult).  Also, now it's clear
that function doesn't always unwraps but has an option to do so
(OptUnwrap).
 * Some more cases are covered by regression tests.

These patches are large, but I think so the granularity and modularity of these patches are correct.

Now, it looks very well.

Pavel


------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Re: jsonpath

From
Tomas Vondra
Date:
Hi,

Here are some initial comments from a review of the 0001 part. I plan to
do more testing on a large data set and additional round of review over
the next week. FWIW I've passed this through valgrind and the usual
battery of regression tests, and there were no issues.

I haven't looked at 0002 yet, but it seems fairly small (especially
compared to 0001).

func.sgml
=========

1) I see the changes removed <indexterm zone="functions-json"> for some
reason. Is that intentional?

2) <command>WHERE</command>

    We generally tag <literal>WHERE</literal>, not <command>.

3) Filter expressions are applied from left to right

    Perhaps s/applied/evaluated/ in reference to expressions?

4) The result of the filter expression may be true, false, or unknown.

    It's not entirely clear to me what "unknown" means here. NULL, or
    something else? There's a section in the SQL/JSON standard
    explaining this (page 83), but perhaps we should explain it a bit
    here too?

    The standard says "In the SQL/JSON data model, there are no SQL
    nulls, so Unknown is not part of the SQL/JSON data model." so I'm a
    bit confused what "unknown" references to. Maybe some example?

    Also, what happens when the result is unknown?


5) There's an example showing how to apply filter at a certain level,
using the @ variable, but what if we want to apply multiple filters at
different levels? Would it make sense to add such example?

6) ... extensions of the SQL/JSON standard

    I'm not sure what "extension" is here. Is that an extension defined
    in the SQL standard, or an additional PostgreSQL functionality not
    described in the standard? (I assume the latter, just checking.)


7) There are references to "SQL/JSON sequences" without any explanation
what it means. Maybe I'm missing something obvious, though.

8) Implicit unwrapping in the lax mode is not performed in the following
cases:

    I suggest to reword it like this:

In the lax mode, implicit unwrapping is not performed when:

9) We're not adding the datetime() method for now, due to the issues
with timezones. I wonder if we should add a note why it's missing to the
docs ...

10) I'm a bit puzzled though, because the standard says this in the
description of type() function on page 77

    If I is a datetime, then “date”, “time without time zone”, “time
    with time zone”, “timestamp without time zone”, or “timestamp with
    time zone”, as appropriate.

But I see our type() function does not return anything like that (which
I presume is independent of timezone stuff). But I see jsonb.h has no
concept of datetime values, and the standard actually says this in the
datetime() section

    JSON has no datetime types. Datetime values are most likely stored
    in character strings.

Considering this, the type() section makes little sense, no?


jsonb_util.c
============

I see we're now handling NaN values in convertJsonbScalar(). Isn't it
actually a bug that we don't do this already? Or is it not needed for
some reason?

jsonpath.c
==========

I suppose this should say "jsonpath version number" instead?

    elog(ERROR, "unsupported jsonb version number %d", version);


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: jsonpath

From
Andrew Dunstan
Date:
On 3/3/19 1:08 PM, Tomas Vondra wrote:
>
>
> jsonb_util.c
> ============
>
> I see we're now handling NaN values in convertJsonbScalar(). Isn't it
> actually a bug that we don't do this already? Or is it not needed for
> some reason?
>

JSON standard numerics don't support NaN, Infinity etc., so I assume
this can only happen in a jsonpath expression being converted to a jsonb
value. If so, the new section  should contain a comment to that effect,
otherwise it will be quite confusing.


cheers


andrew


-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: jsonpath

From
Tomas Vondra
Date:
A bunch of additional comments, after looking at the patch a bit today.
All are mostly minor, and sometime perhaps a matter of preference.


1) There's a mismatch between the comment and actual function name for
jsonb_path_match_opr and jsonb_path_exists_opr(). The comments say
"_novars" instead.

2) In a couple of switches the "default" case does a return with a
value, following elog(ERROR). So it's practically unreachable, AFAICS
it's fine without it, and we don't do this elsewhere. And I don't get
any compiler warnings if I remove it either.

Examples:

JsonbTypeName

    default:
        elog(ERROR, "unrecognized jsonb value type: %d", jbv->type);
        return "unknown";

jspOperationName

    default:
        elog(ERROR, "unrecognized jsonpath item type: %d", type);
        return NULL;

compareItems

    default:
        elog(ERROR, "unrecognized jsonpath operation: %d", op);
        return jpbUnknown;

3) jsonpath_send is using makeStringInfo() for a value that is not
returned - IMHO it should use regular stack-allocated variable and use
initStringInfo() instead

4) the version number should be defined/used as a constant, not as a
magic constant somewhere in the code

5) Why does jsonPathToCstring do this?

    appendBinaryStringInfo(out, "strict ", 7);

Why not to use regular appendStringInfoString()? What am I missing?

6) comment typo: "Aling StringInfo"

7) alignStringInfoInt() should explain why we need this and why INTALIGN
is the right alignment.

8) I'm a bit puzzled by what flattenJsonPathParseItem does with 'next'

    I don't quite understand what it's doing with 'next' value?

    /*
     * Actual value will be recorded later, after next and children
     * processing.
     */
    appendBinaryStringInfo(buf,
                           (char *) &next, /* fake value */
                           sizeof(next));

Perhaps a comment explaining it (why we need a fake value at all?) would
be a good idea here.

9) I see printJsonPathItem is only calling check_stack_depth while
flattenJsonPathParseItem also calls CHECK_INTERRUPTS. Why the
difference, considering they seem about equally expensive?

10) executeNumericItemMethod is missing a comment (unlike the other
executeXXX functions)

11) Wording of some of the error messages in the execute methods seems a
bit odd. For example executeNumericItemMethod may complain that it

    ... is applied to not a numeric value

but perhaps a more natural wording would be

    ... is applied to a non-numeric value

And similarly for the other execute methods. But I'm not a native
speaker, so perhaps the original wording is just fine.


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: jsonpath

From
Robert Haas
Date:
On Mon, Mar 4, 2019 at 6:27 PM Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
> 11) Wording of some of the error messages in the execute methods seems a
> bit odd. For example executeNumericItemMethod may complain that it
>
>         ... is applied to not a numeric value
>
> but perhaps a more natural wording would be
>
>         ... is applied to a non-numeric value
>
> And similarly for the other execute methods. But I'm not a native
> speaker, so perhaps the original wording is just fine.

As a native speaker I can confirm that the first wording is definitely
not OK.  The second one is tolerable, but I wonder if there is
something better, like "can only be applied to a numeric value" or
maybe there's a way to rephrase it so that we complain about the
non-numeric value itself rather than the application, e.g. ERROR:
json_frobnitz can only frob numeric values or ERROR: argument to
json_frobnitz must be numeric.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: jsonpath

From
Nikita Glukhov
Date:

Attached 36th version of the patches.


On 03.03.2019 21:08, Tomas Vondra wrote:

Hi,

Here are some initial comments from a review of the 0001 part. I plan to
do more testing on a large data set and additional round of review over
the next week. FWIW I've passed this through valgrind and the usual
battery of regression tests, and there were no issues.
Thanks again for your review.
I haven't looked at 0002 yet, but it seems fairly small (especially
compared to 0001).

func.sgml
=========

1) I see the changes removed <indexterm zone="functions-json"> for some
reason. Is that intentional?
Fixed.
2) <command>WHERE</command>
   We generally tag <literal>WHERE</literal>, not <command>.
Fixed.
3) Filter expressions are applied from left to right
   Perhaps s/applied/evaluated/ in reference to expressions?
Fixed.
4) The result of the filter expression may be true, false, or unknown.
   It's not entirely clear to me what "unknown" means here. NULL, or   something else? There's a section in the SQL/JSON standard   explaining this (page 83), but perhaps we should explain it a bit   here too?
   The standard says "In the SQL/JSON data model, there are no SQL   nulls, so Unknown is not part of the SQL/JSON data model." so I'm a   bit confused what "unknown" references to. Maybe some example?
   Also, what happens when the result is unknown?
"unknown" refers here to ordinary three-valued logical Unknown, which is
represented in SQL by NULL.

JSON path expressions return sequences of SQL/JSON items, which are defined by
SQL/JSON data model.  But JSON path predicates (logical expressions), which are
used in filters, return three-valued logical values: False, True, or Unknown.

Filters accept only items for which the filter predicate returned True. False
and Unknown results are skipped.

Unknown can be checked with IS UNKNOWN predicate:

SELECT jsonb_path_query_array('[1, "1", true, null]',                             '$[*] ? ((@ < 2) is unknown)');jsonb_path_query_array 
------------------------["1", true]
(1 row)

Comparison of JSON nulls to non-nulls returns always False, not Unknown (see
SQL/JSON data model). Comparison of non-comparable items return Unknown.

5) There's an example showing how to apply filter at a certain level,
using the @ variable, but what if we want to apply multiple filters at
different levels? Would it make sense to add such example?
Examples were added.

Filters can be nested, but by the standard it is impossible to reference item
of the outer filter, because @ always references current item of the innermost
filter.  We have additional patch with a simple jsonpath syntax extension for 
this case -- @N:@0, like @, references innermost item@1 references item one level upper@2 references item two levels upper

For example, selecting all objects from array
'[{"vals": [1,2,3], "val": 2}, {"vals": [4,5], "val": 6}]'

having in their .vals[] element greater than their .val field:
'$[*] ? (@.vals[*] ? (@ > @1.val))'

It is impossible to do this by the standard in the single jsonpath expression.


Also there is idea to use lambda expressions with ECMAScript 6 syntax in
filters and user method (see below):

'$[*] ? (obj => obj.vals[*] ? (val => val > obj.val))'

I already have a patch implementing lambda expressions, which were necessary
for implementaion of built-in/user-written functions and methods like
.map(), reduce(), max().  I can post it again if it is interesting.

6) ... extensions of the SQL/JSON standard
   I'm not sure what "extension" is here. Is that an extension defined   in the SQL standard, or an additional PostgreSQL functionality not   described in the standard? (I assume the latter, just checking.)
Yes, this is additional functionality.

 "Writing the path as an expression is also valid"   I have moved this into SQL/JSON patch, because it is related only path   specification in SQL/JSON functions.
7) There are references to "SQL/JSON sequences" without any explanation
what it means. Maybe I'm missing something obvious, though.
SQL/JSON sequence is a sequence of SQL/JSON items.
The corresponding definition was added.
8) Implicit unwrapping in the lax mode is not performed in the following
cases:
   I suggest to reword it like this:

In the lax mode, implicit unwrapping is not performed when:
Fixed. I also decided to merge this paragraph with paragraph describing 
auto-unwrapping.
9) We're not adding the datetime() method for now, due to the issues
with timezones. I wonder if we should add a note why it's missing to the
docs ...
The corresponding note was added.
10) I'm a bit puzzled though, because the standard says this in the
description of type() function on page 77
   If I is a datetime, then “date”, “time without time zone”, “time   with time zone”, “timestamp without time zone”, or “timestamp with   time zone”, as appropriate.

But I see our type() function does not return anything like that (which
I presume is independent of timezone stuff). But I see jsonb.h has no
concept of datetime values, and the standard actually says this in the
datetime() section
   JSON has no datetime types. Datetime values are most likely stored   in character strings.

Considering this, the type() section makes little sense, no?
According to the SQL/JSON data model, SQL/JSON items can be null, string, 
boolean, numeric, and datetime.

Datetime items exists only at execution time, they are serialized into JSON
strings when the resulting SQL/JSON item is converted into jsonb. After removal
of .datetime() method, support of datetime SQL/JSON items was removed from
jsonpath executor too.

Numeric items can be of any numeric type.  In PostgreSQL we have the following
numeric datatypes: integers, floats and numeric.  But our jsonpath executor
supports now only numeric-typed items, because this is only type we can get
directly from jsonb.  Support for other numeric datatypes (float8 is most
necessary, because it is produced by .double() item method) can be added by
extending JsonbValue or by introducing struct JsonItem in executor, having
JsonbValue as a part (see patch #4 in v34).

jsonb_util.c
============

I see we're now handling NaN values in convertJsonbScalar(). Isn't it
actually a bug that we don't do this already? Or is it not needed for
some reason?
Numeric JsonbValues created outside of jsonpath executor cannot be NaN, because
this case in explicitly handled in JsonbValue-producing functions.  For example,
datum_to_jsonb() which is used in to_jsonb(), jsonb_build_array() and others
converts Inf and NaN into JSON strings.  But jsonb_plperl and jsonb_plpython
transforms do not allow NaNs (I think it is needed only for consistency of
"PL => jsonb => PL" roundtrip).

In our jsonpath executor we can produce NaNs and we should not touch them before
conversion to the resulting jsonb.  Moreover, in SQL/JSON functions numeric
SQL/JSON item can be directly converted into numeric RETURNING type.  So, I
decided to add a check for NaN to the low-level convertJsonbScalar() instead of
checking items before every call to JsonbValueToJsonb() or pushJsonbValue().
But after introduction of struct JsonItem (see patch #4) introduction there will
appropriate place for this check -- JsonItemToJsonbValue().

==========

I suppose this should say "jsonpath version number" instead?
   elog(ERROR, "unsupported jsonb version number %d", version);
Fixed.


On 05.03.2019 2:27, Tomas Vondra wrote:
A bunch of additional comments, after looking at the patch a bit today.
All are mostly minor, and sometime perhaps a matter of preference.


1) There's a mismatch between the comment and actual function name for
jsonb_path_match_opr and jsonb_path_exists_opr(). The comments say
"_novars" instead.
Fixed.
2) In a couple of switches the "default" case does a return with a
value, following elog(ERROR). So it's practically unreachable, AFAICS
it's fine without it, and we don't do this elsewhere. And I don't get
any compiler warnings if I remove it either.

Examples:

JsonbTypeName
   default:       elog(ERROR, "unrecognized jsonb value type: %d", jbv->type);       return "unknown";

jspOperationName
   default:       elog(ERROR, "unrecognized jsonpath item type: %d", type);       return NULL;

compareItems
   default:       elog(ERROR, "unrecognized jsonpath operation: %d", op);       return jpbUnknown;
It seems to be a standard practice in jsonb code, so we followed it.
3) jsonpath_send is using makeStringInfo() for a value that is not
returned - IMHO it should use regular stack-allocated variable and use
initStringInfo() instead
Fixed.
4) the version number should be defined/used as a constant, not as a
magic constant somewhere in the code
Fixed.
5) Why does jsonPathToCstring do this?
   appendBinaryStringInfo(out, "strict ", 7);

Why not to use regular appendStringInfoString()? What am I missing?
appendStringInfoString() is a bit slower than appendBinaryStringInfo() 
because to strlen() call inside it.
6) comment typo: "Aling StringInfo"
Fixed.
7) alignStringInfoInt() should explain why we need this and why INTALIGN
is the right alignment.
Comment was added.
8) I'm a bit puzzled by what flattenJsonPathParseItem does with 'next'
   I don't quite understand what it's doing with 'next' value?
   /*    * Actual value will be recorded later, after next and children    * processing.    */   appendBinaryStringInfo(buf,                          (char *) &next, /* fake value */                          sizeof(next));

Perhaps a comment explaining it (why we need a fake value at all?) would
be a good idea here.
No fake value is needed here, zero placeholder is enough. I have refactored
this using new function reserveSpaceForItemPointer().
9) I see printJsonPathItem is only calling check_stack_depth while
flattenJsonPathParseItem also calls CHECK_INTERRUPTS. Why the
difference, considering they seem about equally expensive?
CHECK_INTERRUPT() was added to printJsonPathItem() too.
10) executeNumericItemMethod is missing a comment (unlike the other
executeXXX functions)
Comment was added.
11) Wording of some of the error messages in the execute methods seems a
bit odd. For example executeNumericItemMethod may complain that it
... is applied to not a numeric value

but perhaps a more natural wording would be
... is applied to a non-numeric value

And similarly for the other execute methods. But I'm not a native
speaker, so perhaps the original wording is just fine.
Error messages were changed on the advice of Robert Haas to:
"... can only be applied to a numeric value"
-- 
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Attachment

Re: jsonpath

From
Alexander Korotkov
Date:
Hi!

On Wed, Mar 6, 2019 at 12:40 AM Nikita Glukhov <n.gluhov@postgrespro.ru> wrote:
> Attached 36th version of the patches.

Thank yo for the revision!

In the attached revision following changes are made:

> "unknown" refers here to ordinary three-valued logical Unknown, which is
> represented in SQL by NULL.
>
> JSON path expressions return sequences of SQL/JSON items, which are defined by
> SQL/JSON data model.  But JSON path predicates (logical expressions), which are
> used in filters, return three-valued logical values: False, True, or Unknown.

 * I've added short explanation of this to the documentation.
 * Removed no longer present data structures from typedefs.list of the
first patch.
 * Moved GIN support patch to number 3.  Seems to be well-isolated and
not very complex patch.  I propose to consider this to 12 too.  I
added high-level comment there, commit message and made some code
beautification.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment

Re: jsonpath

From
Alexander Korotkov
Date:
On Sun, Mar 10, 2019 at 1:51 PM Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:
> On Wed, Mar 6, 2019 at 12:40 AM Nikita Glukhov <n.gluhov@postgrespro.ru> wrote:
> > Attached 36th version of the patches.
>
> Thank yo for the revision!
>
> In the attached revision following changes are made:
>
> > "unknown" refers here to ordinary three-valued logical Unknown, which is
> > represented in SQL by NULL.
> >
> > JSON path expressions return sequences of SQL/JSON items, which are defined by
> > SQL/JSON data model.  But JSON path predicates (logical expressions), which are
> > used in filters, return three-valued logical values: False, True, or Unknown.
>
>  * I've added short explanation of this to the documentation.
>  * Removed no longer present data structures from typedefs.list of the
> first patch.
>  * Moved GIN support patch to number 3.  Seems to be well-isolated and
> not very complex patch.  I propose to consider this to 12 too.  I
> added high-level comment there, commit message and made some code
> beautification.

I think patches 1 and 2 are in committable shape (I reached Tomas
off-list, he doesn't have more notes regarding them).  While patch 3
requires more review.

I'm going to push 1 and 2 if no objections.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: jsonpath

From
Alexander Korotkov
Date:
On Thu, Mar 14, 2019 at 12:07 PM Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:
> On Sun, Mar 10, 2019 at 1:51 PM Alexander Korotkov
> <a.korotkov@postgrespro.ru> wrote:
> > On Wed, Mar 6, 2019 at 12:40 AM Nikita Glukhov <n.gluhov@postgrespro.ru> wrote:
> > > Attached 36th version of the patches.
> >
> > Thank yo for the revision!
> >
> > In the attached revision following changes are made:
> >
> > > "unknown" refers here to ordinary three-valued logical Unknown, which is
> > > represented in SQL by NULL.
> > >
> > > JSON path expressions return sequences of SQL/JSON items, which are defined by
> > > SQL/JSON data model.  But JSON path predicates (logical expressions), which are
> > > used in filters, return three-valued logical values: False, True, or Unknown.
> >
> >  * I've added short explanation of this to the documentation.
> >  * Removed no longer present data structures from typedefs.list of the
> > first patch.
> >  * Moved GIN support patch to number 3.  Seems to be well-isolated and
> > not very complex patch.  I propose to consider this to 12 too.  I
> > added high-level comment there, commit message and made some code
> > beautification.
>
> I think patches 1 and 2 are in committable shape (I reached Tomas
> off-list, he doesn't have more notes regarding them).  While patch 3
> requires more review.
>
> I'm going to push 1 and 2 if no objections.

So, pushed.  Many thanks to reviewers and authors!

Remaining part I'm proposing for 12 is attached.  I appreciate review of it.


------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment

Re: jsonpath

From
Jeff Janes
Date:
On Sat, Mar 16, 2019 at 5:36 AM Alexander Korotkov <a.korotkov@postgrespro.ru> wrote:

So, pushed.  Many thanks to reviewers and authors!

I think these files have to be cleaned up by "make maintainer-clean"

./src/backend/utils/adt/jsonpath_gram.c
./src/backend/utils/adt/jsonpath_scan.c

Cheers,

Jeff

Re: jsonpath

From
Alexander Korotkov
Date:
сб, 16 мар. 2019 г., 20:52 Jeff Janes <jeff.janes@gmail.com>:
On Sat, Mar 16, 2019 at 5:36 AM Alexander Korotkov <a.korotkov@postgrespro.ru> wrote:

So, pushed.  Many thanks to reviewers and authors!

I think these files have to be cleaned up by "make maintainer-clean"

./src/backend/utils/adt/jsonpath_gram.c
./src/backend/utils/adt/jsonpath_scan.c

Good catch, thanks!  Will fix.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Re: jsonpath

From
Tom Lane
Date:
Jeff Janes <jeff.janes@gmail.com> writes:
> I think these files have to be cleaned up by "make maintainer-clean"

> ./src/backend/utils/adt/jsonpath_gram.c
> ./src/backend/utils/adt/jsonpath_scan.c

Good point.  I also see jsonpath_gram.h left behind after maintainer-clean:

$ git status --ignored
On branch master
Your branch is up-to-date with 'origin/master'.
Ignored files:
  (use "git add -f <file>..." to include in what will be committed)

        src/backend/utils/adt/jsonpath_gram.c
        src/backend/utils/adt/jsonpath_scan.c
        src/include/utils/jsonpath_gram.h

Looks like whoever modified src/backend/Makefile's distprep target
didn't bother to read the comment.

            regards, tom lane


Re: jsonpath

From
Tom Lane
Date:
I wrote:
> Good point.  I also see jsonpath_gram.h left behind after maintainer-clean:

Oh, and of potentially more significance: after maintainer-clean and
re-configure, make fails with

In file included from jsonpath_gram.y:24:
../../../../src/include/utils/jsonpath_scanner.h:25:33: error: utils/jsonpath_gram.h: No such file or directory

I first thought this was a problem with insufficient dependencies
allowing parallel make to do things in the wrong order, but the problem
repeats even without any parallelism.  It looks like the dependencies
have been constructed in such a way that if the symlink at
src/include/utils/jsonpath_gram.h exists but the underlying file
does not, nothing will make the underlying file.  This is pretty broken;
aside from this outright failure, it also suggests that nothing will
update that file if it exists but is out of date relative to its
sources.

Please make sure that the make rules associated with these files look
exactly like the previously-debugged rules for existing bison/flex
output files.  There are generally good reasons for every last bit
of weirdness in those.

            regards, tom lane


Re: jsonpath

From
Alexander Korotkov
Date:
сб, 16 мар. 2019 г., 21:12 Tom Lane <tgl@sss.pgh.pa.us>:
I wrote:
> Good point.  I also see jsonpath_gram.h left behind after maintainer-clean:

Oh, and of potentially more significance: after maintainer-clean and
re-configure, make fails with

In file included from jsonpath_gram.y:24:
../../../../src/include/utils/jsonpath_scanner.h:25:33: error: utils/jsonpath_gram.h: No such file or directory

I first thought this was a problem with insufficient dependencies
allowing parallel make to do things in the wrong order, but the problem
repeats even without any parallelism.  It looks like the dependencies
have been constructed in such a way that if the symlink at
src/include/utils/jsonpath_gram.h exists but the underlying file
does not, nothing will make the underlying file.  This is pretty broken;
aside from this outright failure, it also suggests that nothing will
update that file if it exists but is out of date relative to its
sources.

Please make sure that the make rules associated with these files look
exactly like the previously-debugged rules for existing bison/flex
output files.  There are generally good reasons for every last bit
of weirdness in those.

Uh, I didn't check that carefully enough.  Thank you for the explanation.  Will fix.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Re: jsonpath

From
Pavel Stehule
Date:


so 16. 3. 2019 v 10:36 odesílatel Alexander Korotkov <a.korotkov@postgrespro.ru> napsal:
On Thu, Mar 14, 2019 at 12:07 PM Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:
> On Sun, Mar 10, 2019 at 1:51 PM Alexander Korotkov
> <a.korotkov@postgrespro.ru> wrote:
> > On Wed, Mar 6, 2019 at 12:40 AM Nikita Glukhov <n.gluhov@postgrespro.ru> wrote:
> > > Attached 36th version of the patches.
> >
> > Thank yo for the revision!
> >
> > In the attached revision following changes are made:
> >
> > > "unknown" refers here to ordinary three-valued logical Unknown, which is
> > > represented in SQL by NULL.
> > >
> > > JSON path expressions return sequences of SQL/JSON items, which are defined by
> > > SQL/JSON data model.  But JSON path predicates (logical expressions), which are
> > > used in filters, return three-valued logical values: False, True, or Unknown.
> >
> >  * I've added short explanation of this to the documentation.
> >  * Removed no longer present data structures from typedefs.list of the
> > first patch.
> >  * Moved GIN support patch to number 3.  Seems to be well-isolated and
> > not very complex patch.  I propose to consider this to 12 too.  I
> > added high-level comment there, commit message and made some code
> > beautification.
>
> I think patches 1 and 2 are in committable shape (I reached Tomas
> off-list, he doesn't have more notes regarding them).  While patch 3
> requires more review.
>
> I'm going to push 1 and 2 if no objections.

So, pushed.  Many thanks to reviewers and authors!

Remaining part I'm proposing for 12 is attached.  I appreciate review of it.

I tested this patch and I didn't  find any issue - just I tested basic functionality and regress tests.

looks well

Pavel



------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Re: jsonpath

From
Alexander Korotkov
Date:
On Sat, Mar 16, 2019 at 9:39 PM Pavel Stehule <pavel.stehule@gmail.com> wrote:
> so 16. 3. 2019 v 10:36 odesílatel Alexander Korotkov <a.korotkov@postgrespro.ru> napsal:
>>
>> On Thu, Mar 14, 2019 at 12:07 PM Alexander Korotkov
>> <a.korotkov@postgrespro.ru> wrote:
>> > On Sun, Mar 10, 2019 at 1:51 PM Alexander Korotkov
>> > <a.korotkov@postgrespro.ru> wrote:
>> > > On Wed, Mar 6, 2019 at 12:40 AM Nikita Glukhov <n.gluhov@postgrespro.ru> wrote:
>> > > > Attached 36th version of the patches.
>> > >
>> > > Thank yo for the revision!
>> > >
>> > > In the attached revision following changes are made:
>> > >
>> > > > "unknown" refers here to ordinary three-valued logical Unknown, which is
>> > > > represented in SQL by NULL.
>> > > >
>> > > > JSON path expressions return sequences of SQL/JSON items, which are defined by
>> > > > SQL/JSON data model.  But JSON path predicates (logical expressions), which are
>> > > > used in filters, return three-valued logical values: False, True, or Unknown.
>> > >
>> > >  * I've added short explanation of this to the documentation.
>> > >  * Removed no longer present data structures from typedefs.list of the
>> > > first patch.
>> > >  * Moved GIN support patch to number 3.  Seems to be well-isolated and
>> > > not very complex patch.  I propose to consider this to 12 too.  I
>> > > added high-level comment there, commit message and made some code
>> > > beautification.
>> >
>> > I think patches 1 and 2 are in committable shape (I reached Tomas
>> > off-list, he doesn't have more notes regarding them).  While patch 3
>> > requires more review.
>> >
>> > I'm going to push 1 and 2 if no objections.
>>
>> So, pushed.  Many thanks to reviewers and authors!
>>
>> Remaining part I'm proposing for 12 is attached.  I appreciate review of it.
>
>
> I tested this patch and I didn't  find any issue - just I tested basic functionality and regress tests.
>
> looks well

Thank you for your feedback!

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: jsonpath

From
Alexander Korotkov
Date:
On Sat, Mar 16, 2019 at 9:12 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I wrote:
> > Good point.  I also see jsonpath_gram.h left behind after maintainer-clean:
>
> Oh, and of potentially more significance: after maintainer-clean and
> re-configure, make fails with
>
> In file included from jsonpath_gram.y:24:
> ../../../../src/include/utils/jsonpath_scanner.h:25:33: error: utils/jsonpath_gram.h: No such file or directory
>
> I first thought this was a problem with insufficient dependencies
> allowing parallel make to do things in the wrong order, but the problem
> repeats even without any parallelism.  It looks like the dependencies
> have been constructed in such a way that if the symlink at
> src/include/utils/jsonpath_gram.h exists but the underlying file
> does not, nothing will make the underlying file.  This is pretty broken;
> aside from this outright failure, it also suggests that nothing will
> update that file if it exists but is out of date relative to its
> sources.
>
> Please make sure that the make rules associated with these files look
> exactly like the previously-debugged rules for existing bison/flex
> output files.  There are generally good reasons for every last bit
> of weirdness in those.

I've pushed a fix.  I hope I didn't forget anything.

BTW, it appears that windows build scripts also needs some fixup.  I'm
not very familiar with that.  I would be glad if somebody review the
attached patch.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment

Re: jsonpath

From
Andrew Dunstan
Date:
On 3/17/19 4:03 AM, Alexander Korotkov wrote:
> On Sat, Mar 16, 2019 at 9:12 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I wrote:
>>> Good point.  I also see jsonpath_gram.h left behind after maintainer-clean:
>> Oh, and of potentially more significance: after maintainer-clean and
>> re-configure, make fails with
>>
>> In file included from jsonpath_gram.y:24:
>> ../../../../src/include/utils/jsonpath_scanner.h:25:33: error: utils/jsonpath_gram.h: No such file or directory
>>
>> I first thought this was a problem with insufficient dependencies
>> allowing parallel make to do things in the wrong order, but the problem
>> repeats even without any parallelism.  It looks like the dependencies
>> have been constructed in such a way that if the symlink at
>> src/include/utils/jsonpath_gram.h exists but the underlying file
>> does not, nothing will make the underlying file.  This is pretty broken;
>> aside from this outright failure, it also suggests that nothing will
>> update that file if it exists but is out of date relative to its
>> sources.
>>
>> Please make sure that the make rules associated with these files look
>> exactly like the previously-debugged rules for existing bison/flex
>> output files.  There are generally good reasons for every last bit
>> of weirdness in those.
> I've pushed a fix.  I hope I didn't forget anything.
>
> BTW, it appears that windows build scripts also needs some fixup.  I'm
> not very familiar with that.  I would be glad if somebody review the
> attached patch.




Why are we installing the jsonpath_gram.h file? It's not going to be
used by anything else, is it? TBH, I'm not sure I see why we're
installing the scanner.h file either.


cheers


andrew


-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: jsonpath

From
Tom Lane
Date:
Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes:
> Why are we installing the jsonpath_gram.h file? It's not going to be
> used by anything else, is it? TBH, I'm not sure I see why we're
> installing the scanner.h file either.

As near as I can see, jsonpath_gram.h and jsonpath_scanner.h exist only
for communication between jsonpath_gram.y and jsonpath_scan.l.  Maybe
we'd be better off handling that need by compiling the .l file as part
of the .y file, as we used to do with the core lexer and still do with
several others (cf comments for commit 72b1e3a21); then we wouldn't
even have to generate these files much less install them.

A quick look at jsonpath_scan.c shows that it's pretty innocent of
the tricks we've learned over the years for flex/bison portability;
in particular I see that it's #including <stdio.h> before postgres.h,
which is a no-no.  So that whole area needs more review anyway.

            regards, tom lane


Re: jsonpath

From
Alexander Korotkov
Date:
On Sun, Mar 17, 2019 at 6:03 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes:
> > Why are we installing the jsonpath_gram.h file? It's not going to be
> > used by anything else, is it? TBH, I'm not sure I see why we're
> > installing the scanner.h file either.
>
> As near as I can see, jsonpath_gram.h and jsonpath_scanner.h exist only
> for communication between jsonpath_gram.y and jsonpath_scan.l.  Maybe
> we'd be better off handling that need by compiling the .l file as part
> of the .y file, as we used to do with the core lexer and still do with
> several others (cf comments for commit 72b1e3a21); then we wouldn't
> even have to generate these files much less install them.
>
> A quick look at jsonpath_scan.c shows that it's pretty innocent of
> the tricks we've learned over the years for flex/bison portability;
> in particular I see that it's #including <stdio.h> before postgres.h,
> which is a no-no.  So that whole area needs more review anyway.

Yeah, I didn't review this part of patch carefully enough (and it
seems that other reviewers too).  I'm going to write a patch revising
this part in next couple of days.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: jsonpath

From
"Jonathan S. Katz"
Date:
On 3/17/19 3:13 AM, Alexander Korotkov wrote:
> On Sat, Mar 16, 2019 at 9:39 PM Pavel Stehule <pavel.stehule@gmail.com> wrote:
>> so 16. 3. 2019 v 10:36 odesílatel Alexander Korotkov <a.korotkov@postgrespro.ru> napsal:
>>>
>>> On Thu, Mar 14, 2019 at 12:07 PM Alexander Korotkov
>>> <a.korotkov@postgrespro.ru> wrote:
>>>> On Sun, Mar 10, 2019 at 1:51 PM Alexander Korotkov
>>>> <a.korotkov@postgrespro.ru> wrote:
>>>>> On Wed, Mar 6, 2019 at 12:40 AM Nikita Glukhov <n.gluhov@postgrespro.ru> wrote:
>>>>>> Attached 36th version of the patches.
>>>>>
>>>>> Thank yo for the revision!
>>>>>
>>>>> In the attached revision following changes are made:
>>>>>
>>>>>> "unknown" refers here to ordinary three-valued logical Unknown, which is
>>>>>> represented in SQL by NULL.
>>>>>>
>>>>>> JSON path expressions return sequences of SQL/JSON items, which are defined by
>>>>>> SQL/JSON data model.  But JSON path predicates (logical expressions), which are
>>>>>> used in filters, return three-valued logical values: False, True, or Unknown.
>>>>>
>>>>>  * I've added short explanation of this to the documentation.
>>>>>  * Removed no longer present data structures from typedefs.list of the
>>>>> first patch.
>>>>>  * Moved GIN support patch to number 3.  Seems to be well-isolated and
>>>>> not very complex patch.  I propose to consider this to 12 too.  I
>>>>> added high-level comment there, commit message and made some code
>>>>> beautification.
>>>>
>>>> I think patches 1 and 2 are in committable shape (I reached Tomas
>>>> off-list, he doesn't have more notes regarding them).  While patch 3
>>>> requires more review.
>>>>
>>>> I'm going to push 1 and 2 if no objections.
>>>
>>> So, pushed.  Many thanks to reviewers and authors!
>>>
>>> Remaining part I'm proposing for 12 is attached.  I appreciate review of it.
>>
>>
>> I tested this patch and I didn't  find any issue - just I tested basic functionality and regress tests.
>>
>> looks well
>
> Thank you for your feedback!

Like Pavel, I did some basic testing of the patch (on current HEAD
4178d8b91c) trying out various JSON path expressions, and yes, it all
worked. I had a brief scare while testing on 4178d8b91c where initdb was
failing on the bootstrapping step, but after doing a thorough wipe of
build files and my output directory, it seems to be initializing okay.

I also did some testing of the GIN patch upthread, as the quickness of
retrieval of the data using JSON path is of course important as well.
Using a schema roughly like this:

CREATE TABLE news_feed (
    id int GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY,
    data jsonb NOT NULL
);
CREATE INDEX news_feed_data_gin_idx ON news_feed USING gin(data);

I loaded in a data set of roughly 420,000 rows. Each row had all the
same keys but differing values (e.g. "length" and "content" as keys)

I tested a few different JSON path scenarios. Some of the index scans
performed way better than the equivalent sequential scans, for instance:

SELECT count(*)
FROM news_feed
WHERE data @? '$.length ? (@ == 200)';

SELECT *
FROM news_feed
WHERE data @? '$.id ? (@ == "22613cbc-d83e-4a29-8b59-3b9f5cd61825")';

Using the index outperformed the sequential scan (and parallel seq scan)
by ~10-100x based on my config + laptop hardware!

However, when I did something a little more complex, like the below:

SELECT count(*)
FROM news_feed
WHERE data @? '$.length ? (@ < 150)';

SELECT count(*)
FROM news_feed
WHERE data @? '$.content ? (@ like_regex "^Start")';

SELECT id, jsonb_path_query(data, '$.content')
FROM news_feed
WHERE data @? '$.content ? (@ like_regex "risk" flag "i")';

I would find that the index scan performed as well as the sequential
scan. Additionally, on my laptop, the parallel sequential scan would
beat the index scan by ~2.5x in some cases.

Reading up on what the GIN patch does, this all makes sense: it's
optimized for equality, I understand there are challenges to be able to
handle inequality, regex exps, etc. And the cases where it really does
work well, it's _incredibly_ fast.

My suggestion would be adding some additional guidance in the user
documentation around how GIN works with the @@ and @? operators so they
can understand where GIN will work very well with JSON path + their data
and not be surprised when other types of JSON path queries are
performing on par with a sequential scan (or worse than a parallel seq
scan).

Thanks,

Jonathan


Attachment

Re: jsonpath

From
Alexander Korotkov
Date:
Hi!

On Sun, Mar 17, 2019 at 7:46 PM Jonathan S. Katz <jkatz@postgresql.org> wrote:
> Like Pavel, I did some basic testing of the patch (on current HEAD
> 4178d8b91c) trying out various JSON path expressions, and yes, it all
> worked. I had a brief scare while testing on 4178d8b91c where initdb was
> failing on the bootstrapping step, but after doing a thorough wipe of
> build files and my output directory, it seems to be initializing okay.
>
> I also did some testing of the GIN patch upthread, as the quickness of
> retrieval of the data using JSON path is of course important as well.

Thank you very much for testing!

> Using a schema roughly like this:
>
> CREATE TABLE news_feed (
>     id int GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY,
>     data jsonb NOT NULL
> );
> CREATE INDEX news_feed_data_gin_idx ON news_feed USING gin(data);
>
> I loaded in a data set of roughly 420,000 rows. Each row had all the
> same keys but differing values (e.g. "length" and "content" as keys)
>
> I tested a few different JSON path scenarios. Some of the index scans
> performed way better than the equivalent sequential scans, for instance:
>
> SELECT count(*)
> FROM news_feed
> WHERE data @? '$.length ? (@ == 200)';
>
> SELECT *
> FROM news_feed
> WHERE data @? '$.id ? (@ == "22613cbc-d83e-4a29-8b59-3b9f5cd61825")';
>
> Using the index outperformed the sequential scan (and parallel seq scan)
> by ~10-100x based on my config + laptop hardware!

Great!

> However, when I did something a little more complex, like the below:
>
> SELECT count(*)
> FROM news_feed
> WHERE data @? '$.length ? (@ < 150)';
>
> SELECT count(*)
> FROM news_feed
> WHERE data @? '$.content ? (@ like_regex "^Start")';
>
> SELECT id, jsonb_path_query(data, '$.content')
> FROM news_feed
> WHERE data @? '$.content ? (@ like_regex "risk" flag "i")';
>
> I would find that the index scan performed as well as the sequential
> scan. Additionally, on my laptop, the parallel sequential scan would
> beat the index scan by ~2.5x in some cases.

Yeah, this cases are not supported.  Did optimizer automatically
select sequential scan in this case (if not touching enable_*
variables)?  It should, because optimizer understands that GIN scan
will be bad if extract_query method failed to extract anything.

> Reading up on what the GIN patch does, this all makes sense: it's
> optimized for equality, I understand there are challenges to be able to
> handle inequality, regex exps, etc. And the cases where it really does
> work well, it's _incredibly_ fast.

Yes, for more complex cases, we need different opclasses.  For
instance, we can consider porting jsquery opclasses to PG 13.  And it
become even more important to get parametrized opclasses, because we
don't necessary want to index all the json fields in this same way.
That's another challenge for future releases.  But what we have now is
just support for some of jsonpathes for existing opclasses.

> My suggestion would be adding some additional guidance in the user
> documentation around how GIN works with the @@ and @? operators so they
> can understand where GIN will work very well with JSON path + their data
> and not be surprised when other types of JSON path queries are
> performing on par with a sequential scan (or worse than a parallel seq
> scan).

Good point.  Will do.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: jsonpath

From
"Jonathan S. Katz"
Date:
On 3/17/19 12:55 PM, Alexander Korotkov wrote:
>
>> However, when I did something a little more complex, like the below:
>>
>> SELECT count(*)
>> FROM news_feed
>> WHERE data @? '$.length ? (@ < 150)';
>>
>> SELECT count(*)
>> FROM news_feed
>> WHERE data @? '$.content ? (@ like_regex "^Start")';
>>
>> SELECT id, jsonb_path_query(data, '$.content')
>> FROM news_feed
>> WHERE data @? '$.content ? (@ like_regex "risk" flag "i")';
>>
>> I would find that the index scan performed as well as the sequential
>> scan. Additionally, on my laptop, the parallel sequential scan would
>> beat the index scan by ~2.5x in some cases.
>
> Yeah, this cases are not supported.  Did optimizer automatically
> select sequential scan in this case (if not touching enable_*
> variables)?  It should, because optimizer understands that GIN scan
> will be bad if extract_query method failed to extract anything.

It did not - it was doing a bitmap heap scan. I have default costs
setup. Example output from EXPLAIN ANALYZE with the index available:

 Aggregate  (cost=1539.78..1539.79 rows=1 width=8) (actual
time=270.419..270.419 rows=1 loops=1)
   ->  Bitmap Heap Scan on news_feed  (cost=23.24..1538.73 rows=418
width=0) (actual time=84.040..270.407 rows=5 loops=1)
         Recheck Cond: (data @? '$."length"?(@ < 150)'::jsonpath)
         Rows Removed by Index Recheck: 418360
         Heap Blocks: exact=28690
         ->  Bitmap Index Scan on news_feed_data_gin_idx
(cost=0.00..23.14 rows=418 width=0) (actual time=41.788..41.788
rows=418365 loops=1)
               Index Cond: (data @? '$."length"?(@ < 150)'::jsonpath)
 Planning Time: 0.168 ms
 Execution Time: 271.105 ms

And for arguments sake, after I dropped the index (and
max_parallel_workers = 8):

 Finalize Aggregate  (cost=30998.07..30998.08 rows=1 width=8) (actual
time=91.062..91.062 rows=1 loops=1)
   ->  Gather  (cost=30997.65..30998.06 rows=4 width=8) (actual
time=90.892..97.739 rows=5 loops=1)
         Workers Planned: 4
         Workers Launched: 4
         ->  Partial Aggregate  (cost=29997.65..29997.66 rows=1 width=8)
(actual time=76.977..76.977 rows=1 loops=5)
               ->  Parallel Seq Scan on news_feed  (cost=0.00..29997.39
rows=104 width=0) (actual time=39.736..76.964 rows=1 loops=5)
                     Filter: (data @? '$."length"?(@ < 150)'::jsonpath)
                     Rows Removed by Filter: 83672
 Planning Time: 0.127 ms
 Execution Time: 97.801 ms


>> Reading up on what the GIN patch does, this all makes sense: it's
>> optimized for equality, I understand there are challenges to be able to
>> handle inequality, regex exps, etc. And the cases where it really does
>> work well, it's _incredibly_ fast.
>
> Yes, for more complex cases, we need different opclasses.  For
> instance, we can consider porting jsquery opclasses to PG 13.  And it
> become even more important to get parametrized opclasses, because we
> don't necessary want to index all the json fields in this same way.
> That's another challenge for future releases.  But what we have now is
> just support for some of jsonpathes for existing opclasses.

Yeah, that makes sense, and seems to be my recollection from the several
years of presentations I've seen on the topic ;)

>> My suggestion would be adding some additional guidance in the user
>> documentation around how GIN works with the @@ and @? operators so they
>> can understand where GIN will work very well with JSON path + their data
>> and not be surprised when other types of JSON path queries are
>> performing on par with a sequential scan (or worse than a parallel seq
>> scan).
>
> Good point.  Will do.

Thanks!

Jonathan


Attachment

Re: jsonpath

From
Alexander Korotkov
Date:
On Sun, Mar 17, 2019 at 8:00 PM Jonathan S. Katz <jkatz@postgresql.org> wrote:
> On 3/17/19 12:55 PM, Alexander Korotkov wrote:
> >
> >> However, when I did something a little more complex, like the below:
> >>
> >> SELECT count(*)
> >> FROM news_feed
> >> WHERE data @? '$.length ? (@ < 150)';
> >>
> >> SELECT count(*)
> >> FROM news_feed
> >> WHERE data @? '$.content ? (@ like_regex "^Start")';
> >>
> >> SELECT id, jsonb_path_query(data, '$.content')
> >> FROM news_feed
> >> WHERE data @? '$.content ? (@ like_regex "risk" flag "i")';
> >>
> >> I would find that the index scan performed as well as the sequential
> >> scan. Additionally, on my laptop, the parallel sequential scan would
> >> beat the index scan by ~2.5x in some cases.
> >
> > Yeah, this cases are not supported.  Did optimizer automatically
> > select sequential scan in this case (if not touching enable_*
> > variables)?  It should, because optimizer understands that GIN scan
> > will be bad if extract_query method failed to extract anything.
>
> It did not - it was doing a bitmap heap scan. I have default costs
> setup. Example output from EXPLAIN ANALYZE with the index available:
>
>  Aggregate  (cost=1539.78..1539.79 rows=1 width=8) (actual
> time=270.419..270.419 rows=1 loops=1)
>    ->  Bitmap Heap Scan on news_feed  (cost=23.24..1538.73 rows=418
> width=0) (actual time=84.040..270.407 rows=5 loops=1)
>          Recheck Cond: (data @? '$."length"?(@ < 150)'::jsonpath)
>          Rows Removed by Index Recheck: 418360
>          Heap Blocks: exact=28690
>          ->  Bitmap Index Scan on news_feed_data_gin_idx
> (cost=0.00..23.14 rows=418 width=0) (actual time=41.788..41.788
> rows=418365 loops=1)
>                Index Cond: (data @? '$."length"?(@ < 150)'::jsonpath)
>  Planning Time: 0.168 ms
>  Execution Time: 271.105 ms
>
> And for arguments sake, after I dropped the index (and
> max_parallel_workers = 8):
>
>  Finalize Aggregate  (cost=30998.07..30998.08 rows=1 width=8) (actual
> time=91.062..91.062 rows=1 loops=1)
>    ->  Gather  (cost=30997.65..30998.06 rows=4 width=8) (actual
> time=90.892..97.739 rows=5 loops=1)
>          Workers Planned: 4
>          Workers Launched: 4
>          ->  Partial Aggregate  (cost=29997.65..29997.66 rows=1 width=8)
> (actual time=76.977..76.977 rows=1 loops=5)
>                ->  Parallel Seq Scan on news_feed  (cost=0.00..29997.39
> rows=104 width=0) (actual time=39.736..76.964 rows=1 loops=5)
>                      Filter: (data @? '$."length"?(@ < 150)'::jsonpath)
>                      Rows Removed by Filter: 83672
>  Planning Time: 0.127 ms
>  Execution Time: 97.801 ms

Thank you for the explanation.  Is it jsonb_ops or jsonb_path_ops?

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: jsonpath

From
"Jonathan S. Katz"
Date:
On 3/17/19 1:02 PM, Alexander Korotkov wrote:
>
> Thank you for the explanation.  Is it jsonb_ops or jsonb_path_ops?

I just used "USING gin(col)" so jsonb_ops.

Thanks,

Jonathan


Attachment

Re: jsonpath

From
Alexander Korotkov
Date:
On Sun, Mar 17, 2019 at 8:06 PM Jonathan S. Katz <jkatz@postgresql.org> wrote:
> On 3/17/19 1:02 PM, Alexander Korotkov wrote:
> >
> > Thank you for the explanation.  Is it jsonb_ops or jsonb_path_ops?
>
> I just used "USING gin(col)" so jsonb_ops.

I see.  So, jsonb_ops extracts from this query only existence of
.length key.  And I can bet it exists in all (or almost all) the
documents.  Thus, optimizer thinks index might be useful, while it's
useless.  There is not much can be done while we don't have statistics
for jsonb (and access to it from GIN extract_query).  So, for now we
can just refuse from extracting only keys from jsonpath in jsonb_ops.
But I think it would be better to just document this issue.  In future
we should improve that with statistics.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: jsonpath

From
"Jonathan S. Katz"
Date:
On 3/17/19 1:14 PM, Alexander Korotkov wrote:
> On Sun, Mar 17, 2019 at 8:06 PM Jonathan S. Katz <jkatz@postgresql.org> wrote:
>> On 3/17/19 1:02 PM, Alexander Korotkov wrote:
>>>
>>> Thank you for the explanation.  Is it jsonb_ops or jsonb_path_ops?
>>
>> I just used "USING gin(col)" so jsonb_ops.
>
> I see.  So, jsonb_ops extracts from this query only existence of
> .length key.  And I can bet it exists in all (or almost all) the
> documents.  Thus, optimizer thinks index might be useful, while it's
> useless.  There is not much can be done while we don't have statistics
> for jsonb (and access to it from GIN extract_query).  So, for now we
> can just refuse from extracting only keys from jsonpath in jsonb_ops.
> But I think it would be better to just document this issue.  In future
> we should improve that with statistics.

That seems to make sense, especially given how I've typically stored
JSON documents in PostgreSQL. It sounds like this particular problem
would be solved appropriately with JSONB statistics.

Thanks,

Jonathan


Attachment

Re: jsonpath

From
Nikita Glukhov
Date:


On 17.03.2019 21:29, Jonathan S. Katz wrote:
On 3/17/19 1:14 PM, Alexander Korotkov wrote:
On Sun, Mar 17, 2019 at 8:06 PM Jonathan S. Katz <jkatz@postgresql.org> wrote:
On 3/17/19 1:02 PM, Alexander Korotkov wrote:
Thank you for the explanation.  Is it jsonb_ops or jsonb_path_ops?
I just used "USING gin(col)" so jsonb_ops.
I see.  So, jsonb_ops extracts from this query only existence of
.length key.  And I can bet it exists in all (or almost all) the
documents.  Thus, optimizer thinks index might be useful, while it's
useless.  There is not much can be done while we don't have statistics
for jsonb (and access to it from GIN extract_query).  So, for now we
can just refuse from extracting only keys from jsonpath in jsonb_ops.
But I think it would be better to just document this issue.  In future
we should improve that with statistics.
That seems to make sense, especially given how I've typically stored
JSON documents in PostgreSQL. It sounds like this particular problem
would be solved appropriately with JSONB statistics.
GIN jsonb_ops extracts from query
 data @? '$.length ? (@ < 150)'

the same GIN entries as from queries
 data @? '$.length' data ? 'length'


If you don't want to extract entries from unsupported expressions, you can try 
to use another jsonpath operator @@. Queries will also look like a bit simpler:
 data @@ '$.length < 150' data @@ '$.content like_regex "^Start"' data @@ '$.content like_regex "risk" flag "i"'

All this queries emit no GIN entries. But note that
 data @@ '$ ? (@.content == "foo").length < 150'

emits the same entries as
 data @@ '$.content == "foo"'



We already have a POC implementation of jsonb statistics that was written
2 years ago.  I rebased it onto the current master yesterday.  If it is
interesting, you can find it on my GitHub [1].  But note, that there is 
no support for  jsonpath operators yet, only boolean EXISTS ?, ?|, ?&, and
CONTAINS @> operators are supported.  Also there is no docs, and it works
slowly  (a more effective storage method for statistics of individual JSON
paths is needed).

Also there is ability to calculate derived statistics of expressions like
 js -> 'x' -> 0 -> 'y' js #> '{x,0,y}'

using jsonb statistics for columns "js". So the selectivity of expressions
 js -> 'x' -> 0 -> 'y' = '123' js #> '{x,0,y}' >= '123'

also can be estimated (but these expressions can't be used by index on "js").


This topic deserves a separate discussion.  I hope, we will start the 
corresponding thread for PG13 after we find a more effective way of jsonb 
statistics storing.


[1] https://github.com/glukhovn/postgres/tree/jsonb_stats
--
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Re: jsonpath

From
Tom Lane
Date:
Just another minor bitch about this patch: jsonpath_scan.l has introduced
a typedef called "keyword".  This is causing pgindent to produce seriously
ugly results in libpq, and probably in other places where that is used as
a field or variable name.  Please rename that typedef to something less
generic.

            regards, tom lane


Re: jsonpath

From
Alexander Korotkov
Date:
On Mon, Mar 18, 2019 at 10:08 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Just another minor bitch about this patch: jsonpath_scan.l has introduced
> a typedef called "keyword".  This is causing pgindent to produce seriously
> ugly results in libpq, and probably in other places where that is used as
> a field or variable name.  Please rename that typedef to something less
> generic.

Ooops... I propose to rename it to KeyWord, which is already
typedef'ed in formatting.c.  See the attached patch.  Is it OK?

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment

Re: jsonpath

From
Tom Lane
Date:
Alexander Korotkov <a.korotkov@postgrespro.ru> writes:
> On Mon, Mar 18, 2019 at 10:08 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Just another minor bitch about this patch: jsonpath_scan.l has introduced
>> a typedef called "keyword".  This is causing pgindent to produce seriously
>> ugly results in libpq, and probably in other places where that is used as
>> a field or variable name.  Please rename that typedef to something less
>> generic.

> Ooops... I propose to rename it to KeyWord, which is already
> typedef'ed in formatting.c.  See the attached patch.  Is it OK?

I had in mind JsonPathKeyword or something like that.  If you re-use
formatting.c's typedef name, pgindent won't care, but it's possible
you'd be in for unhappiness when trying to look at these structs in
gdb for instance.

            regards, tom lane


Re: jsonpath

From
Pavel Stehule
Date:


po 18. 3. 2019 v 21:23 odesílatel Tom Lane <tgl@sss.pgh.pa.us> napsal:
Alexander Korotkov <a.korotkov@postgrespro.ru> writes:
> On Mon, Mar 18, 2019 at 10:08 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Just another minor bitch about this patch: jsonpath_scan.l has introduced
>> a typedef called "keyword".  This is causing pgindent to produce seriously
>> ugly results in libpq, and probably in other places where that is used as
>> a field or variable name.  Please rename that typedef to something less
>> generic.

> Ooops... I propose to rename it to KeyWord, which is already
> typedef'ed in formatting.c.  See the attached patch.  Is it OK?

I had in mind JsonPathKeyword or something like that.  If you re-use
formatting.c's typedef name, pgindent won't care, but it's possible
you'd be in for unhappiness when trying to look at these structs in
gdb for instance.

JsonPathKeyword is better verbose

Pavel


                        regards, tom lane

Re: jsonpath

From
John Naylor
Date:
On Sun, Feb 24, 2019 at 5:03 PM Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:
> On Wed, Jan 30, 2019 at 5:28 AM Andres Freund <andres@anarazel.de> wrote:
> > Why -CF, and why is -p repeated?
>
> BTW, for our SQL grammar we have
>
> > scan.c: FLEXFLAGS = -CF -p -p
>
> Is it kind of default?

I just saw this in the committed patch. This is not default, it's
chosen for maximum performance at the expense of binary/memory size.
That's fine, but with a little effort you can also make the scanner
non-backtracking for additional performance, as in the attached.

-- 
John Naylor                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

Re: jsonpath

From
Alexander Korotkov
Date:
On Mon, Mar 18, 2019 at 11:23 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Alexander Korotkov <a.korotkov@postgrespro.ru> writes:
> > On Mon, Mar 18, 2019 at 10:08 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> Just another minor bitch about this patch: jsonpath_scan.l has introduced
> >> a typedef called "keyword".  This is causing pgindent to produce seriously
> >> ugly results in libpq, and probably in other places where that is used as
> >> a field or variable name.  Please rename that typedef to something less
> >> generic.
>
> > Ooops... I propose to rename it to KeyWord, which is already
> > typedef'ed in formatting.c.  See the attached patch.  Is it OK?
>
> I had in mind JsonPathKeyword or something like that.  If you re-use
> formatting.c's typedef name, pgindent won't care, but it's possible
> you'd be in for unhappiness when trying to look at these structs in
> gdb for instance.

Good point, thanks!  Pushed.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: jsonpath

From
Alexander Korotkov
Date:
On Tue, Mar 19, 2019 at 12:23 PM John Naylor
<john.naylor@2ndquadrant.com> wrote:
> On Sun, Feb 24, 2019 at 5:03 PM Alexander Korotkov
> <a.korotkov@postgrespro.ru> wrote:
> > On Wed, Jan 30, 2019 at 5:28 AM Andres Freund <andres@anarazel.de> wrote:
> > > Why -CF, and why is -p repeated?
> >
> > BTW, for our SQL grammar we have
> >
> > > scan.c: FLEXFLAGS = -CF -p -p
> >
> > Is it kind of default?
>
> I just saw this in the committed patch. This is not default, it's
> chosen for maximum performance at the expense of binary/memory size.
> That's fine, but with a little effort you can also make the scanner
> non-backtracking for additional performance, as in the attached.

We're working on patchset improving flex scanner.  Already have
similar change among the others.  But thanks a lot for your attention!

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: jsonpath

From
Alexander Korotkov
Date:
On Sun, Mar 17, 2019 at 6:03 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes:
> > Why are we installing the jsonpath_gram.h file? It's not going to be
> > used by anything else, is it? TBH, I'm not sure I see why we're
> > installing the scanner.h file either.
>
> As near as I can see, jsonpath_gram.h and jsonpath_scanner.h exist only
> for communication between jsonpath_gram.y and jsonpath_scan.l.  Maybe
> we'd be better off handling that need by compiling the .l file as part
> of the .y file, as we used to do with the core lexer and still do with
> several others (cf comments for commit 72b1e3a21); then we wouldn't
> even have to generate these files much less install them.
>
> A quick look at jsonpath_scan.c shows that it's pretty innocent of
> the tricks we've learned over the years for flex/bison portability;
> in particular I see that it's #including <stdio.h> before postgres.h,
> which is a no-no.  So that whole area needs more review anyway.

Attached patch is getting rid of jsonpath_gram.h.  Would like to see
results of http://commitfest.cputube.org/ before committing, because
I'm not available to test this of windows machine.  There would be
further patches rearranging jsonpath_gram.y and jsonpath_scan.l.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment

Re: jsonpath

From
Alexander Korotkov
Date:
On Tue, Mar 19, 2019 at 8:10 PM Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:
> On Sun, Mar 17, 2019 at 6:03 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes:
> > > Why are we installing the jsonpath_gram.h file? It's not going to be
> > > used by anything else, is it? TBH, I'm not sure I see why we're
> > > installing the scanner.h file either.
> >
> > As near as I can see, jsonpath_gram.h and jsonpath_scanner.h exist only
> > for communication between jsonpath_gram.y and jsonpath_scan.l.  Maybe
> > we'd be better off handling that need by compiling the .l file as part
> > of the .y file, as we used to do with the core lexer and still do with
> > several others (cf comments for commit 72b1e3a21); then we wouldn't
> > even have to generate these files much less install them.
> >
> > A quick look at jsonpath_scan.c shows that it's pretty innocent of
> > the tricks we've learned over the years for flex/bison portability;
> > in particular I see that it's #including <stdio.h> before postgres.h,
> > which is a no-no.  So that whole area needs more review anyway.
>
> Attached patch is getting rid of jsonpath_gram.h.  Would like to see
> results of http://commitfest.cputube.org/ before committing, because
> I'm not available to test this of windows machine.  There would be
> further patches rearranging jsonpath_gram.y and jsonpath_scan.l.

Attaches patches improving jsonpath parser.  First one introduces
cosmetic changes, while second gets rid of backtracking.  I'm also
planning to add high-level comment for both grammar and lexer.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment

Re: jsonpath

From
Nikita Glukhov
Date:
Hi.

Attached patch enables throwing of errors in jsonb_path_match() in its 
non-silent mode when the jsonpath expression failed to return a singleton
boolean.  Previously, NULL was always returned, and it seemed to be 
inconsistent with the behavior of other functions, in which structural and 
other errors were not suppressed in non-silent mode.


We also think that jsonb_path_match() needs to be renamed, because its 
current name is confusing to many users.  "Match" name is more suitable for
jsonpath-based pattern matching like that (maybe we'll implement it later):
 jsonb_path_match(   '{ "a": 1, "b": 2,    "c": "str" }',   '{ "a": 1, "b": @ > 1, * : @.type == "string" }' )

Below are some possible name variants:
 jsonb_path_predicate() (original name) jsonb_path_pred() jsonb_path_test() jsonb_path_check() jsonb_path_bool()

-- 
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Attachment

Re: jsonpath

From
John Naylor
Date:
On Thu, Mar 21, 2019 at 9:59 PM Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:
> Attaches patches improving jsonpath parser.  First one introduces
> cosmetic changes, while second gets rid of backtracking.  I'm also
> planning to add high-level comment for both grammar and lexer.

The cosmetic changes look good to me. I just noticed a couple things
about the comments.

0001:

+/* Check if current scanstring value constitutes a keyword */

'is a keyword' is better. 'Constitutes' implies parts of a whole.

+ * Resize scanstring for appending of given length.  Reinitilize if required.

s/Reinitilize/Reinitialize/

The first sentence is not entirely clear to me.


0002:

These two rules are not strictly necessary:

<xnq,xq,xvq,xsq>{unicode}+\\ {
/* throw back the \\, and treat as unicode */
yyless(yyleng - 1);
parseUnicode(yytext, yyleng);
}

<xnq,xq,xvq,xsq>{hex_char}+\\ {
/* throw back the \\, and treat as hex */
yyless(yyleng - 1);
parseHexChars(yytext, yyleng);
}

...and only seem to be there because of how these are written:

<xnq,xq,xvq,xsq>{unicode}+ { parseUnicode(yytext, yyleng); }
<xnq,xq,xvq,xsq>{hex_char}+ { parseHexChars(yytext, yyleng); }
<xnq,xq,xvq,xsq>{unicode}*{unicodefail} { yyerror(NULL, "Unicode
sequence is invalid"); }
<xnq,xq,xvq,xsq>{hex_char}*{hex_fail} { yyerror(NULL, "Hex character
sequence is invalid"); }

I don't understand the reasoning here -- is it a micro-optimization?
The following is simpler, allow the rules I mentioned to be removed,
and make check still passes. I would prefer it unless there is a
performance penalty, in which case a comment to describe the
additional complexity would be helpful.

<xnq,xq,xvq,xsq>{unicode} { parseUnicode(yytext, yyleng); }
<xnq,xq,xvq,xsq>{hex_char} { parseHexChars(yytext, yyleng); }
<xnq,xq,xvq,xsq>{unicodefail} { yyerror(NULL, "Unicode sequence is invalid"); }
<xnq,xq,xvq,xsq>{hex_fail} { yyerror(NULL, "Hex character sequence is
invalid"); }

-- 
John Naylor                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: jsonpath

From
Oleg Bartunov
Date:


On Fri, 22 Mar 2019, 03:14 Nikita Glukhov, <n.gluhov@postgrespro.ru> wrote:
Hi.

Attached patch enables throwing of errors in jsonb_path_match() in its 
non-silent mode when the jsonpath expression failed to return a singleton
boolean.  Previously, NULL was always returned, and it seemed to be 
inconsistent with the behavior of other functions, in which structural and 
other errors were not suppressed in non-silent mode.


We also think that jsonb_path_match() needs to be renamed, because its 
current name is confusing to many users.  "Match" name is more suitable for
jsonpath-based pattern matching like that (maybe we'll implement it later):
 jsonb_path_match(   '{ "a": 1, "b": 2,    "c": "str" }',   '{ "a": 1, "b": @ > 1, * : @.type == "string" }' )
Would be very useful for constraints.

Below are some possible name variants:
 jsonb_path_predicate() (original name)
Too long

  jsonb_path_pred() jsonb_path_test() jsonb_path_check()

Check is good to me
 jsonb_path_bool()

-- 
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Re: jsonpath

From
Nikita Glukhov
Date:

On 21.03.2019 16:58, Alexander Korotkov wrote:

On Tue, Mar 19, 2019 at 8:10 PM Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:
Attaches patches improving jsonpath parser.  First one introduces
cosmetic changes, while second gets rid of backtracking.  I'm also
planning to add high-level comment for both grammar and lexer.
Parsing of integers now is wrong: neither JSON specification, nor our json[b]
allow leading zeros in integers and floats starting with a dot.

=# SELECT json '.1';
ERROR:  invalid input syntax for type json
LINE 1: SELECT jsonb '.1';                    ^
DETAIL:  Token "." is invalid.
CONTEXT:  JSON data, line 1: ....

=# SELECT json '00';
ERROR:  invalid input syntax for type json
LINE 1: SELECT jsonb '00';                    ^
DETAIL:  Token "00" is invalid.
CONTEXT:  JSON data, line 1: 00

=# SELECT json '00.1';
ERROR:  invalid input syntax for type json
LINE 1: SELECT jsonb '00.1';                    ^
DETAIL:  Token "00" is invalid.
CONTEXT:  JSON data, line 1: 00...


In JavaScript integers with leading zero are treated as octal numbers,
but leading dot is a allowed:

v8 > 0123
83
v8 > 000.1
Uncaught SyntaxError: Unexpected number
v8> .1
0.1

Attached patch 0003 fixes this issue.

-- 
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment

Re: jsonpath

From
Alexander Korotkov
Date:
On Fri, Mar 22, 2019 at 5:38 AM John Naylor <john.naylor@2ndquadrant.com> wrote:
> On Thu, Mar 21, 2019 at 9:59 PM Alexander Korotkov
> <a.korotkov@postgrespro.ru> wrote:
> > Attaches patches improving jsonpath parser.  First one introduces
> > cosmetic changes, while second gets rid of backtracking.  I'm also
> > planning to add high-level comment for both grammar and lexer.
>
> The cosmetic changes look good to me. I just noticed a couple things
> about the comments.
>
> 0001:
>
> +/* Check if current scanstring value constitutes a keyword */
>
> 'is a keyword' is better. 'Constitutes' implies parts of a whole.
>
> + * Resize scanstring for appending of given length.  Reinitilize if required.
>
> s/Reinitilize/Reinitialize/
>
> The first sentence is not entirely clear to me.

Thank you, fixed.

> 0002:
>
> These two rules are not strictly necessary:
>
> <xnq,xq,xvq,xsq>{unicode}+\\ {
> /* throw back the \\, and treat as unicode */
> yyless(yyleng - 1);
> parseUnicode(yytext, yyleng);
> }
>
> <xnq,xq,xvq,xsq>{hex_char}+\\ {
> /* throw back the \\, and treat as hex */
> yyless(yyleng - 1);
> parseHexChars(yytext, yyleng);
> }
>
> ...and only seem to be there because of how these are written:
>
> <xnq,xq,xvq,xsq>{unicode}+ { parseUnicode(yytext, yyleng); }
> <xnq,xq,xvq,xsq>{hex_char}+ { parseHexChars(yytext, yyleng); }
> <xnq,xq,xvq,xsq>{unicode}*{unicodefail} { yyerror(NULL, "Unicode
> sequence is invalid"); }
> <xnq,xq,xvq,xsq>{hex_char}*{hex_fail} { yyerror(NULL, "Hex character
> sequence is invalid"); }
>
> I don't understand the reasoning here -- is it a micro-optimization?
> The following is simpler, allow the rules I mentioned to be removed,
> and make check still passes. I would prefer it unless there is a
> performance penalty, in which case a comment to describe the
> additional complexity would be helpful.
>
> <xnq,xq,xvq,xsq>{unicode} { parseUnicode(yytext, yyleng); }
> <xnq,xq,xvq,xsq>{hex_char} { parseHexChars(yytext, yyleng); }
> <xnq,xq,xvq,xsq>{unicodefail} { yyerror(NULL, "Unicode sequence is invalid"); }
> <xnq,xq,xvq,xsq>{hex_fail} { yyerror(NULL, "Hex character sequence is
> invalid"); }

These rules are needed for unicode.  Sequential escaped unicode
characters might be connected by hi surrogate value.  See
jsonpath_encoding regression test in attached patch.

Regarding hex, I made it so for the sake of uniformity.  But I changed
my mind and decided that simpler flex rules are more important.  So,
now they are considered one-by-one.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment

Re: jsonpath

From
Andres Freund
Date:
Hi,

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=snapper&dt=2019-03-23%2013%3A01%3A28

2019-03-23 14:28:31.147 CET [18056:45] pg_regress/jsonpath LOG:  statement: select '$.g ? (@.a == 1 || !(@.x >= 123 ||
@.a== 4) && @.b == 7)'::jsonpath;
 
2019-03-23 14:28:31.157 CET [18055:59] pg_regress/jsonb_jsonpath LOG:  statement: select jsonb_path_query('1', 'lax
$.a');
2019-03-23 14:28:31.163 CET [9597:311] LOG:  server process (PID 18056) was terminated by signal 11: Segmentation
fault
2019-03-23 14:28:31.163 CET [9597:312] DETAIL:  Failed process was running: select '$.g ? (@.a == 1 || !(@.x >= 123 ||
@.a== 4) && @.b == 7)'::jsonpath;
 
2019-03-23 14:28:31.164 CET [9597:313] LOG:  terminating any other
active server processes

Something's not quite right... Note this appears to be 32bit sparc.

- Andres


Re: jsonpath

From
Alexander Korotkov
Date:
On Sun, Mar 24, 2019 at 7:45 PM Andres Freund <andres@anarazel.de> wrote:
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=snapper&dt=2019-03-23%2013%3A01%3A28
>
> 2019-03-23 14:28:31.147 CET [18056:45] pg_regress/jsonpath LOG:  statement: select '$.g ? (@.a == 1 || !(@.x >= 123
||@.a == 4) && @.b == 7)'::jsonpath;
 
> 2019-03-23 14:28:31.157 CET [18055:59] pg_regress/jsonb_jsonpath LOG:  statement: select jsonb_path_query('1', 'lax
$.a');
> 2019-03-23 14:28:31.163 CET [9597:311] LOG:  server process (PID 18056) was terminated by signal 11: Segmentation
fault
> 2019-03-23 14:28:31.163 CET [9597:312] DETAIL:  Failed process was running: select '$.g ? (@.a == 1 || !(@.x >= 123
||@.a == 4) && @.b == 7)'::jsonpath;
 
> 2019-03-23 14:28:31.164 CET [9597:313] LOG:  terminating any other
> active server processes
>
> Something's not quite right... Note this appears to be 32bit sparc.

Thank you for pointing.  Will investigate.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: jsonpath

From
Alexander Korotkov
Date:
On Sun, Mar 24, 2019 at 9:09 PM Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:
> On Sun, Mar 24, 2019 at 7:45 PM Andres Freund <andres@anarazel.de> wrote:
> > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=snapper&dt=2019-03-23%2013%3A01%3A28
> >
> > 2019-03-23 14:28:31.147 CET [18056:45] pg_regress/jsonpath LOG:  statement: select '$.g ? (@.a == 1 || !(@.x >= 123
||@.a == 4) && @.b == 7)'::jsonpath;
 
> > 2019-03-23 14:28:31.157 CET [18055:59] pg_regress/jsonb_jsonpath LOG:  statement: select jsonb_path_query('1', 'lax
$.a');
> > 2019-03-23 14:28:31.163 CET [9597:311] LOG:  server process (PID 18056) was terminated by signal 11: Segmentation
fault
> > 2019-03-23 14:28:31.163 CET [9597:312] DETAIL:  Failed process was running: select '$.g ? (@.a == 1 || !(@.x >= 123
||@.a == 4) && @.b == 7)'::jsonpath;
 
> > 2019-03-23 14:28:31.164 CET [9597:313] LOG:  terminating any other
> > active server processes
> >
> > Something's not quite right... Note this appears to be 32bit sparc.
>
> Thank you for pointing.  Will investigate.

Got access to that buildfarm animal thanks to Tom Turelinckx.  Now
running check-world in a loop on the same commit hash with same build
options.  Error wasn't triggered yet.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: jsonpath

From
Alexander Korotkov
Date:
On Tue, Mar 26, 2019 at 6:06 PM Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:
> On Tue, Mar 26, 2019 at 5:32 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > Alexander Korotkov <a.korotkov@postgrespro.ru> writes:
> > > Got access to that buildfarm animal thanks to Tom Turelinckx.  Now
> > > running check-world in a loop on the same commit hash with same build
> > > options.  Error wasn't triggered yet.
> >
> > I notice that snapper is using force_parallel_mode = regress ...
> > have you got that enabled in your manual test?
>
> Nope.  Thank you for pointing!  I've rerun my test loop with this...

Still no reproduction.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: jsonpath

From
Tom Lane
Date:
Alexander Korotkov <a.korotkov@postgrespro.ru> writes:
> Still no reproduction.

Annoying, but it's probably not worth expending more effort on
right now.  I wonder whether that buildfarm animal can be upgraded
to capture core dump stack traces --- if so, then if it happens
again we'd have more info.

            regards, tom lane



Re: jsonpath

From
Alexander Korotkov
Date:
On Wed, Mar 27, 2019 at 4:49 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Alexander Korotkov <a.korotkov@postgrespro.ru> writes:
> > Still no reproduction.
>
> Annoying, but it's probably not worth expending more effort on
> right now.  I wonder whether that buildfarm animal can be upgraded
> to capture core dump stack traces --- if so, then if it happens
> again we'd have more info.

Hopefully, Andrew will manage to get a backtrace [1].

BTW, while searching for this bug, I've collected this backtrace using valgrind.

==00:00:00:14.596 10866== Conditional jump or move depends on
uninitialised value(s)
==00:00:00:14.596 10866==    at 0x579F8C4: ____strtod_l_internal (in
/usr/lib64/libc-2.17.so)
==00:00:00:14.596 10866==    by 0x771561: float8in_internal_opt_error
(float.c:394)
==00:00:00:14.596 10866==    by 0x7718B9: float8in_internal (float.c:515)
==00:00:00:14.596 10866==    by 0x7718B9: float8in (float.c:336)
==00:00:00:14.596 10866==    by 0x842D43: DirectFunctionCall1Coll (fmgr.c:803)
==00:00:00:14.596 10866==    by 0x7C9649: numeric_float8 (numeric.c:3417)
==00:00:00:14.596 10866==    by 0x842D43: DirectFunctionCall1Coll (fmgr.c:803)
==00:00:00:14.596 10866==    by 0x7A1D8D: jsonb_float8 (jsonb.c:2058)
==00:00:00:14.596 10866==    by 0x5F8F54: ExecInterpExpr (execExprInterp.c:649)
==00:00:00:14.596 10866==    by 0x6A2E19: ExecEvalExprSwitchContext
(executor.h:307)
==00:00:00:14.596 10866==    by 0x6A2E19: evaluate_expr (clauses.c:4827)
==00:00:00:14.596 10866==    by 0x6A45FF: evaluate_function (clauses.c:4369)
==00:00:00:14.596 10866==    by 0x6A45FF: simplify_function (clauses.c:3999)
==00:00:00:14.596 10866==    by 0x6A31C1:
eval_const_expressions_mutator (clauses.c:2474)
==00:00:00:14.596 10866==    by 0x644466: expression_tree_mutator
(nodeFuncs.c:3072)
==00:00:00:14.596 10866==
{
   <insert_a_suppression_name_here>
   Memcheck:Cond
   fun:____strtod_l_internal
   fun:float8in_internal_opt_error
   fun:float8in_internal
   fun:float8in
   fun:DirectFunctionCall1Coll
   fun:numeric_float8
   fun:DirectFunctionCall1Coll
   fun:jsonb_float8
   fun:ExecInterpExpr
   fun:ExecEvalExprSwitchContext
   fun:evaluate_expr
   fun:evaluate_function
   fun:simplify_function
   fun:eval_const_expressions_mutator
   fun:expression_tree_mutator
}

Not sure whether it's related to 16d489b0fe.  Will investigate more.

1. https://www.postgresql.org/message-id/91ff584f-75d2-471f-4d9e-9ee8f09cdc1d%402ndQuadrant.com

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: jsonpath

From
Tomas Vondra
Date:
On Wed, Mar 27, 2019 at 05:37:40PM +0300, Alexander Korotkov wrote:
>On Wed, Mar 27, 2019 at 4:49 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Alexander Korotkov <a.korotkov@postgrespro.ru> writes:
>> > Still no reproduction.
>>
>> Annoying, but it's probably not worth expending more effort on
>> right now.  I wonder whether that buildfarm animal can be upgraded
>> to capture core dump stack traces --- if so, then if it happens
>> again we'd have more info.
>
>Hopefully, Andrew will manage to get a backtrace [1].
>
>BTW, while searching for this bug, I've collected this backtrace using valgrind.
>
>==00:00:00:14.596 10866== Conditional jump or move depends on
>uninitialised value(s)
>==00:00:00:14.596 10866==    at 0x579F8C4: ____strtod_l_internal (in
>/usr/lib64/libc-2.17.so)
>==00:00:00:14.596 10866==    by 0x771561: float8in_internal_opt_error
>(float.c:394)
>==00:00:00:14.596 10866==    by 0x7718B9: float8in_internal (float.c:515)
>==00:00:00:14.596 10866==    by 0x7718B9: float8in (float.c:336)
>==00:00:00:14.596 10866==    by 0x842D43: DirectFunctionCall1Coll (fmgr.c:803)
>==00:00:00:14.596 10866==    by 0x7C9649: numeric_float8 (numeric.c:3417)
>==00:00:00:14.596 10866==    by 0x842D43: DirectFunctionCall1Coll (fmgr.c:803)
>==00:00:00:14.596 10866==    by 0x7A1D8D: jsonb_float8 (jsonb.c:2058)
>==00:00:00:14.596 10866==    by 0x5F8F54: ExecInterpExpr (execExprInterp.c:649)
>==00:00:00:14.596 10866==    by 0x6A2E19: ExecEvalExprSwitchContext
>(executor.h:307)
>==00:00:00:14.596 10866==    by 0x6A2E19: evaluate_expr (clauses.c:4827)
>==00:00:00:14.596 10866==    by 0x6A45FF: evaluate_function (clauses.c:4369)
>==00:00:00:14.596 10866==    by 0x6A45FF: simplify_function (clauses.c:3999)
>==00:00:00:14.596 10866==    by 0x6A31C1:
>eval_const_expressions_mutator (clauses.c:2474)
>==00:00:00:14.596 10866==    by 0x644466: expression_tree_mutator
>(nodeFuncs.c:3072)
>==00:00:00:14.596 10866==
>{
>   <insert_a_suppression_name_here>
>   Memcheck:Cond
>   fun:____strtod_l_internal
>   fun:float8in_internal_opt_error
>   fun:float8in_internal
>   fun:float8in
>   fun:DirectFunctionCall1Coll
>   fun:numeric_float8
>   fun:DirectFunctionCall1Coll
>   fun:jsonb_float8
>   fun:ExecInterpExpr
>   fun:ExecEvalExprSwitchContext
>   fun:evaluate_expr
>   fun:evaluate_function
>   fun:simplify_function
>   fun:eval_const_expressions_mutator
>   fun:expression_tree_mutator
>}
>
>Not sure whether it's related to 16d489b0fe.  Will investigate more.
>

This might be another case of false positive due to SSE (which I think is
used by strtod in some cases). But I'd expect strncasecmp in the stack in
that case, so maybe that's not it.


cheers

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: jsonpath

From
Andrew Dunstan
Date:
On 3/27/19 9:48 AM, Tom Lane wrote:
> Alexander Korotkov <a.korotkov@postgrespro.ru> writes:
>> Still no reproduction.
> Annoying, but it's probably not worth expending more effort on
> right now.  I wonder whether that buildfarm animal can be upgraded
> to capture core dump stack traces --- if so, then if it happens
> again we'd have more info.
>
>             



I was able to get this stack trace.


cheers


andrew



-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Attachment

Re: jsonpath

From
Tom Lane
Date:
Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes:
> I was able to get this stack trace.
>
> (gdb) bt
> #0  0x00007ffb9ce6a458 in ntdll!RtlRaiseStatus ()
>    from C:\WINDOWS\SYSTEM32\ntdll.dll
> #1  0x00007ffb9ce7760e in ntdll!memset () from C:\WINDOWS\SYSTEM32\ntdll.dll
> #2  0x00007ffb9ac52e1a in msvcrt!_setjmpex ()
>    from C:\WINDOWS\System32\msvcrt.dll
> #3  0x000000000087431a in pg_re_throw ()
>     at c:/MinGW/msys/1.0/home/pgrunner/bf/root/HEAD/pgsql.build/../pgsql/src/backend/utils/error/elog.c:1720
> #4  0x0000000000874106 in errfinish (dummy=<optimized out>)
>     at c:/MinGW/msys/1.0/home/pgrunner/bf/root/HEAD/pgsql.build/../pgsql/src/backend/utils/error/elog.c:464
> #5  0x00000000007cc938 in jsonpath_yyerror (result=result@entry=0x0,
>     message=message@entry=0xab0868 <__func__.110231+1592> "unrecognized flag of LIKE_REGEX predicate")
>     at /home/pgrunner/bf/root/HEAD/pgsql.build/../pgsql/src/backend/utils/adt/jsonpath_scan.l:305
> #6  0x00000000007cec9d in makeItemLikeRegex (pattern=<optimized out>,
>     pattern=<optimized out>, flags=<optimized out>, expr=0x73c7a80)
>     at /home/pgrunner/bf/root/HEAD/pgsql.build/../pgsql/src/backend/utils/adt/jsonpath_gram.y:512

Hmm.  Reaching the yyerror call is expected given this input, but
seemingly the siglongjmp stack has been trashed?  The best I can
think of is a wild store that either occurs only on this platform
or happens to be harmless elsewhere ... but neither idea is terribly
convincing.

BTW, the expected behavior according to the regression test is

regression=# select '$ ? (@ like_regex "pattern" flag "a"'::jsonpath;
ERROR:  bad jsonpath representation
LINE 1: select '$ ? (@ like_regex "pattern" flag "a"'::jsonpath;
               ^
DETAIL:  unrecognized flag of LIKE_REGEX predicate at or near """

which leaves quite a lot to be desired already.  The "bad whatever"
error wording is a flat out violation of our message style guide,
while the "at or near """" bit is pretty darn unhelpful.

The latter problem occurs because the last flex production was

<xq>\"                          {
                                    yylval->str = scanstring;
                                    BEGIN INITIAL;
                                    return STRING_P;
                                }

so that flex thinks the last token was just the quote mark ending the
string.  This could be improved on by adopting something similar to the
SET_YYLLOC() convention used in scan.l to remember the start of what the
user would think the token is.  Probably it's not worth the work right
now, but details like this are important from a fit-and-finish
perspective, so I'd like to see it improved sometime.

            regards, tom lane



Re: jsonpath

From
Alexander Korotkov
Date:
On Thu, Mar 28, 2019 at 5:55 AM Andrew Dunstan
<andrew.dunstan@2ndquadrant.com> wrote:
> On 3/27/19 9:48 AM, Tom Lane wrote:
> > Alexander Korotkov <a.korotkov@postgrespro.ru> writes:
> >> Still no reproduction.
> > Annoying, but it's probably not worth expending more effort on
> > right now.  I wonder whether that buildfarm animal can be upgraded
> > to capture core dump stack traces --- if so, then if it happens
> > again we'd have more info.
>
> I was able to get this stack trace.

Thank you very much!  It appears to be hard to investigate this even
with backtrace.  You told you can almost reliably reproduce this.
Could you please, find exact commit caused this error using "git
bisect"?  I would very appreciate this.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: jsonpath

From
Andrew Dunstan
Date:
On 3/28/19 1:01 AM, Tom Lane wrote:
> Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes:
>> I was able to get this stack trace.
>>
>> (gdb) bt
>> #0  0x00007ffb9ce6a458 in ntdll!RtlRaiseStatus ()
>>    from C:\WINDOWS\SYSTEM32\ntdll.dll
>> #1  0x00007ffb9ce7760e in ntdll!memset () from C:\WINDOWS\SYSTEM32\ntdll.dll
>> #2  0x00007ffb9ac52e1a in msvcrt!_setjmpex ()
>>    from C:\WINDOWS\System32\msvcrt.dll
>> #3  0x000000000087431a in pg_re_throw ()
>>     at c:/MinGW/msys/1.0/home/pgrunner/bf/root/HEAD/pgsql.build/../pgsql/src/backend/utils/error/elog.c:1720
>> #4  0x0000000000874106 in errfinish (dummy=<optimized out>)
>>     at c:/MinGW/msys/1.0/home/pgrunner/bf/root/HEAD/pgsql.build/../pgsql/src/backend/utils/error/elog.c:464
>> #5  0x00000000007cc938 in jsonpath_yyerror (result=result@entry=0x0, 
>>     message=message@entry=0xab0868 <__func__.110231+1592> "unrecognized flag of LIKE_REGEX predicate")
>>     at /home/pgrunner/bf/root/HEAD/pgsql.build/../pgsql/src/backend/utils/adt/jsonpath_scan.l:305
>> #6  0x00000000007cec9d in makeItemLikeRegex (pattern=<optimized out>, 
>>     pattern=<optimized out>, flags=<optimized out>, expr=0x73c7a80)
>>     at /home/pgrunner/bf/root/HEAD/pgsql.build/../pgsql/src/backend/utils/adt/jsonpath_gram.y:512
> Hmm.  Reaching the yyerror call is expected given this input, but
> seemingly the siglongjmp stack has been trashed?  The best I can
> think of is a wild store that either occurs only on this platform
> or happens to be harmless elsewhere ... but neither idea is terribly
> convincing.



Further data point: if I just call the offending statement alone,
there's no crash. The crash only occurs when I call the whole script.
I'll see if I can triangulate a bit to get a minimal test case.


cheers


andrew



-- 

Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: jsonpath

From
Andrew Dunstan
Date:
On 3/28/19 5:38 AM, Alexander Korotkov wrote:
> On Thu, Mar 28, 2019 at 5:55 AM Andrew Dunstan
> <andrew.dunstan@2ndquadrant.com> wrote:
>> On 3/27/19 9:48 AM, Tom Lane wrote:
>>> Alexander Korotkov <a.korotkov@postgrespro.ru> writes:
>>>> Still no reproduction.
>>> Annoying, but it's probably not worth expending more effort on
>>> right now.  I wonder whether that buildfarm animal can be upgraded
>>> to capture core dump stack traces --- if so, then if it happens
>>> again we'd have more info.
>> I was able to get this stack trace.
> Thank you very much!  It appears to be hard to investigate this even
> with backtrace.  You told you can almost reliably reproduce this.
> Could you please, find exact commit caused this error using "git
> bisect"?  I would very appreciate this.
>


I'll try. It's time consuming given how long builds take.


Here's an interesting data point. If I run the whole jsonpath.sql script
it crashes every time. If I run just the offending statement it crashes
exactly every other time. It looks like in that case something gets
clobbered and then cleared.


cheers


andrew


-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: jsonpath

From
Alexander Korotkov
Date:
On Thu, Mar 28, 2019 at 3:25 PM Andrew Dunstan
<andrew.dunstan@2ndquadrant.com> wrote:
> On 3/28/19 5:38 AM, Alexander Korotkov wrote:
> > On Thu, Mar 28, 2019 at 5:55 AM Andrew Dunstan
> > <andrew.dunstan@2ndquadrant.com> wrote:
> >> On 3/27/19 9:48 AM, Tom Lane wrote:
> >>> Alexander Korotkov <a.korotkov@postgrespro.ru> writes:
> >>>> Still no reproduction.
> >>> Annoying, but it's probably not worth expending more effort on
> >>> right now.  I wonder whether that buildfarm animal can be upgraded
> >>> to capture core dump stack traces --- if so, then if it happens
> >>> again we'd have more info.
> >> I was able to get this stack trace.
> > Thank you very much!  It appears to be hard to investigate this even
> > with backtrace.  You told you can almost reliably reproduce this.
> > Could you please, find exact commit caused this error using "git
> > bisect"?  I would very appreciate this.
>
> I'll try. It's time consuming given how long builds take.

Thank you very much for your efforts!

> Here's an interesting data point. If I run the whole jsonpath.sql script
> it crashes every time. If I run just the offending statement it crashes
> exactly every other time. It looks like in that case something gets
> clobbered and then cleared.

Could you clarify this a bit?  What is exactly every other time?  Do
you mean it doesn't crash first time, but crashes second time?  Do you
run each offending statement in the separate session?  Or do you run
multiple offending statements in the same session?

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: jsonpath

From
Andrew Dunstan
Date:
On 3/28/19 8:49 AM, Alexander Korotkov wrote:
>
>> Here's an interesting data point. If I run the whole jsonpath.sql script
>> it crashes every time. If I run just the offending statement it crashes
>> exactly every other time. It looks like in that case something gets
>> clobbered and then cleared.
> Could you clarify this a bit?  What is exactly every other time?  Do
> you mean it doesn't crash first time, but crashes second time?  Do you
> run each offending statement in the separate session?  Or do you run
> multiple offending statements in the same session?
>

I mean repeated invocations of


    psql -c "select '$ ? (@ like_regex \"pattern\" flag \"a\")'::jsonpath"


These get crash, no crash, crash, no crash ...



cheers


andrew


-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: jsonpath

From
Tom Lane
Date:
Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes:
> I mean repeated invocations of
>     psql -c "select '$ ? (@ like_regex \"pattern\" flag \"a\")'::jsonpath"
> These get crash, no crash, crash, no crash ...

That is just wacko ... but it does seem to support my thought of
a wild store somewhere.  The mechanism for this case might be
that memory layout is different depending on whether we had to
rebuild the relcache init file at session start or not.  Similarly,
the fact that the full test script reliably crashes might be
dependent on previous commands having left things in a certain
state.  Unfortunately that gets us little closer to a fix.

Has anybody gotten through a valgrind run on this code yet?

            regards, tom lane



Re: jsonpath

From
Alexander Korotkov
Date:
On Thu, Mar 28, 2019 at 4:31 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes:
> > I mean repeated invocations of
> >     psql -c "select '$ ? (@ like_regex \"pattern\" flag \"a\")'::jsonpath"
> > These get crash, no crash, crash, no crash ...
>
> That is just wacko ... but it does seem to support my thought of
> a wild store somewhere.  The mechanism for this case might be
> that memory layout is different depending on whether we had to
> rebuild the relcache init file at session start or not.  Similarly,
> the fact that the full test script reliably crashes might be
> dependent on previous commands having left things in a certain
> state.  Unfortunately that gets us little closer to a fix.
>
> Has anybody gotten through a valgrind run on this code yet?

I've [1].  Find single backtrace, but it doesn't seem to be related to
our issue.

1. https://www.postgresql.org/message-id/CAPpHfdsgyPKbaqOsJ4tyC97Ybpm69eGLHzBGLKYXsfJi%3Dc-fjA%40mail.gmail.com

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: jsonpath

From
Andres Freund
Date:

On March 28, 2019 9:31:14 AM EDT, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes:
>> I mean repeated invocations of
>>     psql -c "select '$ ? (@ like_regex \"pattern\" flag
>\"a\")'::jsonpath"
>> These get crash, no crash, crash, no crash ...
>
>That is just wacko ... but it does seem to support my thought of
>a wild store somewhere.  The mechanism for this case might be
>that memory layout is different depending on whether we had to
>rebuild the relcache init file at session start or not.  Similarly,
>the fact that the full test script reliably crashes might be
>dependent on previous commands having left things in a certain
>state.  Unfortunately that gets us little closer to a fix.
>
>Has anybody gotten through a valgrind run on this code yet?

Skink has successfully passed since - but that's x86...
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.



Re: jsonpath

From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> On March 28, 2019 9:31:14 AM EDT, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Has anybody gotten through a valgrind run on this code yet?

> Skink has successfully passed since - but that's x86...

Yeah, there is a depressingly high chance that this is somehow specific
to the bison version, flex version, and/or compiler in use on jacana.

            regards, tom lane



Re: jsonpath

From
Andrew Dunstan
Date:
On 3/28/19 9:50 AM, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
>> On March 28, 2019 9:31:14 AM EDT, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Has anybody gotten through a valgrind run on this code yet?
>> Skink has successfully passed since - but that's x86...
> Yeah, there is a depressingly high chance that this is somehow specific
> to the bison version, flex version, and/or compiler in use on jacana.
>
>             



lousyjack has also passed it (x64).


git bisect on jacana blames commit 550b9d26f.


cheers


andrew


-- 

Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: jsonpath

From
Alexander Korotkov
Date:
On Thu, Mar 28, 2019 at 7:43 PM Andrew Dunstan
<andrew.dunstan@2ndquadrant.com> wrote:
> On 3/28/19 9:50 AM, Tom Lane wrote:
> > Andres Freund <andres@anarazel.de> writes:
> >> On March 28, 2019 9:31:14 AM EDT, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >>> Has anybody gotten through a valgrind run on this code yet?
> >> Skink has successfully passed since - but that's x86...
> > Yeah, there is a depressingly high chance that this is somehow specific
> > to the bison version, flex version, and/or compiler in use on jacana.
>
> lousyjack has also passed it (x64).
>
> git bisect on jacana blames commit 550b9d26f.

Hmm... 550b9d26f just makes jsonpath_gram.y and jsonpath_scan.l
compile at once.  I've re-read this commit and didn't find anything
suspicious.
I've asked Andrew for access to jacana in order to investigate this myself.



------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: jsonpath

From
Alexander Korotkov
Date:
On Fri, Mar 29, 2019 at 4:15 PM Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:
> On Thu, Mar 28, 2019 at 7:43 PM Andrew Dunstan
> <andrew.dunstan@2ndquadrant.com> wrote:
> > On 3/28/19 9:50 AM, Tom Lane wrote:
> > > Andres Freund <andres@anarazel.de> writes:
> > >> On March 28, 2019 9:31:14 AM EDT, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > >>> Has anybody gotten through a valgrind run on this code yet?
> > >> Skink has successfully passed since - but that's x86...
> > > Yeah, there is a depressingly high chance that this is somehow specific
> > > to the bison version, flex version, and/or compiler in use on jacana.
> >
> > lousyjack has also passed it (x64).
> >
> > git bisect on jacana blames commit 550b9d26f.
>
> Hmm... 550b9d26f just makes jsonpath_gram.y and jsonpath_scan.l
> compile at once.  I've re-read this commit and didn't find anything
> suspicious.
> I've asked Andrew for access to jacana in order to investigate this myself.

I'm going to push there 3 attached patches for jsonpath.

1st one is revised patch implementing GIN index support for jsonpath.
Based on feedback from Jonathan Katz, I decided to restrict jsonb_ops
from querying only case.  Now, jsonb_ops also works on if
"accessors_chain = const" statement is found.  Keys are frequently not
selective.  Given we have no statistics of them, purely key GIN
queries are likely confuse optimizer making it select inefficient
plan.  Actually, jsonb_ops may be used for pure key query when ?
operator is used.  But in this case, user explicitly searches for key.
With jsonpath, purely key GIN searches can easily happen unintended.
So, restrict that.

2nd and 3rd patches are from Nikita Glukhov upthread.  2nd restrict
some cases in parsing numerics.  3rd make jsonb_path_match() function
throw error when result is not single boolean and silent mode is off.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment

RE: jsonpath

From
"Tom Turelinckx"
Date:
Alexander Korotkov wrote:

> Hmm... 550b9d26f just makes jsonpath_gram.y and jsonpath_scan.l
> compile at once.  I've re-read this commit and didn't find anything
> suspicious.
> I've asked Andrew for access to jacana in order to investigate this myself.

Stack trace from skate:

[New LWP 6614]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/sparc-linux-gnu/libthread_db.so.1".
Core was generated by `postgres: pgbf regression [local] SELECT                                      '.
Program terminated with signal 11, Segmentation fault.
#0  strlen () at ../sysdeps/sparc/sparc64/strlen.S:34
34              ldx     [%o0], %o5
#0  strlen () at ../sysdeps/sparc/sparc64/strlen.S:34
#1  0x0008a3e4 in printtup (slot=0x834888, self=0x864cc0) at printtup.c:435
#2  0x00259b60 in ExecutePlan (execute_once=<optimized out>, dest=0x864cc0, direction=<optimized out>, numberTuples=0,
    sendTuples=true, operation=CMD_SELECT, use_parallel_mode=<optimized out>, planstate=0x833eb0, estate=0x833d70)
    at execMain.c:1686
#3  standard_ExecutorRun (queryDesc=0x7dcdc0, direction=<optimized out>, count=0, execute_once=<optimized out>)
    at execMain.c:365
#4  0x00259da0 in ExecutorRun (queryDesc=0x808920, queryDesc@entry=0x7dcdc0,
    direction=direction@entry=ForwardScanDirection, count=8801472, execute_once=<optimized out>) at execMain.c:309
#5  0x003e6714 in PortalRunSelect (portal=portal@entry=0x808920, forward=forward@entry=true, count=0,
    count@entry=2147483647, dest=dest@entry=0x864cc0) at pquery.c:929
#6  0x003e7d3c in PortalRun (portal=portal@entry=0x808920, count=count@entry=2147483647,
    isTopLevel=isTopLevel@entry=true, run_once=run_once@entry=true, dest=dest@entry=0x864cc0,
    altdest=altdest@entry=0x864cc0, completionTag=completionTag@entry=0xff86e830 "") at pquery.c:770
#7  0x003e32d0 in exec_simple_query (
    query_string=0x7ba400 "select '$.g ? (@.a == 1 || @.a == 4 && @.b == 7)'::jsonpath;") at postgres.c:1215
#8  0x003e4854 in PostgresMain (argc=<optimized out>, argv=argv@entry=0x7e2370, dbname=0x7e2148 "regression",
    username=<optimized out>) at postgres.c:4247
#9  0x0007ae6c in BackendRun (port=0x7ddd40) at postmaster.c:4399
#10 BackendStartup (port=0x7ddd40) at postmaster.c:4090
#11 ServerLoop () at postmaster.c:1703
#12 0x00353a68 in PostmasterMain (argc=argc@entry=6, argv=argv@entry=0x7b49a8) at postmaster.c:1376
#13 0x0007cc60 in main (argc=6, argv=0x7b49a8) at main.c:228

Does this help?

Best regards,
Tom Turelinckx





RE: jsonpath

From
"Tom Turelinckx"
Date:
Alexander Korotkov wrote:

> Hmm... 550b9d26f just makes jsonpath_gram.y and jsonpath_scan.l
> compile at once.  I've re-read this commit and didn't find anything
> suspicious.
> I've asked Andrew for access to jacana in order to investigate this myself.

Stack trace from skate:

[New LWP 6614]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/sparc-linux-gnu/libthread_db.so.1".
Core was generated by `postgres: pgbf regression [local] SELECT                                      '.
Program terminated with signal 11, Segmentation fault.
#0  strlen () at ../sysdeps/sparc/sparc64/strlen.S:34
34              ldx     [%o0], %o5
#0  strlen () at ../sysdeps/sparc/sparc64/strlen.S:34
#1  0x0008a3e4 in printtup (slot=0x834888, self=0x864cc0) at printtup.c:435
#2  0x00259b60 in ExecutePlan (execute_once=<optimized out>, dest=0x864cc0, direction=<optimized out>, numberTuples=0,
    sendTuples=true, operation=CMD_SELECT, use_parallel_mode=<optimized out>, planstate=0x833eb0, estate=0x833d70)
    at execMain.c:1686
#3  standard_ExecutorRun (queryDesc=0x7dcdc0, direction=<optimized out>, count=0, execute_once=<optimized out>)
    at execMain.c:365
#4  0x00259da0 in ExecutorRun (queryDesc=0x808920, queryDesc@entry=0x7dcdc0,
    direction=direction@entry=ForwardScanDirection, count=8801472, execute_once=<optimized out>) at execMain.c:309
#5  0x003e6714 in PortalRunSelect (portal=portal@entry=0x808920, forward=forward@entry=true, count=0,
    count@entry=2147483647, dest=dest@entry=0x864cc0) at pquery.c:929
#6  0x003e7d3c in PortalRun (portal=portal@entry=0x808920, count=count@entry=2147483647,
    isTopLevel=isTopLevel@entry=true, run_once=run_once@entry=true, dest=dest@entry=0x864cc0,
    altdest=altdest@entry=0x864cc0, completionTag=completionTag@entry=0xff86e830 "") at pquery.c:770
#7  0x003e32d0 in exec_simple_query (
    query_string=0x7ba400 "select '$.g ? (@.a == 1 || @.a == 4 && @.b == 7)'::jsonpath;") at postgres.c:1215
#8  0x003e4854 in PostgresMain (argc=<optimized out>, argv=argv@entry=0x7e2370, dbname=0x7e2148 "regression",
    username=<optimized out>) at postgres.c:4247
#9  0x0007ae6c in BackendRun (port=0x7ddd40) at postmaster.c:4399
#10 BackendStartup (port=0x7ddd40) at postmaster.c:4090
#11 ServerLoop () at postmaster.c:1703
#12 0x00353a68 in PostmasterMain (argc=argc@entry=6, argv=argv@entry=0x7b49a8) at postmaster.c:1376
#13 0x0007cc60 in main (argc=6, argv=0x7b49a8) at main.c:228

Does this help?

Best regards,
Tom Turelinckx





Re: jsonpath

From
Tom Lane
Date:
"Tom Turelinckx" <tom@turelinckx.be> writes:
> Stack trace from skate:

Huh ... so that's nowhere near the jsonpath-syntax-error crash that
we saw before.

I assume this trace is from this run?

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skate&dt=2019-03-31%2006%3A24%3A35

That looks a whole lot like the previous failure on snapper:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=snapper&dt=2019-03-23%2013%3A01%3A28

right down to it having gotten through "make check" only to fail when the
same regression tests are run again during the pg_upgrade test.  I wonder
if there's any real significance to that, or if it's just that the failure
is not very repeatable.

BTW, what is the difference between skate and snapper?  They look to
be running on the same machine.

            regards, tom lane



RE: jsonpath

From
"Tom Turelinckx"
Date:
Tom Lane wrote:

> I assume this trace is from this run?
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skate&dt=2019-03-31%2006%3A24%3A35

Yes. I did get a core file now, but it wasn't picked up by the buildfarm
script, so I extracted the backtrace manually.

> That looks a whole lot like the previous failure on snapper:
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=snapper&dt=2019-03-23%2013%3A01%3A28

That's what I meant.

> Huh ... so that's nowhere near the jsonpath-syntax-error crash that
> we saw before.

Sorry, I wasn't aware there were multiple crashes.

> BTW, what is the difference between skate and snapper?  They look to
> be running on the same machine.

They are. Skate runs with default buildfarm options. Snapper mimics
the options used by the pgdg debian source packages. Both build the
same source (first skate then snapper). This to avoid a repeat of [1].
Snapper also runs more tests (UpgradeXversion, CollateLinuxUTF8).

Best regards,
Tom Turelinckx

1. https://www.postgresql.org/message-id/20160413175827.dmlbtdf7c3mgmnex%40alap3.anarazel.de





RE: jsonpath

From
"Tom Turelinckx"
Date:
Tom Lane wrote:

> I assume this trace is from this run?
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skate&dt=2019-03-31%2006%3A24%3A35

Yes. I did get a core file now, but it wasn't picked up by the buildfarm
script, so I extracted the backtrace manually.

> That looks a whole lot like the previous failure on snapper:
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=snapper&dt=2019-03-23%2013%3A01%3A28

That's what I meant.

> Huh ... so that's nowhere near the jsonpath-syntax-error crash that
> we saw before.

Sorry, I wasn't aware there were multiple crashes.

> BTW, what is the difference between skate and snapper?  They look to
> be running on the same machine.

They are. Skate runs with default buildfarm options. Snapper mimics
the options used by the pgdg debian source packages. Both build the
same source (first skate then snapper). This to avoid a repeat of [1].
Snapper also runs more tests (UpgradeXversion, CollateLinuxUTF8).

Best regards,
Tom Turelinckx

1. https://www.postgresql.org/message-id/20160413175827.dmlbtdf7c3mgmnex%40alap3.anarazel.de





Re: jsonpath

From
Andrew Dunstan
Date:
On 3/31/19 12:21 PM, Tom Turelinckx wrote:
> Tom Lane wrote:
>
>> I assume this trace is from this run?
>> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skate&dt=2019-03-31%2006%3A24%3A35
> Yes. I did get a core file now, but it wasn't picked up by the buildfarm
> script, so I extracted the backtrace manually.
>
>

I have just committed a patch that should result in stack traces being
picked up in pg_upgrade testing. See
<https://github.com/PGBuildFarm/client-code/commit/51889e9dd86dd10f7b9444cb62eebb7f8baa989e>


You should be able to drop the updated file in place, get it from

<https://raw.githubusercontent.com/PGBuildFarm/client-code/51889e9dd86dd10f7b9444cb62eebb7f8baa989e/PGBuild/Modules/TestUpgrade.pm>


There will be a buildfarm release out very soon that includes this.


cheers


andrew


-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: jsonpath

From
Andres Freund
Date:
On 2019-03-30 18:25:12 +0300, Alexander Korotkov wrote:
> I'm going to push there 3 attached patches for jsonpath.

I noticed that
https://commitfest.postgresql.org/22/1472/
https://commitfest.postgresql.org/22/1473/
are still open and marked as needs-review. Is there a reason for that?

- Andres



Re: jsonpath

From
Alexander Korotkov
Date:
On Wed, Apr 3, 2019 at 11:03 PM Andres Freund <andres@anarazel.de> wrote:
>
> On 2019-03-30 18:25:12 +0300, Alexander Korotkov wrote:
> > I'm going to push there 3 attached patches for jsonpath.
>
> I noticed that
> https://commitfest.postgresql.org/22/1472/
> https://commitfest.postgresql.org/22/1473/
> are still open and marked as needs-review. Is there a reason for that?

Nope.  I've moved both to the next CF.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: jsonpath

From
Alexander Korotkov
Date:
On Fri, Mar 29, 2019 at 4:15 PM Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:
>
> On Thu, Mar 28, 2019 at 7:43 PM Andrew Dunstan
> <andrew.dunstan@2ndquadrant.com> wrote:
> > On 3/28/19 9:50 AM, Tom Lane wrote:
> > > Andres Freund <andres@anarazel.de> writes:
> > >> On March 28, 2019 9:31:14 AM EDT, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > >>> Has anybody gotten through a valgrind run on this code yet?
> > >> Skink has successfully passed since - but that's x86...
> > > Yeah, there is a depressingly high chance that this is somehow specific
> > > to the bison version, flex version, and/or compiler in use on jacana.
> >
> > lousyjack has also passed it (x64).
> >
> > git bisect on jacana blames commit 550b9d26f.
>
> Hmm... 550b9d26f just makes jsonpath_gram.y and jsonpath_scan.l
> compile at once.  I've re-read this commit and didn't find anything
> suspicious.
> I've asked Andrew for access to jacana in order to investigate this myself.

Thanks to Andrew I got access to jacana and made some investigation.

At first, I found that existence of separate jsonpath_gram.h doesn't
influence the situation.  If have jsonpath_gram.h generated, test
still fails if compile jsonpath_gram.c and jsonpath_scan.c together.
But if build them separately, error is gone.

Then I did following trick: build jsonpath_gram.c and jsonpath_scan.c
separately, but copy contents of  jsonpath_gram.c to the top of
jsonpath_scan.c.  I also renamed yyparse to yyparse2 in the copy of
jsonpath_gram.c in order to make jsonpath_scan.c use another copy of
this function defined in the separate file.  Then test fails again.
After that, I found if I remove contents of yyparse2 function, then
test passes OK.  See versions of jsonpath_scan.c attached.

Thus, contents of unused function makes test fail or pass.  So far, it
looks like a compiler bug.  Any thoughts?

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment

Re: jsonpath

From
Tom Lane
Date:
Alexander Korotkov <a.korotkov@postgrespro.ru> writes:
> Thus, contents of unused function makes test fail or pass.  So far, it
> looks like a compiler bug.  Any thoughts?

Yeah :-(.  The fact that we've not seen a similar failure on any other
machines points in that direction, too.  Maybe it's some other aspect
of the machine's toolchain, like flex or bison, but that's not that
much different from our standpoint.

There's a lot of stuff I don't especially like about jsonpath_scan,
for instance I think the "init" arguments to resizeString etc are
a pretty error-prone and unmaintainable way to do things.  But
I don't see anything that looks like it'd be a portability hazard
that would explain this.

I still have a nagging feeling that there's a wild store somewhere
in here, but I don't know how to find it based on the limited
evidence we've got.

            regards, tom lane



Re: jsonpath

From
Alexander Korotkov
Date:
On Sun, Apr 7, 2019 at 2:37 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Alexander Korotkov <a.korotkov@postgrespro.ru> writes:
> > Thus, contents of unused function makes test fail or pass.  So far, it
> > looks like a compiler bug.  Any thoughts?
>
> Yeah :-(.  The fact that we've not seen a similar failure on any other
> machines points in that direction, too.  Maybe it's some other aspect
> of the machine's toolchain, like flex or bison, but that's not that
> much different from our standpoint.
>
> There's a lot of stuff I don't especially like about jsonpath_scan,
> for instance I think the "init" arguments to resizeString etc are
> a pretty error-prone and unmaintainable way to do things.  But
> I don't see anything that looks like it'd be a portability hazard
> that would explain this.
>
> I still have a nagging feeling that there's a wild store somewhere
> in here, but I don't know how to find it based on the limited
> evidence we've got.

Yeah, it might be not because compiler error.  It might depend on
memory layout.  So existence of extra function changes memory layout
and, in turn, causes an error.  I will try to disassemble
jsonpath_scan.o and see whether content of yyparse2 influences
assembly of other functions.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: jsonpath

From
Tomas Vondra
Date:
On Sun, Apr 07, 2019 at 03:03:58AM +0300, Alexander Korotkov wrote:
>On Sun, Apr 7, 2019 at 2:37 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Alexander Korotkov <a.korotkov@postgrespro.ru> writes:
>> > Thus, contents of unused function makes test fail or pass.  So far, it
>> > looks like a compiler bug.  Any thoughts?
>>
>> Yeah :-(.  The fact that we've not seen a similar failure on any other
>> machines points in that direction, too.  Maybe it's some other aspect
>> of the machine's toolchain, like flex or bison, but that's not that
>> much different from our standpoint.
>>
>> There's a lot of stuff I don't especially like about jsonpath_scan,
>> for instance I think the "init" arguments to resizeString etc are
>> a pretty error-prone and unmaintainable way to do things.  But
>> I don't see anything that looks like it'd be a portability hazard
>> that would explain this.
>>
>> I still have a nagging feeling that there's a wild store somewhere
>> in here, but I don't know how to find it based on the limited
>> evidence we've got.
>
>Yeah, it might be not because compiler error.  It might depend on
>memory layout.  So existence of extra function changes memory layout
>and, in turn, causes an error.  I will try to disassemble
>jsonpath_scan.o and see whether content of yyparse2 influences
>assembly of other functions.
>

Have you tried other compiler version / different optimization level? Or
running it under valgrind. Not sure how difficult that is on Windows.


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: jsonpath

From
Andrew Dunstan
Date:
On Tue, Apr 9, 2019 at 12:16 PM Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
>

> Have you tried other compiler version / different optimization level? Or
> running it under valgrind. Not sure how difficult that is on Windows.
>
>

It's not possible AFAIK. I will look at different compilers when I am
back local to the machine later today.

It doesn't help that jacana is currently the only buildfarm machine
building master on Mingw.

cheers

andrew


-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: jsonpath

From
Andrew Dunstan
Date:
On 3/28/19 12:43 PM, Andrew Dunstan wrote:
> On 3/28/19 9:50 AM, Tom Lane wrote:
>> Andres Freund <andres@anarazel.de> writes:
>>> On March 28, 2019 9:31:14 AM EDT, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>>> Has anybody gotten through a valgrind run on this code yet?
>>> Skink has successfully passed since - but that's x86...
>> Yeah, there is a depressingly high chance that this is somehow specific
>> to the bison version, flex version, and/or compiler in use on jacana.
>>
>>             
>
>
> lousyjack has also passed it (x64).
>
>
> git bisect on jacana blames commit 550b9d26f.



OK, I have tried this with an earlier compiler (gcc 7.3.0 vs gcc 8.1.0)
and the problem disappears. So I think we can put this down as a
compiler bug. I will downgrade jacana to 7.3.0.


cheers


andrew


-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: jsonpath

From
Tom Lane
Date:
Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes:
> OK, I have tried this with an earlier compiler (gcc 7.3.0 vs gcc 8.1.0)
> and the problem disappears. So I think we can put this down as a
> compiler bug. I will downgrade jacana to 7.3.0.

OK.

There is still the matter of those two failures on skate and snapper,
which don't look the same as jacana's issue but nonetheless seem to
be the jsonpath patch's fault.  However, since neither animal has yet
reproduced those failures, I don't know how to investigate further,
or whether it's even worth considering that to be an open item.

For the record, I wasted a fair amount of time last week trying to
duplicate the skate/snapper failures by setting up a sparc qemu VM
with more or less the same Debian version they're using.  No luck.
Recent versions of Debian Wheezy wouldn't boot; I did manage to get
7.6.0 to install and run, er crawl, but the failure didn't manifest
in quite a few tries.

            regards, tom lane



Re: jsonpath

From
Alexander Korotkov
Date:
On Tue, Apr 9, 2019 at 7:16 PM Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
>
> On Sun, Apr 07, 2019 at 03:03:58AM +0300, Alexander Korotkov wrote:
> >On Sun, Apr 7, 2019 at 2:37 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> Alexander Korotkov <a.korotkov@postgrespro.ru> writes:
> >> > Thus, contents of unused function makes test fail or pass.  So far, it
> >> > looks like a compiler bug.  Any thoughts?
> >>
> >> Yeah :-(.  The fact that we've not seen a similar failure on any other
> >> machines points in that direction, too.  Maybe it's some other aspect
> >> of the machine's toolchain, like flex or bison, but that's not that
> >> much different from our standpoint.
> >>
> >> There's a lot of stuff I don't especially like about jsonpath_scan,
> >> for instance I think the "init" arguments to resizeString etc are
> >> a pretty error-prone and unmaintainable way to do things.  But
> >> I don't see anything that looks like it'd be a portability hazard
> >> that would explain this.
> >>
> >> I still have a nagging feeling that there's a wild store somewhere
> >> in here, but I don't know how to find it based on the limited
> >> evidence we've got.
> >
> >Yeah, it might be not because compiler error.  It might depend on
> >memory layout.  So existence of extra function changes memory layout
> >and, in turn, causes an error.  I will try to disassemble
> >jsonpath_scan.o and see whether content of yyparse2 influences
> >assembly of other functions.
> >
>
> Have you tried other compiler version / different optimization level?

Error goes away with -O0.  Or I just didn't manage to reproduce that.
In my observation error depends on memory layout or something.  So, it
might be that I just didn't manage to reproduce it with -O0 while it
really still persists.  I didn't try other compilers yet.

> Or running it under valgrind. Not sure how difficult that is on Windows.

Valgrind isn't installed there.  I'm not sure how to do that, but I
will probably try.

The interesting thing is that on failure I got following backtrace.

#0  0x00007ff94f86a458 in ntdll!RtlRaiseStatus () from
C:\WINDOWS\SYSTEM32\ntdll.dll
#1  0x00007ff94f87760e in ntdll!memset () from C:\WINDOWS\SYSTEM32\ntdll.dll
#2  0x00007ff94dc42e1a in msvcrt!_setjmpex () from
C:\WINDOWS\System32\msvcrt.dll
#3  0x000000000086a37a in pg_re_throw () at elog.c:1720
#4  0x000000000086a166 in errfinish (dummy=<optimized out>) at elog.c:464
#5  0x00000000007c3d18 in jsonpath_yyerror (result=result@entry=0x0,
    message=message@entry=0xa87d38 <__func__.110220+1512>
"unrecognized flag of LIKE_REGEX predicate") at jsonpath_scan.l:276
#6  0x00000000007c5f3d in makeItemLikeRegex (pattern=<optimized out>,
pattern=<optimized out>, flags=<optimized out>, expr=0x7216760) at
jsonpath_gram.y:500
#7  jsonpath_yyparse (result=<optimized out>, result@entry=0x495e818)
at jsonpath_gram.y:178

So, error happens inside implementation of siglongjmp().  I've checked
that contents of *PG_exception_stack didn't change since previous
successfully thrown error.  Probably this implementation of long jump
saves some part of state outside of sigjmp_buf and that part is
corrupt.  Any ideas?

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: jsonpath

From
Andrew Dunstan
Date:
On 4/14/19 10:43 PM, Alexander Korotkov wrote:
> On Tue, Apr 9, 2019 at 7:16 PM Tomas Vondra
> <tomas.vondra@2ndquadrant.com> wrote:
>> On Sun, Apr 07, 2019 at 03:03:58AM +0300, Alexander Korotkov wrote:
>>> On Sun, Apr 7, 2019 at 2:37 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>>> Alexander Korotkov <a.korotkov@postgrespro.ru> writes:
>>>>> Thus, contents of unused function makes test fail or pass.  So far, it
>>>>> looks like a compiler bug.  Any thoughts?
>>>> Yeah :-(.  The fact that we've not seen a similar failure on any other
>>>> machines points in that direction, too.  Maybe it's some other aspect
>>>> of the machine's toolchain, like flex or bison, but that's not that
>>>> much different from our standpoint.
>>>>
>>>> There's a lot of stuff I don't especially like about jsonpath_scan,
>>>> for instance I think the "init" arguments to resizeString etc are
>>>> a pretty error-prone and unmaintainable way to do things.  But
>>>> I don't see anything that looks like it'd be a portability hazard
>>>> that would explain this.
>>>>
>>>> I still have a nagging feeling that there's a wild store somewhere
>>>> in here, but I don't know how to find it based on the limited
>>>> evidence we've got.
>>> Yeah, it might be not because compiler error.  It might depend on
>>> memory layout.  So existence of extra function changes memory layout
>>> and, in turn, causes an error.  I will try to disassemble
>>> jsonpath_scan.o and see whether content of yyparse2 influences
>>> assembly of other functions.
>>>
>> Have you tried other compiler version / different optimization level?
> Error goes away with -O0.  Or I just didn't manage to reproduce that.
> In my observation error depends on memory layout or something.  So, it
> might be that I just didn't manage to reproduce it with -O0 while it
> really still persists.  I didn't try other compilers yet.
>
>> Or running it under valgrind. Not sure how difficult that is on Windows.
> Valgrind isn't installed there.  I'm not sure how to do that, but I
> will probably try.
>
> The interesting thing is that on failure I got following backtrace.
>
> #0  0x00007ff94f86a458 in ntdll!RtlRaiseStatus () from
> C:\WINDOWS\SYSTEM32\ntdll.dll
> #1  0x00007ff94f87760e in ntdll!memset () from C:\WINDOWS\SYSTEM32\ntdll.dll
> #2  0x00007ff94dc42e1a in msvcrt!_setjmpex () from
> C:\WINDOWS\System32\msvcrt.dll
> #3  0x000000000086a37a in pg_re_throw () at elog.c:1720
> #4  0x000000000086a166 in errfinish (dummy=<optimized out>) at elog.c:464
> #5  0x00000000007c3d18 in jsonpath_yyerror (result=result@entry=0x0,
>     message=message@entry=0xa87d38 <__func__.110220+1512>
> "unrecognized flag of LIKE_REGEX predicate") at jsonpath_scan.l:276
> #6  0x00000000007c5f3d in makeItemLikeRegex (pattern=<optimized out>,
> pattern=<optimized out>, flags=<optimized out>, expr=0x7216760) at
> jsonpath_gram.y:500
> #7  jsonpath_yyparse (result=<optimized out>, result@entry=0x495e818)
> at jsonpath_gram.y:178
>
> So, error happens inside implementation of siglongjmp().  I've checked
> that contents of *PG_exception_stack didn't change since previous
> successfully thrown error.  Probably this implementation of long jump
> saves some part of state outside of sigjmp_buf and that part is
> corrupt.  Any ideas?
>

I have downgraded jacana to gcc 7.3.0, which has resolved the problem.
I'm still a bit worried that we're clobbering the stack somehow though.
I have no idea how to test that.


cheers


andrew


-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: jsonpath

From
John Naylor
Date:
Attached is a patch to fix some minor issues:

-misspelling of an error message
-Commit 550b9d26f80f failed to update the Makefile comment to reflect
how the build changed, and also removed the clean target, which we now
have use for since we later got rid of backtracking in the scanner.

Also, while I have the thought in my head, for v13 we should consider
replacing the keyword binary search with the perfect hash technique
added in c64d0cd5ce2 -- it might give a small performance boost to the
scanner.

-- 
John Naylor                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

Re: jsonpath

From
Tom Lane
Date:
John Naylor <john.naylor@2ndquadrant.com> writes:
> Attached is a patch to fix some minor issues:
> -misspelling of an error message

Yeah, I'd noticed that one too :-(.  I think the whole jsonpath patch
needs a sweep to bring its error messages into line with our style
guidelines, but no harm in starting with the obvious bugs.

> -Commit 550b9d26f80f failed to update the Makefile comment to reflect
> how the build changed, and also removed the clean target, which we now
> have use for since we later got rid of backtracking in the scanner.

Right.  I'm not really sure why we're bothering with anti-backtracking
here, or with using speed-rather-than-code-space lexer optimization
options.  It's hard for me to credit that any practically-useful jsonpath
pattern would be long enough for lexer speed to matter, and even harder to
credit that the speed of the flex code itself would be an important factor
in the overall processing cost of a long jsonpath.  Still, as long as we
have the code it needs to be right.

> Also, while I have the thought in my head, for v13 we should consider
> replacing the keyword binary search with the perfect hash technique
> added in c64d0cd5ce2 -- it might give a small performance boost to the
> scanner.

I doubt it's worth the trouble, per above.

Patch LGTM, pushed.

            regards, tom lane



Re: jsonpath

From
Alexander Korotkov
Date:
On Wed, Apr 17, 2019 at 8:43 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> John Naylor <john.naylor@2ndquadrant.com> writes:
> > Attached is a patch to fix some minor issues:
> > -misspelling of an error message
>
> Yeah, I'd noticed that one too :-(.  I think the whole jsonpath patch
> needs a sweep to bring its error messages into line with our style
> guidelines, but no harm in starting with the obvious bugs.

I'll go trough the jsonpath error messages and post a patch for fixing them.

> > -Commit 550b9d26f80f failed to update the Makefile comment to reflect
> > how the build changed, and also removed the clean target, which we now
> > have use for since we later got rid of backtracking in the scanner.
>
> Right.  I'm not really sure why we're bothering with anti-backtracking
> here, or with using speed-rather-than-code-space lexer optimization
> options.  It's hard for me to credit that any practically-useful jsonpath
> pattern would be long enough for lexer speed to matter, and even harder to
> credit that the speed of the flex code itself would be an important factor
> in the overall processing cost of a long jsonpath.  Still, as long as we
> have the code it needs to be right.

Actually I found that non of in-core lexers are backtracking.  So, I
understood no backtracking as kind of standard and didn't want to
break that :)

Nevertheless, I could imagine use-case involving parsing a lot of
jsonpath'es.  For example we may construct jsonpath based on table
data and check that for just few jsonb's.  For sure, that wouldn't be
a common use-case, but still.

> > Also, while I have the thought in my head, for v13 we should consider
> > replacing the keyword binary search with the perfect hash technique
> > added in c64d0cd5ce2 -- it might give a small performance boost to the
> > scanner.
>
> I doubt it's worth the trouble, per above.
>
> Patch LGTM, pushed.

Thank you for pushing this!

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: jsonpath

From
John Naylor
Date:
On Thu, Apr 18, 2019 at 1:43 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> John Naylor <john.naylor@2ndquadrant.com> writes:
> > Attached is a patch to fix some minor issues:
> > -misspelling of an error message
>
> Yeah, I'd noticed that one too :-(.  I think the whole jsonpath patch
> needs a sweep to bring its error messages into line with our style
> guidelines, but no harm in starting with the obvious bugs.
>
> > -Commit 550b9d26f80f failed to update the Makefile comment to reflect
> > how the build changed, and also removed the clean target, which we now
> > have use for since we later got rid of backtracking in the scanner.
>
> Right.  I'm not really sure why we're bothering with anti-backtracking
> here, or with using speed-rather-than-code-space lexer optimization
> options.  It's hard for me to credit that any practically-useful jsonpath
> pattern would be long enough for lexer speed to matter, and even harder to
> credit that the speed of the flex code itself would be an important factor
> in the overall processing cost of a long jsonpath.  Still, as long as we
> have the code it needs to be right.

I was wondering about that. I measured the current size of
yy_transition to be 36492 on my machine. With the flag -Cfe, which
gives the smallest representation without backtracking, yy_nxt is 6336
(there is no yy_transition). I'd say that's a large enough difference
that we'd want the smaller representation if it makes little
difference in performance.

-- 
John Naylor                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: jsonpath

From
Alexander Korotkov
Date:
On Wed, Apr 17, 2019 at 11:14 PM Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:
> On Wed, Apr 17, 2019 at 8:43 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > John Naylor <john.naylor@2ndquadrant.com> writes:
> > > Attached is a patch to fix some minor issues:
> > > -misspelling of an error message
> >
> > Yeah, I'd noticed that one too :-(.  I think the whole jsonpath patch
> > needs a sweep to bring its error messages into line with our style
> > guidelines, but no harm in starting with the obvious bugs.
>
> I'll go trough the jsonpath error messages and post a patch for fixing them.
>
> > > -Commit 550b9d26f80f failed to update the Makefile comment to reflect
> > > how the build changed, and also removed the clean target, which we now
> > > have use for since we later got rid of backtracking in the scanner.
> >
> > Right.  I'm not really sure why we're bothering with anti-backtracking
> > here, or with using speed-rather-than-code-space lexer optimization
> > options.  It's hard for me to credit that any practically-useful jsonpath
> > pattern would be long enough for lexer speed to matter, and even harder to
> > credit that the speed of the flex code itself would be an important factor
> > in the overall processing cost of a long jsonpath.  Still, as long as we
> > have the code it needs to be right.
>
> Actually I found that non of in-core lexers are backtracking.  So, I
> understood no backtracking as kind of standard and didn't want to
> break that :)
>
> Nevertheless, I could imagine use-case involving parsing a lot of
> jsonpath'es.  For example we may construct jsonpath based on table
> data and check that for just few jsonb's.  For sure, that wouldn't be
> a common use-case, but still.
>
> > > Also, while I have the thought in my head, for v13 we should consider
> > > replacing the keyword binary search with the perfect hash technique
> > > added in c64d0cd5ce2 -- it might give a small performance boost to the
> > > scanner.
> >
> > I doubt it's worth the trouble, per above.
> >
> > Patch LGTM, pushed.
>
> Thank you for pushing this!

I went trough the jsonpath errors and made some corrections.  See the
attached patch.

One thing makes me uneasy.  jsonpath_yyerror() substitutes its
"message" argument to the errdetail().  Out style guide requires
errdetails to be complete sentences, while bison passes
non-capitalized error messages.  Should we bother capitalize first
character of "message"?  I wonder how "portable" this solution would
be for various languages.  cubescan.l reports error in the similar way
to jsonpath and doesn't bother about making errdetail a complete
sentence.  Or should we move bison error to errmsg()?

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment

Re: jsonpath

From
Alexander Korotkov
Date:
On Thu, Apr 18, 2019 at 4:09 AM John Naylor <john.naylor@2ndquadrant.com> wrote:
> On Thu, Apr 18, 2019 at 1:43 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >
> > John Naylor <john.naylor@2ndquadrant.com> writes:
> > > Attached is a patch to fix some minor issues:
> > > -misspelling of an error message
> >
> > Yeah, I'd noticed that one too :-(.  I think the whole jsonpath patch
> > needs a sweep to bring its error messages into line with our style
> > guidelines, but no harm in starting with the obvious bugs.
> >
> > > -Commit 550b9d26f80f failed to update the Makefile comment to reflect
> > > how the build changed, and also removed the clean target, which we now
> > > have use for since we later got rid of backtracking in the scanner.
> >
> > Right.  I'm not really sure why we're bothering with anti-backtracking
> > here, or with using speed-rather-than-code-space lexer optimization
> > options.  It's hard for me to credit that any practically-useful jsonpath
> > pattern would be long enough for lexer speed to matter, and even harder to
> > credit that the speed of the flex code itself would be an important factor
> > in the overall processing cost of a long jsonpath.  Still, as long as we
> > have the code it needs to be right.
>
> I was wondering about that. I measured the current size of
> yy_transition to be 36492 on my machine. With the flag -Cfe, which
> gives the smallest representation without backtracking, yy_nxt is 6336
> (there is no yy_transition). I'd say that's a large enough difference
> that we'd want the smaller representation if it makes little
> difference in performance.

Did I understand correctly that you've tried the same version of
jsonpath_scan.l with different flex flags?  Did you also notice if
changes 1d88a75c made to jsonpath_scan.l have singnificant influence?

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: jsonpath

From
Tom Lane
Date:
Alexander Korotkov <a.korotkov@postgrespro.ru> writes:
>> On Wed, Apr 17, 2019 at 8:43 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Yeah, I'd noticed that one too :-(.  I think the whole jsonpath patch
>>> needs a sweep to bring its error messages into line with our style
>>> guidelines, but no harm in starting with the obvious bugs.

> I went trough the jsonpath errors and made some corrections.  See the
> attached patch.

Please don't do this sort of change:

-            elog(ERROR, "unrecognized jsonpath item type: %d", item->type);
+            ereport(ERROR,
+                    (errcode(ERRCODE_INTERNAL_ERROR),
+                     errmsg("unrecognized jsonpath item type: %d", item->type)));

elog() is the appropriate thing for shouldn't-happen internal errors like
these.  The only thing you've changed here, aside from making the source
code longer, is to expose the error message for translation ... which is
really just wasting translators' time.  Only messages we actually think
users might need to deal with should be exposed for translation.

@@ -623,7 +624,7 @@ executeItemOptUnwrapTarget(JsonPathExecContext *cxt, JsonPathItem *jsp,
                     ereport(ERROR,
                             (errcode(ERRCODE_JSON_MEMBER_NOT_FOUND), \
                              errmsg(ERRMSG_JSON_MEMBER_NOT_FOUND),
-                             errdetail("JSON object does not contain key %s",
+                             errdetail("JSON object does not contain key %s.",
                                        keybuf.data)));
                 }
             }

OK as far as it went, but you should also put double quotes around the %s.
(I also noticed some messages that are using single-quotes around
interpolated strings, which is not the project standard either.)

Other specific things I wanted to see fixed:

* jsonpath_scan.l has some messages like "bad ..." which is not project
style; use "invalid" or "unrecognized".  (There's probably no good
reason not to use the same string "invalid input syntax for type jsonpath"
that is used elsewhere.)

* This in jsonpath_gram.y is quite unhelpful:

                yyerror(NULL, "unrecognized flag of LIKE_REGEX predicate");

since it doesn't tell you what flag character it doesn't like
(and the error positioning info isn't accurate enough to let the
user figure that out).  It really needs to be something more like
"unrecognized flag character \"%c\" in LIKE_REGEX predicate".
That probably means you can't use yyerror for this, but I don't
think yyerror was providing any useful functionality anyway :-(


More generally, I'm not very much on board with this coding technique:

/* Standard error message for SQL/JSON errors */
#define ERRMSG_JSON_ARRAY_NOT_FOUND            "SQL/JSON array not found"

...

                RETURN_ERROR(ereport(ERROR,
                                     (errcode(ERRCODE_JSON_ARRAY_NOT_FOUND),
                                      errmsg(ERRMSG_JSON_ARRAY_NOT_FOUND),
                                      errdetail("Jsonpath wildcard array accessor "

In the first place, I'm not certain that this will result in the error
message being translatable --- do the gettext tools know how to expand
macros?

In the second place, the actual strings are just restatements of their
ERRMSG macro names, which IMO is not conformant to our message style,
but it's too hard to see that from source code like this.  Also this
style is pretty unworkable/unfriendly if the message needs to contain
any %-markers, so I suspect that having a coding style like this may be
discouraging you from providing values in places where it'd be helpful to
do so.  What I actually see happening as a consequence of this approach is
that you're pushing the useful information off to an errdetail, which is
not really helpful and it's not per project style either.  The idea is to
make the primary message as helpful as possible without being long, not
to make it a simple restatement of the SQLSTATE that nobody can understand
without also looking at the errdetail.

In the third place, this makes it hard for people to grep for occurrences
of an error string in our source code.

And in the fourth place, we don't do this elsewhere; it does not help
anybody for jsonpath to invent its own coding conventions that are unlike
the rest of Postgres.

So I think you should drop the ERRMSG_xxx macros, write out these error
messages where they are used, and rethink your use of errmsg vs. errdetail.

Along the same line of not making it unnecessarily hard for people to grep
for error texts, it's best not to split texts across lines like this:

        RETURN_ERROR(ereport(ERROR,
                             (errcode(ERRCODE_INVALID_JSON_SUBSCRIPT),
                              errmsg(ERRMSG_INVALID_JSON_SUBSCRIPT),
                              errdetail("Jsonpath array subscript is not a "
                                        "singleton numeric value."))));

Somebody grepping for "not a singleton" would not get a hit on that, which
could be quite misleading if they do get hits elsewhere.  I think for the
most part people have decided that it's better to have overly long source
lines than to break up error message literals.  It's especially pointless
to break up source lines when the result still doesn't fit in 80 columns.

            regards, tom lane



Re: jsonpath

From
Alexander Korotkov
Date:
Hi!

Thank you for your review!

On Mon, Apr 22, 2019 at 1:39 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Alexander Korotkov <a.korotkov@postgrespro.ru> writes:
>                 RETURN_ERROR(ereport(ERROR,
>                                      (errcode(ERRCODE_JSON_ARRAY_NOT_FOUND),
>                                       errmsg(ERRMSG_JSON_ARRAY_NOT_FOUND),
>                                       errdetail("Jsonpath wildcard array accessor "
>
> In the first place, I'm not certain that this will result in the error
> message being translatable --- do the gettext tools know how to expand
> macros?
>
> In the second place, the actual strings are just restatements of their
> ERRMSG macro names, which IMO is not conformant to our message style,
> but it's too hard to see that from source code like this.  Also this
> style is pretty unworkable/unfriendly if the message needs to contain
> any %-markers, so I suspect that having a coding style like this may be
> discouraging you from providing values in places where it'd be helpful to
> do so.  What I actually see happening as a consequence of this approach is
> that you're pushing the useful information off to an errdetail, which is
> not really helpful and it's not per project style either.  The idea is to
> make the primary message as helpful as possible without being long, not
> to make it a simple restatement of the SQLSTATE that nobody can understand
> without also looking at the errdetail.
>
> In the third place, this makes it hard for people to grep for occurrences
> of an error string in our source code.
>
> And in the fourth place, we don't do this elsewhere; it does not help
> anybody for jsonpath to invent its own coding conventions that are unlike
> the rest of Postgres.

Just to clarify things.  Do you propose to get rid of RETURN_ERROR()
macro by expanding it at every occurrence?  Or do you have other ideas
in the mind?

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: jsonpath

From
John Naylor
Date:
On Sun, Apr 21, 2019 at 2:01 AM Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:
>
> On Thu, Apr 18, 2019 at 4:09 AM John Naylor <john.naylor@2ndquadrant.com> wrote:
> > I was wondering about that. I measured the current size of
> > yy_transition to be 36492 on my machine. With the flag -Cfe, which
> > gives the smallest representation without backtracking, yy_nxt is 6336
> > (there is no yy_transition). I'd say that's a large enough difference
> > that we'd want the smaller representation if it makes little
> > difference in performance.
>
> Did I understand correctly that you've tried the same version of
> jsonpath_scan.l with different flex flags?

Correct.

> Did you also notice if
> changes 1d88a75c made to jsonpath_scan.l have singnificant influence?

Trying the same measurements above with backtracking put back in,
jsonpath_yylex was actually larger by a few hundred bytes, and there
was almost no difference in the transition/nxt tables.


--
John Naylor                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: jsonpath

From
Tom Lane
Date:
Alexander Korotkov <a.korotkov@postgrespro.ru> writes:
> Just to clarify things.  Do you propose to get rid of RETURN_ERROR()
> macro by expanding it at every occurrence?  Or do you have other ideas
> in the mind?

I wasn't really complaining about RETURN_ERROR() --- it was the
macros rather than literal strings for the errmsg() texts that
was bothering me.

Mind you, I'm not really sure about RETURN_ERROR --- it looks
a little weird to have something that appears to be doing something
with the value of ereport(), which hasn't got a value.  But I don't
have a better idea at the moment.  I doubt that writing out the
underlying ereport-or-return business at each spot would be any
more readable.

(Maybe spelling it RETURN_OR_ERROR, or vice versa, would help?
Not sure.)

            regards, tom lane



Re: jsonpath

From
Alvaro Herrera
Date:
On 2019-Apr-22, Alexander Korotkov wrote:

> Hi!
> 
> Thank you for your review!
> 
> On Mon, Apr 22, 2019 at 1:39 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > Alexander Korotkov <a.korotkov@postgrespro.ru> writes:
> >                 RETURN_ERROR(ereport(ERROR,
> >                                      (errcode(ERRCODE_JSON_ARRAY_NOT_FOUND),
> >                                       errmsg(ERRMSG_JSON_ARRAY_NOT_FOUND),
> >                                       errdetail("Jsonpath wildcard array accessor "
> >
> > In the first place, I'm not certain that this will result in the error
> > message being translatable --- do the gettext tools know how to expand
> > macros?
> >
> > In the second place, the actual strings are just restatements of their
> > ERRMSG macro names, which IMO is not conformant to our message style,
> > but it's too hard to see that from source code like this.  Also this
> > style is pretty unworkable/unfriendly if the message needs to contain
> > any %-markers, so I suspect that having a coding style like this may be
> > discouraging you from providing values in places where it'd be helpful to
> > do so.  What I actually see happening as a consequence of this approach is
> > that you're pushing the useful information off to an errdetail, which is
> > not really helpful and it's not per project style either.  The idea is to
> > make the primary message as helpful as possible without being long, not
> > to make it a simple restatement of the SQLSTATE that nobody can understand
> > without also looking at the errdetail.
> >
> > In the third place, this makes it hard for people to grep for occurrences
> > of an error string in our source code.
> >
> > And in the fourth place, we don't do this elsewhere; it does not help
> > anybody for jsonpath to invent its own coding conventions that are unlike
> > the rest of Postgres.
> 
> Just to clarify things.  Do you propose to get rid of RETURN_ERROR()
> macro by expanding it at every occurrence?  Or do you have other ideas
> in the mind?

I think he's not talking about the RETURN_ERROR macro, but about the
ERRMSG_JSON_ARRAY_NOT_FOUND macro.  The PG convention is to repeat the
message literal in every place instead of defining a macro with the
literal.  But at the same time, using the same errmsg() and only vary
the errdetail() is unhelpful, so we want most detail in the errmsg
instead; I think it'd be something like this:

ereport(ERROR,
    (errcode(ERRCODE_JSON_ARRAY_NOT_FOUND),
     errmsg("%s wildcard array accessor not found", "jsonpath")));

note I put the type name "jsonpath" in a separate literal, so that only
the interesting part is seen by translators.

I don't think Tom said anything about the RETURN_ERROR macro.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: jsonpath

From
Alexander Korotkov
Date:
On Mon, Apr 22, 2019 at 1:39 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Alexander Korotkov <a.korotkov@postgrespro.ru> writes:
> >> On Wed, Apr 17, 2019 at 8:43 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >>> Yeah, I'd noticed that one too :-(.  I think the whole jsonpath patch
> >>> needs a sweep to bring its error messages into line with our style
> >>> guidelines, but no harm in starting with the obvious bugs.
>
> > I went trough the jsonpath errors and made some corrections.  See the
> > attached patch.
>
> Please don't do this sort of change:
>
> -            elog(ERROR, "unrecognized jsonpath item type: %d", item->type);
> +            ereport(ERROR,
> +                    (errcode(ERRCODE_INTERNAL_ERROR),
> +                     errmsg("unrecognized jsonpath item type: %d", item->type)));
>
> elog() is the appropriate thing for shouldn't-happen internal errors like
> these.  The only thing you've changed here, aside from making the source
> code longer, is to expose the error message for translation ... which is
> really just wasting translators' time.  Only messages we actually think
> users might need to deal with should be exposed for translation.

Makes sense.  Removed from the patch.

> @@ -623,7 +624,7 @@ executeItemOptUnwrapTarget(JsonPathExecContext *cxt, JsonPathItem *jsp,
>                      ereport(ERROR,
>                              (errcode(ERRCODE_JSON_MEMBER_NOT_FOUND), \
>                               errmsg(ERRMSG_JSON_MEMBER_NOT_FOUND),
> -                             errdetail("JSON object does not contain key %s",
> +                             errdetail("JSON object does not contain key %s.",
>                                         keybuf.data)));
>                  }
>              }
>
> OK as far as it went, but you should also put double quotes around the %s.

This code actually passed key trough escape_json(), which adds double
quotes itself.  However, we don't do such transformation in other
places.  So, patch removes call of ecsape_json() while putting double
quotes to the error message.

> (I also noticed some messages that are using single-quotes around
> interpolated strings, which is not the project standard either.)

Single-quotes are replaced with double-quotes.

> Other specific things I wanted to see fixed:
>
> * jsonpath_scan.l has some messages like "bad ..." which is not project
> style; use "invalid" or "unrecognized".  (There's probably no good
> reason not to use the same string "invalid input syntax for type jsonpath"
> that is used elsewhere.)

Fixed.

> * This in jsonpath_gram.y is quite unhelpful:
>
>                 yyerror(NULL, "unrecognized flag of LIKE_REGEX predicate");
>
> since it doesn't tell you what flag character it doesn't like
> (and the error positioning info isn't accurate enough to let the
> user figure that out).  It really needs to be something more like
> "unrecognized flag character \"%c\" in LIKE_REGEX predicate".
> That probably means you can't use yyerror for this, but I don't
> think yyerror was providing any useful functionality anyway :-(

Fixed.

> More generally, I'm not very much on board with this coding technique:
>
> /* Standard error message for SQL/JSON errors */
> #define ERRMSG_JSON_ARRAY_NOT_FOUND            "SQL/JSON array not found"
>
> ...
>
>                 RETURN_ERROR(ereport(ERROR,
>                                      (errcode(ERRCODE_JSON_ARRAY_NOT_FOUND),
>                                       errmsg(ERRMSG_JSON_ARRAY_NOT_FOUND),
>                                       errdetail("Jsonpath wildcard array accessor "
>
> In the first place, I'm not certain that this will result in the error
> message being translatable --- do the gettext tools know how to expand
> macros?
>
> In the second place, the actual strings are just restatements of their
> ERRMSG macro names, which IMO is not conformant to our message style,
> but it's too hard to see that from source code like this.  Also this
> style is pretty unworkable/unfriendly if the message needs to contain
> any %-markers, so I suspect that having a coding style like this may be
> discouraging you from providing values in places where it'd be helpful to
> do so.  What I actually see happening as a consequence of this approach is
> that you're pushing the useful information off to an errdetail, which is
> not really helpful and it's not per project style either.  The idea is to
> make the primary message as helpful as possible without being long, not
> to make it a simple restatement of the SQLSTATE that nobody can understand
> without also looking at the errdetail.
>
> In the third place, this makes it hard for people to grep for occurrences
> of an error string in our source code.
>
> And in the fourth place, we don't do this elsewhere; it does not help
> anybody for jsonpath to invent its own coding conventions that are unlike
> the rest of Postgres.
>
> So I think you should drop the ERRMSG_xxx macros, write out these error
> messages where they are used, and rethink your use of errmsg vs. errdetail.

OK, ERRMSG_* macros are removed.

> Along the same line of not making it unnecessarily hard for people to grep
> for error texts, it's best not to split texts across lines like this:
>
>         RETURN_ERROR(ereport(ERROR,
>                              (errcode(ERRCODE_INVALID_JSON_SUBSCRIPT),
>                               errmsg(ERRMSG_INVALID_JSON_SUBSCRIPT),
>                               errdetail("Jsonpath array subscript is not a "
>                                         "singleton numeric value."))));
>
> Somebody grepping for "not a singleton" would not get a hit on that, which
> could be quite misleading if they do get hits elsewhere.  I think for the
> most part people have decided that it's better to have overly long source
> lines than to break up error message literals.  It's especially pointless
> to break up source lines when the result still doesn't fit in 80 columns.

OK, now no line breaks in error messages.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment

Re: jsonpath

From
Alexander Korotkov
Date:
On Wed, Apr 24, 2019 at 9:03 PM Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:
> On Mon, Apr 22, 2019 at 1:39 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > Alexander Korotkov <a.korotkov@postgrespro.ru> writes:
> > >> On Wed, Apr 17, 2019 at 8:43 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > >>> Yeah, I'd noticed that one too :-(.  I think the whole jsonpath patch
> > >>> needs a sweep to bring its error messages into line with our style
> > >>> guidelines, but no harm in starting with the obvious bugs.
> >
> > > I went trough the jsonpath errors and made some corrections.  See the
> > > attached patch.
> >
> > Please don't do this sort of change:
> >
> > -            elog(ERROR, "unrecognized jsonpath item type: %d", item->type);
> > +            ereport(ERROR,
> > +                    (errcode(ERRCODE_INTERNAL_ERROR),
> > +                     errmsg("unrecognized jsonpath item type: %d", item->type)));
> >
> > elog() is the appropriate thing for shouldn't-happen internal errors like
> > these.  The only thing you've changed here, aside from making the source
> > code longer, is to expose the error message for translation ... which is
> > really just wasting translators' time.  Only messages we actually think
> > users might need to deal with should be exposed for translation.
>
> Makes sense.  Removed from the patch.
>
> > @@ -623,7 +624,7 @@ executeItemOptUnwrapTarget(JsonPathExecContext *cxt, JsonPathItem *jsp,
> >                      ereport(ERROR,
> >                              (errcode(ERRCODE_JSON_MEMBER_NOT_FOUND), \
> >                               errmsg(ERRMSG_JSON_MEMBER_NOT_FOUND),
> > -                             errdetail("JSON object does not contain key %s",
> > +                             errdetail("JSON object does not contain key %s.",
> >                                         keybuf.data)));
> >                  }
> >              }
> >
> > OK as far as it went, but you should also put double quotes around the %s.
>
> This code actually passed key trough escape_json(), which adds double
> quotes itself.  However, we don't do such transformation in other
> places.  So, patch removes call of ecsape_json() while putting double
> quotes to the error message.
>
> > (I also noticed some messages that are using single-quotes around
> > interpolated strings, which is not the project standard either.)
>
> Single-quotes are replaced with double-quotes.
>
> > Other specific things I wanted to see fixed:
> >
> > * jsonpath_scan.l has some messages like "bad ..." which is not project
> > style; use "invalid" or "unrecognized".  (There's probably no good
> > reason not to use the same string "invalid input syntax for type jsonpath"
> > that is used elsewhere.)
>
> Fixed.
>
> > * This in jsonpath_gram.y is quite unhelpful:
> >
> >                 yyerror(NULL, "unrecognized flag of LIKE_REGEX predicate");
> >
> > since it doesn't tell you what flag character it doesn't like
> > (and the error positioning info isn't accurate enough to let the
> > user figure that out).  It really needs to be something more like
> > "unrecognized flag character \"%c\" in LIKE_REGEX predicate".
> > That probably means you can't use yyerror for this, but I don't
> > think yyerror was providing any useful functionality anyway :-(
>
> Fixed.
>
> > More generally, I'm not very much on board with this coding technique:
> >
> > /* Standard error message for SQL/JSON errors */
> > #define ERRMSG_JSON_ARRAY_NOT_FOUND            "SQL/JSON array not found"
> >
> > ...
> >
> >                 RETURN_ERROR(ereport(ERROR,
> >                                      (errcode(ERRCODE_JSON_ARRAY_NOT_FOUND),
> >                                       errmsg(ERRMSG_JSON_ARRAY_NOT_FOUND),
> >                                       errdetail("Jsonpath wildcard array accessor "
> >
> > In the first place, I'm not certain that this will result in the error
> > message being translatable --- do the gettext tools know how to expand
> > macros?
> >
> > In the second place, the actual strings are just restatements of their
> > ERRMSG macro names, which IMO is not conformant to our message style,
> > but it's too hard to see that from source code like this.  Also this
> > style is pretty unworkable/unfriendly if the message needs to contain
> > any %-markers, so I suspect that having a coding style like this may be
> > discouraging you from providing values in places where it'd be helpful to
> > do so.  What I actually see happening as a consequence of this approach is
> > that you're pushing the useful information off to an errdetail, which is
> > not really helpful and it's not per project style either.  The idea is to
> > make the primary message as helpful as possible without being long, not
> > to make it a simple restatement of the SQLSTATE that nobody can understand
> > without also looking at the errdetail.
> >
> > In the third place, this makes it hard for people to grep for occurrences
> > of an error string in our source code.
> >
> > And in the fourth place, we don't do this elsewhere; it does not help
> > anybody for jsonpath to invent its own coding conventions that are unlike
> > the rest of Postgres.
> >
> > So I think you should drop the ERRMSG_xxx macros, write out these error
> > messages where they are used, and rethink your use of errmsg vs. errdetail.
>
> OK, ERRMSG_* macros are removed.
>
> > Along the same line of not making it unnecessarily hard for people to grep
> > for error texts, it's best not to split texts across lines like this:
> >
> >         RETURN_ERROR(ereport(ERROR,
> >                              (errcode(ERRCODE_INVALID_JSON_SUBSCRIPT),
> >                               errmsg(ERRMSG_INVALID_JSON_SUBSCRIPT),
> >                               errdetail("Jsonpath array subscript is not a "
> >                                         "singleton numeric value."))));
> >
> > Somebody grepping for "not a singleton" would not get a hit on that, which
> > could be quite misleading if they do get hits elsewhere.  I think for the
> > most part people have decided that it's better to have overly long source
> > lines than to break up error message literals.  It's especially pointless
> > to break up source lines when the result still doesn't fit in 80 columns.
>
> OK, now no line breaks in error messages.

I'm going to commit these adjustments if no objections.

My question regarding jsonpath_yyerror() vs. bison errors is still
relevant.  Should we bother making bison-based errdetail() a complete
sentence starting from uppercase character?  If not, should we make
other yyerror() calls look the same?  Or should we rather move bison
error from errdetail() to errmsg()?

I would probably leave bison-based messages "as is", but make messages
in other invocations of yyerror() starting from lowercase characters
for the uniformity.  But I'd like to hear other opinions.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: jsonpath

From
Tom Lane
Date:
Alexander Korotkov <a.korotkov@postgrespro.ru> writes:
> I'm going to commit these adjustments if no objections.

Sorry for not getting to this sooner.  Looking quickly at the v2 patch,
it seems like you didn't entirely take to heart the idea of preferring
a useful primary error message over a boilerplate primary with errdetail.
In particular, in places like

-                 errmsg(ERRMSG_SINGLETON_JSON_ITEM_REQUIRED),
-                 errdetail("expression should return a singleton boolean")));
+                 errmsg("singleton SQL/JSON item required"),
+                 errdetail("Singleton boolean result is expected.")));

why bother with errdetail at all?  It's not conveying any useful increment
of information.  In this example I think

                 errmsg("expression should return a singleton boolean")

is sufficient and well-phrased.  Likewise, a bit further down,

+                             errmsg("SQL/JSON member not found"),
+                             errdetail("JSON object does not contain key \"%s\".",

there is nothing being said here that wouldn't fit perfectly well into
one errmsg.

> My question regarding jsonpath_yyerror() vs. bison errors is still
> relevant.  Should we bother making bison-based errdetail() a complete
> sentence starting from uppercase character?  If not, should we make
> other yyerror() calls look the same?  Or should we rather move bison
> error from errdetail() to errmsg()?

The latter I think.  The core lexer just presents the yyerror message
as primary:

scanner_yyerror(const char *message, core_yyscan_t yyscanner)
{
        ...
        ereport(ERROR,
                (errcode(ERRCODE_SYNTAX_ERROR),
        /* translator: %s is typically the translation of "syntax error" */
                 errmsg("%s at end of input", _(message)),
                 lexer_errposition()));

and in a quick look at what jsonpath is doing, I'm not really seeing
the point of it being different.  You could do something like

        /* translator: %s is typically the translation of "syntax error" */
                 errmsg("%s in jsonpath input", _(message))

to preserve the information that this is about jsonpath, but beyond
that I don't see that splitting off an errdetail is helping much.

Or, perhaps, provide an errdetail giving the full jsonpath input string?
That might or might not be redundant with other context information,
so I'm not sure how useful it'd be.

Anyway, my main criticism is still that this is way too eager to use
an errdetail message when it could perfectly well fit all the info
into the primary error.

            regards, tom lane



Re: jsonpath

From
Alexander Korotkov
Date:
On Thu, Apr 25, 2019 at 10:29 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Alexander Korotkov <a.korotkov@postgrespro.ru> writes:
> > I'm going to commit these adjustments if no objections.
>
> Sorry for not getting to this sooner.  Looking quickly at the v2 patch,
> it seems like you didn't entirely take to heart the idea of preferring
> a useful primary error message over a boilerplate primary with errdetail.
> In particular, in places like
>
> -                 errmsg(ERRMSG_SINGLETON_JSON_ITEM_REQUIRED),
> -                 errdetail("expression should return a singleton boolean")));
> +                 errmsg("singleton SQL/JSON item required"),
> +                 errdetail("Singleton boolean result is expected.")));
>
> why bother with errdetail at all?  It's not conveying any useful increment
> of information.  In this example I think
>
>                  errmsg("expression should return a singleton boolean")
>
> is sufficient and well-phrased.  Likewise, a bit further down,
>
> +                             errmsg("SQL/JSON member not found"),
> +                             errdetail("JSON object does not contain key \"%s\".",
>
> there is nothing being said here that wouldn't fit perfectly well into
> one errmsg.

Makes sense.  Attached revision of patch gets rid of errdetail() where
it seems to be appropriate.

> > My question regarding jsonpath_yyerror() vs. bison errors is still
> > relevant.  Should we bother making bison-based errdetail() a complete
> > sentence starting from uppercase character?  If not, should we make
> > other yyerror() calls look the same?  Or should we rather move bison
> > error from errdetail() to errmsg()?
>
> The latter I think.  The core lexer just presents the yyerror message
> as primary:
>
> scanner_yyerror(const char *message, core_yyscan_t yyscanner)
> {
>         ...
>         ereport(ERROR,
>                 (errcode(ERRCODE_SYNTAX_ERROR),
>         /* translator: %s is typically the translation of "syntax error" */
>                  errmsg("%s at end of input", _(message)),
>                  lexer_errposition()));
>
> and in a quick look at what jsonpath is doing, I'm not really seeing
> the point of it being different.  You could do something like
>
>         /* translator: %s is typically the translation of "syntax error" */
>                  errmsg("%s in jsonpath input", _(message))
>
> to preserve the information that this is about jsonpath, but beyond
> that I don't see that splitting off an errdetail is helping much.

I've moved error message into errmsg().

> Or, perhaps, provide an errdetail giving the full jsonpath input string?
> That might or might not be redundant with other context information,
> so I'm not sure how useful it'd be.

I'm also not sure about this.  Didn't do this for now.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment

Re: jsonpath

From
Tom Lane
Date:
Alexander Korotkov <a.korotkov@postgrespro.ru> writes:
> [ jsonpath-errors-improve-3.patch ]

This is getting better, but IMO it's still a bit too willing to use
a boilerplate primary error message plus errdetail.  I do not think
that is project style nor something to be encouraged.

In particular, you've got a whole lot of cases like this:

+                                      errmsg("JSON object is expected"),
+                                      errdetail("Jsonpath member accessor can only be applied to an object."))));

I think you should just drop the errmsg and use the errdetail as primary
(with no-initcap and no-trailing-period, of course).  I don't see more
than two or three cases in this whole patch where I'd use an errdetail
at all; in almost all of them, the proposed errdetail looks perfectly
suitable to be a primary message.

One other generic gripe is that a lot of these messages use the term
"singleton", which seems a bit too jargon-y to me.  As far as I can
see in a quick look at the backend .po files, we have not up to now
used that term in *any* user-facing error message.  Nor does it appear
anywhere in our user-facing documentation, except for one place that
was itself inserted by the jsonpath patch:

    Arrays of size 1 are interchangeable with a singleton.

I don't think that's either obvious to a non-mathematician, or
even technically correct; maybe it'd be better as

    An array of size 1 is considered equal to its sole element.

Likewise, I think it'd be better to avoid "singleton" in the error
messages.  In some places you could perhaps use "single" instead.
In some you just don't need it at all, eg in
    Jsonpath array subscript is not a singleton numeric value.
you could just drop the word "singleton" and it'd be perfectly
correct, since a numeric is necessarily a single value.

Also, we do often use the term "scalar" to mean a non-composite
value; maybe that would work for this context, in places where
you do really need that meaning.

Sorry to be making you work so hard on this, but I think good
error messages are an important part of having a quality feature.
I do see a lot of improvements already compared to where we started.

            regards, tom lane



Re: jsonpath

From
Alexander Korotkov
Date:
On Mon, Apr 29, 2019 at 6:11 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Alexander Korotkov <a.korotkov@postgrespro.ru> writes:
> > [ jsonpath-errors-improve-3.patch ]
>
> This is getting better, but IMO it's still a bit too willing to use
> a boilerplate primary error message plus errdetail.  I do not think
> that is project style nor something to be encouraged.
>
> In particular, you've got a whole lot of cases like this:
>
> +                                      errmsg("JSON object is expected"),
> +                                      errdetail("Jsonpath member accessor can only be applied to an object."))));
>
> I think you should just drop the errmsg and use the errdetail as primary
> (with no-initcap and no-trailing-period, of course).  I don't see more
> than two or three cases in this whole patch where I'd use an errdetail
> at all; in almost all of them, the proposed errdetail looks perfectly
> suitable to be a primary message.

Ok, now it seems that I understood.  errdetail is removed from vast
majority of cases.

> One other generic gripe is that a lot of these messages use the term
> "singleton", which seems a bit too jargon-y to me.  As far as I can
> see in a quick look at the backend .po files, we have not up to now
> used that term in *any* user-facing error message.  Nor does it appear
> anywhere in our user-facing documentation, except for one place that
> was itself inserted by the jsonpath patch:
>
>     Arrays of size 1 are interchangeable with a singleton.
>
> I don't think that's either obvious to a non-mathematician, or
> even technically correct; maybe it'd be better as
>
>     An array of size 1 is considered equal to its sole element.
>
> Likewise, I think it'd be better to avoid "singleton" in the error
> messages.  In some places you could perhaps use "single" instead.
> In some you just don't need it at all, eg in
>     Jsonpath array subscript is not a singleton numeric value.
> you could just drop the word "singleton" and it'd be perfectly
> correct, since a numeric is necessarily a single value.
>
> Also, we do often use the term "scalar" to mean a non-composite
> value; maybe that would work for this context, in places where
> you do really need that meaning.

Makes sense for me.  "Singleton" word comes from the standard.  But
assuming we almost don't use it in the documentation (and especially
don't define it), it's better to get rid of this word altogether.
Removed from error messages.  Separate patch adjusting docs as you
proposed is also attached.

> Sorry to be making you work so hard on this, but I think good
> error messages are an important part of having a quality feature.
> I do see a lot of improvements already compared to where we started.

It's nothing to be sorry about.  I need to learn this in order to make
my further commits better.  Thank you for your explanations.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment

Re: jsonpath

From
Alexander Korotkov
Date:
On Tue, Apr 30, 2019 at 1:20 AM Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:
> On Mon, Apr 29, 2019 at 6:11 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > Alexander Korotkov <a.korotkov@postgrespro.ru> writes:
> > > [ jsonpath-errors-improve-3.patch ]
> >
> > This is getting better, but IMO it's still a bit too willing to use
> > a boilerplate primary error message plus errdetail.  I do not think
> > that is project style nor something to be encouraged.
> >
> > In particular, you've got a whole lot of cases like this:
> >
> > +                                      errmsg("JSON object is expected"),
> > +                                      errdetail("Jsonpath member accessor can only be applied to an object."))));
> >
> > I think you should just drop the errmsg and use the errdetail as primary
> > (with no-initcap and no-trailing-period, of course).  I don't see more
> > than two or three cases in this whole patch where I'd use an errdetail
> > at all; in almost all of them, the proposed errdetail looks perfectly
> > suitable to be a primary message.
>
> Ok, now it seems that I understood.  errdetail is removed from vast
> majority of cases.
>
> > One other generic gripe is that a lot of these messages use the term
> > "singleton", which seems a bit too jargon-y to me.  As far as I can
> > see in a quick look at the backend .po files, we have not up to now
> > used that term in *any* user-facing error message.  Nor does it appear
> > anywhere in our user-facing documentation, except for one place that
> > was itself inserted by the jsonpath patch:
> >
> >     Arrays of size 1 are interchangeable with a singleton.
> >
> > I don't think that's either obvious to a non-mathematician, or
> > even technically correct; maybe it'd be better as
> >
> >     An array of size 1 is considered equal to its sole element.
> >
> > Likewise, I think it'd be better to avoid "singleton" in the error
> > messages.  In some places you could perhaps use "single" instead.
> > In some you just don't need it at all, eg in
> >     Jsonpath array subscript is not a singleton numeric value.
> > you could just drop the word "singleton" and it'd be perfectly
> > correct, since a numeric is necessarily a single value.
> >
> > Also, we do often use the term "scalar" to mean a non-composite
> > value; maybe that would work for this context, in places where
> > you do really need that meaning.
>
> Makes sense for me.  "Singleton" word comes from the standard.  But
> assuming we almost don't use it in the documentation (and especially
> don't define it), it's better to get rid of this word altogether.
> Removed from error messages.  Separate patch adjusting docs as you
> proposed is also attached.
>
> > Sorry to be making you work so hard on this, but I think good
> > error messages are an important part of having a quality feature.
> > I do see a lot of improvements already compared to where we started.
>
> It's nothing to be sorry about.  I need to learn this in order to make
> my further commits better.  Thank you for your explanations.

Attached patchset contains revised commit messages.  I'm going to
commit this on no objections.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment

Re: jsonpath

From
Tom Lane
Date:
Alexander Korotkov <a.korotkov@postgrespro.ru> writes:
> Attached patchset contains revised commit messages.  I'm going to
> commit this on no objections.

Sorry for slow response --- I was tied up with release preparations.

The -5 patches look pretty good.  A couple of nits:

@@ -774,9 +749,7 @@ executeItemOptUnwrapTarget(JsonPathExecContext *cxt, JsonPathItem *jsp,
             {
                 RETURN_ERROR(ereport(ERROR,
                                      (errcode(ERRCODE_JSON_ARRAY_NOT_FOUND),
-                                      errmsg(ERRMSG_JSON_ARRAY_NOT_FOUND),
-                                      errdetail("jsonpath array accessor can "
-                                                "only be applied to an array"))));
+                                      errdetail("jsonpath array accessor can only be applied to an array"))));
             }
             break;

I think you forgot to s/errdetail/errmsg/ in this one.  Likewise at:

+                                      errdetail("jsonpath wildcard member accessor can only be applied to an
object"))));


Also, I saw two places where you missed removing a trailing period
from an errmsg:

+                              errmsg("left operand of jsonpath operator %s is not a single numeric value.",
+                                     jspOperationName(jsp->type)))));

+                              errmsg("right operand of jsonpath operator %s is not a single numeric value.",
+                                     jspOperationName(jsp->type)))));


I'd suggest making a quick pass for other instances of the same mistakes,
just in case.  I'm good with the wording on everything now.

            regards, tom lane



Re: jsonpath

From
Alexander Korotkov
Date:
On Tue, May 7, 2019 at 5:35 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Alexander Korotkov <a.korotkov@postgrespro.ru> writes:
> > Attached patchset contains revised commit messages.  I'm going to
> > commit this on no objections.
>
> Sorry for slow response --- I was tied up with release preparations.
>
> The -5 patches look pretty good.  A couple of nits:
>
> @@ -774,9 +749,7 @@ executeItemOptUnwrapTarget(JsonPathExecContext *cxt, JsonPathItem *jsp,
>              {
>                  RETURN_ERROR(ereport(ERROR,
>                                       (errcode(ERRCODE_JSON_ARRAY_NOT_FOUND),
> -                                      errmsg(ERRMSG_JSON_ARRAY_NOT_FOUND),
> -                                      errdetail("jsonpath array accessor can "
> -                                                "only be applied to an array"))));
> +                                      errdetail("jsonpath array accessor can only be applied to an array"))));
>              }
>              break;
>
> I think you forgot to s/errdetail/errmsg/ in this one.  Likewise at:
>
> +                                      errdetail("jsonpath wildcard member accessor can only be applied to an
object"))));
>
>
> Also, I saw two places where you missed removing a trailing period
> from an errmsg:
>
> +                              errmsg("left operand of jsonpath operator %s is not a single numeric value.",
> +                                     jspOperationName(jsp->type)))));
>
> +                              errmsg("right operand of jsonpath operator %s is not a single numeric value.",
> +                                     jspOperationName(jsp->type)))));
>
>
> I'd suggest making a quick pass for other instances of the same mistakes,
> just in case.  I'm good with the wording on everything now.

Thank you!  I've catched couple other cases with errdetail() instead
of errmsg().  Pushed.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: jsonpath

From
Alexander Korotkov
Date:
Couple patches improving jsonpath docs are attached.  The first one
documents nesting level filter for .** accessor.  The second adds to
documentation of jsonpath array subsciption usage of expressions and
multiple slice ranges.  I'm going to push both patches if no
objections.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment

Re: jsonpath

From
Alexander Korotkov
Date:
On Fri, May 17, 2019 at 6:50 AM Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:
> Couple patches improving jsonpath docs are attached.  The first one
> documents nesting level filter for .** accessor.  The second adds to
> documentation of jsonpath array subsciption usage of expressions and
> multiple slice ranges.  I'm going to push both patches if no
> objections.

Looking more on documentation I found that I'm not exactly happy on
how jsonpath description is organized.  Part of description including
accessors is given in json datatypes section [1], while part of
description including functions and operators is given in json
functions and operators section.  I think we should give the whole
jsonpath description in the single place.  So, what about moving it to
datatype section leaving functions sections with just SQL-level
functions?

1. https://www.postgresql.org/docs/devel/datatype-json.html#DATATYPE-JSONPATH
2. https://www.postgresql.org/docs/devel/functions-json.html#FUNCTIONS-SQLJSON-PATH

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company