Re: [PATCH] WIP: replace method for jsonpath - Mailing list pgsql-hackers

From Florents Tselai
Subject Re: [PATCH] WIP: replace method for jsonpath
Date
Msg-id CA+v5N4288cOvp0J3za5gRwqmw3PH18wphhve1M-i3orRzUyDtg@mail.gmail.com
Whole thread Raw
In response to Re: [PATCH] WIP: replace method for jsonpath  (Andrew Dunstan <andrew@dunslane.net>)
List pgsql-hackers


On 18 Sep 2024, at 3:47 PM, Andrew Dunstan <andrew@dunslane.net> wrote:


On 2024-09-18 We 4:23 AM, Peter Eisentraut wrote:
On 17.09.24 21:16, David E. Wheeler wrote:
On Sep 17, 2024, at 15:03, Florents Tselai <florents.tselai@gmail.com> wrote:

Fallback scenario: make this an extension, but in a first pass I didn’t find any convenient hooks.
One has to create a whole new scanner, grammar etc.

Yeah, it got me thinking about the RFC-9535 JSONPath "Function Extension" feature[1], which allows users to add functions. Would be cool to have a way to register jsonpath functions somehow, but I would imagine it’d need quite a bit of specification similar to RFC-9535. Wouldn’t surprise me to see something like that appear in a future version of the spec, with an interface something like CREATE OPERATOR.

Why can't we "just" call any suitable pg_proc-registered function from JSON path?  The proposed patch routes the example '$.replace("hello","bye")' internally to the internal implementation of the SQL function replace(..., 'hello', 'bye'). Why can't we do this automatically for any function call in a JSON path expression?




That might work. The thing that bothers me about the original proposal is this: what if we add a new non-spec jsonpath method and then a new version of the spec adds a method with the same name, but not compatible with our method? We'll be in a nasty place. At the very least I think we need to try hard to avoid that. Maybe we should prefix non-spec method names with "pg_", or maybe use an initial capital letter.

If naming is your main reservation, then I take it you’re generally positive.

Having said that, “pg_” is probably too long for a jsonpath expression, 
Most importantly though, “pg_” in my mind is a prefix for things like catalog lookup and system monitoring.
Not a function that the average user would use. 
Thus, I lean towards initial-capital.

The more general case would look like:
A new jsonpath item of the format $.Func(arg1, …, argn) can be applied (recursively or not) to a json object.

As a first iteration/version only pg_proc-registered functions of the format func(text, ...,) -> text are available.
We can focus on oid(arg0) = TEXTOID and rettype = TEXTOID fist.
The first arg0 has to be TEXTOID (the current json string) while subsequent args are provided by the user
in the jsonpath expression.

The functions we want to support will be part of jsonpath grammar
and during execution we'll have enough info to find the appropriate PGFunction to call.

What I'm missing yet is how we could handle vars jsonb,
in case the user doesn't want to just hardcode the actual function arguments.
Then resolving argtypes1...n is a bit more complex:

The signature for jsonb_apply(doc jsonb, func text[, variadic "any" args1_n]); [0]
Here, variadic "any" works beautifully, but that's a brand-new function.

In existing jsonb{path} facilities, vars are jsonb objects which could work as well I think.
Unless I'm missing something.





cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com


pgsql-hackers by date:

Previous
From: Laurenz Albe
Date:
Subject: Re: Wrong results with equality search using trigram index and non-deterministic collation
Next
From: a.imamov@postgrespro.ru
Date:
Subject: Re: Custom connstr in background_psql()