Hello hackers,
This is a follow-up to the work recently merged in bd4f879.
In hindsight, I regret not pushing these methods through for the previous cycle,
as they are the "missing pieces" for users trying to perform data cleaning entirely within the JSONPath engine.
With them, we can significantly reduce the need for users to "drop out" of JSONPath
into standard SQL for basic string-to-string, string-to-array, and array-to-string workflows.

select jsonb_path_query('" A,b,C "', '$.btrim().lower().split(",").join("-").replace("a","x").upper() starts with "X-B"');
 jsonb_path_query
------------------
 true
(1 row)

This patch series adds three new methods to the jsonpath engine:
$.translate(from, to)
A straightforward wrapper around the standard translate() function.
It handles character-by-character mapping and is a natural companion to the recently merged .replace().
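
For reference, here is a quick sketch of the underlying SQL function that the method is described as wrapping. It uses only the existing translate() function, not the proposed jsonpath syntax, and the input strings are made up for illustration:

```sql
-- Existing SQL function, not the proposed jsonpath method.
-- Each character in the second argument is replaced by the character
-- at the same position in the third argument.
SELECT translate('a;b|c', ';|', ',,');   -- a,b,c

-- Characters in "from" with no counterpart in "to" are removed:
-- '1' -> 'a', '4' -> 'x', and '3' is dropped.
SELECT translate('12345', '143', 'ax');  -- a2x5
```
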
$.split(delimiter [, null_string])
A wrapper around string_to_array().
While we already have .split_part(), that only returns a single token.
.split() allows for a full "explosion" of a string into a JSON array.
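
To make the contrast concrete, here is a sketch using the existing SQL functions only (not the proposed jsonpath syntax). Wrapping the result in to_jsonb() is my own choice here, just to show roughly what a JSON array result would look like:

```sql
-- split_part() returns exactly one token, chosen by position.
SELECT split_part('a,b,c', ',', 2);                        -- b

-- string_to_array() returns every token; to_jsonb() is used here
-- only to display the result as a JSON array.
SELECT to_jsonb(string_to_array('a,b,c', ','));            -- ["a", "b", "c"]

-- With the optional third argument, tokens equal to null_string
-- become NULL elements.
SELECT to_jsonb(string_to_array('a,N/A,c', ',', 'N/A'));   -- ["a", null, "c"]
```
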
$.join(delimiter [, null_string])
The inverse of .split(), wrapping array_to_string().
The input must be an array of strings or nulls.
No implicit casting of numbers or booleans is attempted. This is consistent with how
other jsonpath string methods handle type mismatches, and can always
be relaxed in a follow-up if there's appetite for it.
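
For reference, this is how the underlying array_to_string() treats NULL elements, again using the existing SQL function rather than the proposed method. Note that array_to_string() itself accepts arrays of any element type; the strings-or-nulls restriction described above is a property of the proposed .join(), not of the SQL function:

```sql
-- Plain join of text elements.
SELECT array_to_string(ARRAY['a', 'b', 'c'], '-');         -- a-b-c

-- Without null_string, NULL elements are simply omitted.
SELECT array_to_string(ARRAY['a', NULL, 'c'], '-');        -- a-c

-- With null_string, NULL elements are rendered as that string.
SELECT array_to_string(ARRAY['a', NULL, 'c'], '-', '*');   -- a-*-c
```
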
A Note on Lax vs. Strict Semantics:
In this implementation, I have kept .join() behavior consistent between lax and strict modes regarding type mismatches (i.e., both will currently error on non-string elements).
While lax mode traditionally handles auto-unwrapping of sequences, .join() is unique
in that it operates on the array as a collective unit rather than iterating through it to produce multiple results.
I've left the behavior as "strict-equivalent" for now to remain conservative,
but I am open to discussion on whether lax should instead skip non-string elements or attempt to "auto-wrap" scalars into single-element arrays.
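
For context, this is how an existing scalar item method behaves on an array in the two modes today. I use .double() purely as an illustration of auto-unwrapping; it says nothing about what .join() should do:

```sql
-- lax (the default): the array is unwrapped and .double() is applied
-- to each element, so the query returns one row per element.
SELECT jsonb_path_query('["1.5", "2"]', 'lax $.double()');

-- strict: no unwrapping happens, so .double() sees the array itself
-- and the query raises an error instead of returning rows.
SELECT jsonb_path_query('["1.5", "2"]', 'strict $.double()');
```

The proposed .join() cannot follow the lax pattern as-is, because unwrapping the array would leave it nothing to join.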
The .split() and .join() methods introduce a shift in how we handle item methods.
Historically, most string methods in our engine are scalar-to-scalar,
with keyvalue() being the only exception so far.