Re: Extract numeric filed in JSONB more effectively - Mailing list pgsql-hackers
| From | Haibo Yan |
|---|---|
| Subject | Re: Extract numeric filed in JSONB more effectively |
| Date | |
| Msg-id | CABXr29ETPyi1Tt+X29oUa6Q9AmT3j7RZSdsp12H_6EzGw7RqnQ@mail.gmail.com Whole thread |
| In response to | Re: Extract numeric filed in JSONB more effectively (Haibo Yan <tristan.yim@gmail.com>) |
| Responses |
Re: Extract numeric filed in JSONB more effectively
|
| List | pgsql-hackers |
On Tue, Apr 7, 2026 at 5:00 PM Andy Fan <zhihuifan1213@163.com> wrote:Haibo Yan <tristan.yim@gmail.com> writes:Hi Andy,
Hi Haibo,
> I agree that if this approach is extended to the full matrix naively,
> duplication will become a real issue.
Could you summary how it would be? I think it would be helpful for
others to review. Otherwise every reviewer needs to count them many
times.
--
Best Regards
Andy Fan
Sure.My current thought is to extend it in stages, rather than trying to solve the full matrix in a single patch.
A rough plan would be:
1. Keep the current stage-1 patch small and validate the basic approach first
2. Extend target types before extending extractor families
jsonb_object_field / -> / equivalent subscripting form
casts to numeric and bool
support-function rewrite directly to explicit typed extractor functions
3. Then extend to other extractor families with the same overall pattern
add int4 / int8 / float8 for the same object-field family first
keep the SQL-visible rewrite targets explicit, e.g.
jsonb_object_field_int4
jsonb_object_field_int8
jsonb_object_field_float8
avoid the previous numeric-intermediate rewrite shape
likely starting with jsonb_array_element and jsonb_extract_path
and possibly jsonb_path_query_first later
each family would still rewrite to explicit typed extractor entry points, e.g.
jsonb_array_element_numeric
jsonb_extract_path_bool
jsonb_path_query_first_int4
4. Keep duplication manageable by sharing the implementation underneath
keep the SQL/catalog-level rewrite targets explicit for readability and reviewability
but factor the C implementation into:
extractor-family lookup helpers
target-type conversion helpers
thin wrappers, possibly generated with small macros
So the idea would be: explicit rewrite targets at the SQL/catalog level, but shared lookup/conversion code underneath, instead of going back to the earlier start/finish/internal pipeline.
I agree that if this is extended naively across the full matrix, duplication will become a real issue. My reason for keeping the current patch narrow is that I wanted to first validate this simpler rewrite shape on a small subset before deciding how best to scale it further.
Regards,
Haibo
Hi all,
Following up on our previous discussion, I want to clarify the current patch plan together with the updated first patch.
Earlier I described this work roughly as a 4-patch line. After iterating on the implementation and trying to keep each step reviewable, I now think the cleaner split is a 5-patch series:
object-field casts to scalar types
array-element casts to scalar types
extract-path casts to scalar types
multi-subscript casts via extract-path lowering
jsonpath-first casts to scalar types (jsonb_path_query_first and _tz)
The overall design is unchanged: use the cast function’s support hook to recognize cast(extract(...)) over scalar-returning jsonb extraction functions, and rewrite that directly to explicit typed extractor calls.
Supported target types remain:
- numeric
- bool
- int4
- int8
- float8
One point I also want to make explicit is that I do not plan to include jsonb_path_query in this series.
After looking at it more carefully, I do not think it fits the same model as the rest of the series. The patches here are all about scalar-returning extraction functions, where the cast prosupport hook can see and rewrite a scalar expression pair. jsonb_path_query is set-returning, so optimizing casts over it would likely need a different mechanism, probably at planner or executor level, rather than one more patch in this prosupport-based series.
Attached here is the updated first patch in the current plan.
This patch covers object-field extraction only:
- jsonb_object_field
- -> with text key
- key subscripting
and rewrites casts to:
- numeric
- bool
- int4
- int8
- float8
to direct typed extractor calls.
Thanks again for the earlier comments. I plan to send the remaining patches in follow-up emails in the order above.
Regards,
Attachment
pgsql-hackers by date: