Re: Extract numeric filed in JSONB more effectively - Mailing list pgsql-hackers

From Haibo Yan
Subject Re: Extract numeric filed in JSONB more effectively
Date
Msg-id CABXr29G35c6Qo-Ab4EpH24gHS+ejjJGfhmW3nQ1eEK+Rgt_9Wg@mail.gmail.com
Whole thread
In response to Re: Extract numeric filed in JSONB more effectively  (Haibo Yan <tristan.yim@gmail.com>)
List pgsql-hackers
On Wed, Apr 8, 2026 at 12:50 PM Haibo Yan <tristan.yim@gmail.com> wrote:
On Tue, Apr 7, 2026 at 6:21 PM Haibo Yan <tristan.yim@gmail.com> wrote:
On Tue, Apr 7, 2026 at 5:00 PM Andy Fan <zhihuifan1213@163.com> wrote:
Haibo Yan <tristan.yim@gmail.com> writes:

Hi Haibo,

> I agree that if this approach is extended to the full matrix naively,
> duplication will become a real issue.

Could you summary how it would be? I think it would be helpful for
others to review.  Otherwise every reviewer needs to count them many
times.

--
Best Regards
Andy Fan
Hi Andy,
Sure.

My current thought is to extend it in stages, rather than trying to solve the full matrix in a single patch.

A rough plan would be:

1. Keep the current stage-1 patch small and validate the basic approach first

  • jsonb_object_field / -> / equivalent subscripting form

  • casts to numeric and bool

  • support-function rewrite directly to explicit typed extractor functions

2. Extend target types before extending extractor families

  • add int4 / int8 / float8 for the same object-field family first

  • keep the SQL-visible rewrite targets explicit, e.g.

    • jsonb_object_field_int4

    • jsonb_object_field_int8

    • jsonb_object_field_float8

  • avoid the previous numeric-intermediate rewrite shape

3. Then extend to other extractor families with the same overall pattern

  • likely starting with jsonb_array_element and jsonb_extract_path

  • and possibly jsonb_path_query_first later

  • each family would still rewrite to explicit typed extractor entry points, e.g.

    • jsonb_array_element_numeric

    • jsonb_extract_path_bool

    • jsonb_path_query_first_int4

4. Keep duplication manageable by sharing the implementation underneath

  • keep the SQL/catalog-level rewrite targets explicit for readability and reviewability

  • but factor the C implementation into:

    • extractor-family lookup helpers

    • target-type conversion helpers

    • thin wrappers, possibly generated with small macros

So the idea would be: explicit rewrite targets at the SQL/catalog level, but shared lookup/conversion code underneath, instead of going back to the earlier start/finish/internal pipeline.

I agree that if this is extended naively across the full matrix, duplication will become a real issue. My reason for keeping the current patch narrow is that I wanted to first validate this simpler rewrite shape on a small subset before deciding how best to scale it further.

Regards,

Haibo

Hi all,

Following up on our previous discussion, I want to clarify the current patch plan together with the updated first patch.

Earlier I described this work roughly as a 4-patch line. After iterating on the implementation and trying to keep each step reviewable, I now think the cleaner split is a 5-patch series:

  1. object-field casts to scalar types

  2. array-element casts to scalar types

  3. extract-path casts to scalar types

  4. multi-subscript casts via extract-path lowering

  5. jsonpath-first casts to scalar types (jsonb_path_query_first and _tz)

The overall design is unchanged: use the cast function’s support hook to recognize cast(extract(...)) over scalar-returning jsonb extraction functions, and rewrite that directly to explicit typed extractor calls.

Supported target types remain:

  • numeric
  • bool
  • int4
  • int8
  • float8

One point I also want to make explicit is that I do not plan to include jsonb_path_query in this series.

After looking at it more carefully, I do not think it fits the same model as the rest of the series. The patches here are all about scalar-returning extraction functions, where the cast prosupport hook can see and rewrite a scalar expression pair. jsonb_path_query is set-returning, so optimizing casts over it would likely need a different mechanism, probably at planner or executor level, rather than one more patch in this prosupport-based series.

Attached here is the updated first patch in the current plan.

This patch covers object-field extraction only:

  • jsonb_object_field
  • -> with text key
  • key subscripting

and rewrites casts to:

  • numeric
  • bool
  • int4
  • int8
  • float8

to direct typed extractor calls.

Thanks again for the earlier comments. I plan to send the remaining patches in follow-up emails in the order above.

Regards,

Haibo 

Hi all,
Following up on the earlier thread, I am continuing with the same overall plan and sending the next two patches in the series together.
The design is still the same as before: use the cast function’s support hook to recognize cast(extract(...)) over scalar-returning jsonb extraction functions, and rewrite that directly to explicit typed extractor calls, without changing normal SQL syntax.
At this point, I have also folded int2 and float4 into the per-family patches, so each patch now carries the full target-type coverage for the functionality it introduces.
The supported target types are now:
numeric
  • bool
  • int2
  • int4
  • int8
  • float4
  • float8
I am still keeping jsonb_path_query out of scope for this series. The series is focused on scalar-returning extraction functions, where the cast prosupport hook can see and rewrite a scalar expression pair. jsonb_path_query is set-returning, so optimizing casts over it looks like a different planner/executor problem rather than one more patch in this prosupport-based line.

With that in mind, the current 5-patch plan is:
  1. object-field casts to scalar types
  2. array-element casts to scalar types
  3. extract-path casts to scalar types
  4. multi-subscript casts via extract-path lowering
  5. jsonpath-first casts to scalar types (jsonb_path_query_first and _tz)
In this email I am sending the first two patches:
  • patch 1: object-field casts to scalar types
  • patch 2: array-element casts to scalar types
Patch 1 covers:
  • jsonb_object_field
  • -> with text key
  • key subscripting
Patch 2 covers:
  • jsonb_array_element
  • -> with integer RHS
  • single-index array subscripting
Both patches now support the full target-type set listed above.
Thanks again for the earlier comments. I plan to continue with the remaining patches in follow-up emails in the same order.

Regards,
Haibo

Attachment

pgsql-hackers by date:

Previous
From: Greg Lamberson
Date:
Subject: Extensible sync handler registration (register_sync_handler)
Next
From: SATYANARAYANA NARLAPURAM
Date:
Subject: Re: FOR PORTION OF does not recompute GENERATED STORED columns that depend on the range column