Home > mailing lists

Re: Extract numeric filed in JSONB more effectively - Mailing list pgsql-hackers

From	Haibo Yan
Subject	Re: Extract numeric filed in JSONB more effectively
Date	April 7 11:31:16
Msg-id	CABXr29GhnWxrGXD2YijYSDxwObGLqOsf8gvfzrFu=UcYU71XYA@mail.gmail.com Whole thread
In response to	Re: Extract numeric filed in JSONB more effectively (Andy Fan <zhihuifan1213@163.com>)
List	pgsql-hackers

Tree view

On Mon, Apr 6, 2026 at 11:14 PM Andy Fan <zhihuifan1213@163.com> wrote:

Hi Haibo,

> I’d like to continue pushing this patch forward.
> Based on the earlier discussion, I reworked the patch into a smaller
> stage-1 version with a narrower scope and a simpler rewrite
> strategy. ..
> and does not yet try to cover array/path extraction or integer/float
> typed extractors.

Thanks for working on this. I did a quick comparison between this
version and my last patch v18 [1], here is the difference.

My previous v18[1]: 542 insertions(+), 59 deletions(-)
Your patch: 543 insertions(+), 5 deletions(-)

However as what you have realized, your current patch still lack of many
optimizations, Not only the integer/float stuff, but also many
functions, e.g. jsonb_array_element, jsonb_extract_path,
jsonb_path_query and jsonb_path_query_first. After considering this,
what patch will look like in your approach? I guess you can see many
duplications. Less code doesn't always mean better, but I think this
still be a key consideration to address/check.

Yes, I agree the current patch is still missing a large part of the full matrix from v18 — not only int4/int8/float8, but also other extractor families such as jsonb_array_element, jsonb_extract_path, jsonb_path_query, and jsonb_path_query_first.

My intent with this version was not to cover the whole space at once, but to first reduce it to a smaller stage-1 subset that is easier to review and easier to reason about. In particular, I wanted to validate the simpler rewrite shape first: keep the support-function-based approach, but rewrite directly to explicit typed extractor functions, without the previous start/finish/internal pipeline.

I agree that if this approach is extended to the full matrix naively, duplication will become a real issue. So I am not claiming that the current patch shape should simply be copied mechanically across every extractor/type combination. My thought was to first see whether this narrower subset is acceptable in principle, and then decide how best to extend it without reintroducing too much abstraction or too much duplication.

So for now I would prefer to keep the current scope intentionally narrow:

jsonb_object_field / -> / equivalent subscripting form
casts to numeric and bool

and treat the rest as follow-up work, rather than trying to solve the entire matrix in one patch.

> I also ran a small microbenchmark to isolate the cast-over-object-field path. On my setup, the current patch shows the following gains:
> Query Before After Speedup
> --------------------------------------------------------------------------------
> SELECT sum((j->'n')::numeric) FROM t 118.028 ms 56.082 ms 2.10x
> SELECT count(*) FROM t WHERE (j->'b')::bool 115.665 ms 51.945 ms 2.23x
> --------------------------------------------------------------------------------

Thanks for running the test.

[1] https://www.postgresql.org/message-id/87ttk0lgcx.fsf%40163.com

--
Best Regards
Andy Fan

Thanks,

Haibo

pgsql-hackers by date:

From: Ilya Cherdakov
Date: 07 April, 11:31:03
Subject: Re: Environment variable to disable diffs file output

From: Alvaro Herrera
Date: 07 April, 11:40:25
Subject: Re: Adding REPACK [concurrently]

Re: Extract numeric filed in JSONB more effectively - Mailing list pgsql-hackers

Previous

Next