Re: Extract numeric filed in JSONB more effectively - Mailing list pgsql-hackers
From | Chapman Flack |
---|---|
Subject | Re: Extract numeric filed in JSONB more effectively |
Date | |
Msg-id | 5955e93347a7e3b1612cf7e129ae6d04@anastigmatix.net Whole thread Raw |
In response to | Re: Extract numeric filed in JSONB more effectively (Chapman Flack <chap@anastigmatix.net>) |
Responses |
Re: Extract numeric filed in JSONB more effectively
|
List | pgsql-hackers |
On 2023-08-22 08:16, Chapman Flack wrote: > On 2023-08-22 01:54, Andy Fan wrote: >> After we label it, we will get error like this: >> >> select (a->'a')::int4 from m; >> ERROR: cannot display a value of type internal > > Without looking in depth right now, I would double-check > what relabel node is being applied at the result. The idea, > of course, was to relabel the result as the expected result as I suspected, looking at v10-0003, there's this: + fexpr = (FuncExpr *)makeRelabelType((Expr *) fexpr, INTERNALOID, + 0, InvalidOid, COERCE_IMPLICIT_CAST); compared to the example I had sent earlier: On 2023-08-18 17:02, Chapman Flack wrote: > expr = (Expr *)makeRelabelType((Expr *)fexpr, > targetOid, -1, InvalidOid, COERCE_IMPLICIT_CAST); The key difference: this is the label going on the result type of the function we are swapping in. The function already has return type declared internal; we want to relabel it as returning the type identified by targetOid. A relabel node to type internal is the reverse of what's needed (and also superfluous, as the function's return type is internal already). Two more minor differences: (1) the node you get from makeRelabelType is an Expr, but not really a FuncExpr. Casting it to FuncExpr is a bit bogus. Also, the third argument to makeRelabelType is a typmod, and I believe the "not-modified" typmod is -1, not 0. In the example I had sent earlier, there were two relabel nodes, serving different purposes. In one, we have a function that wants internal for an argument, but we've made a Const with type OIDOID, so we want to say "this is internal, the thing the function wants." The situation is reversed at the return type of the function: the function declares its return type internal, but the surrounding query is expecting an int4 (or whatever targetOid identifies), so any relabel node there needs to say it's an int4 (or whatever). So, that approach involves two relabelings. In the one idea for restructuring that I suggested earlier, that's reduced: the ..._int function produces a JsonbValue (typed internal) and the selected ..._finish function expects that, so those types match and no relabeling is called for. And with the correct ..._finish function selected at rewrite time, it already has the same return type the surrounding query expects, so no relabeling is called for there either. However, simply to ensure the ..._int function cannot be casually called, it needs an internal parameter, even an unused one, and the rewriter must supply a value for that, which may call for one relabel node. On 2023-08-21 06:50, Andy Fan wrote: > I'm not very excited with this manner, reasons are: a). It will have > to emit more steps in ExprState->steps which will be harmful for > execution. The overhead is something I'm not willing to afford. I would be open to a performance comparison, but offhand I am not sure whether the overhead of another step or two in an ExprState is appreciably more than some of the overhead in the present patch, such as the every-time-through fcinfo initialization buried in DirectFunctionCall1 where you don't necessarily see it. I bet the fcinfo in an ExprState step gets set up once, and just has new argument values slammed into it each time through. (Also, I know very little about how the JIT compiler is used in PG, but I suspect that a step you bury inside your function is a step it may not get to see.) > b). this manner requires more *internal*, which is kind of similar > to "void *" in C. I'm not sure in what sense you mean "more". The present patch has to deal with two places where some other type must be relabeled internal or something internal relabeled to another type. The approach I suggested does involve two families of function, one returning internal (a JsonbValue) and one expecting internal (a JsonbValue), where the rewriter would compose one over the other, no relabeling needed. There's also an internal parameter needed for whatever returns internal, and that's just the protocol for how such things are done. To me, that seems like a fairly principled use of the type. I would not underestimate the benefit of reducing the code duplication and keeping the patch as clear as possible. The key contributions of the patch are getting a numeric or boolean efficiently out of the JSON operation. Getting from numeric to int or float are things the system already does well. A patch that focuses on what it contributes, and avoids redoing things the system already can do--unless the duplication can be shown to have a strong performance benefit--is easier to review and probably to get integrated. Regards, -Chap
pgsql-hackers by date: