Re: PG 15 (and to a smaller degree 14) regression due to ExprEvalStep size - Mailing list pgsql-hackers
From | Amit Langote |
---|---|
Subject | Re: PG 15 (and to a smaller degree 14) regression due to ExprEvalStep size |
Date | |
Msg-id | CA+HiwqHJU5mC7wXfjoEJ=_Kdu_NotHzY4uLe57JEDrqysW_a8A@mail.gmail.com Whole thread Raw |
In response to | Re: PG 15 (and to a smaller degree 14) regression due to ExprEvalStep size (Andres Freund <andres@anarazel.de>) |
List | pgsql-hackers |
Hi Andres, On Sat, Aug 6, 2022 at 5:37 AM Andres Freund <andres@anarazel.de> wrote: > On 2022-08-04 17:01:48 +0900, Amit Langote wrote: > > On Wed, Aug 3, 2022 at 12:00 AM Andres Freund <andres@anarazel.de> wrote: > > > Honestly, this code seems like it should just be rewritten from scratch. > > > > Based on what I wrote above, please let me know if I've misunderstood > > your concerns about over-allocation of coercion state. I can try to > > rewrite one more time if I know what this should look like instead. > > I don't know. I don't understand the design of what needs to have error > handling, and what not. AFAIK, there are two things that the JSON path code considers can cause an error when evaluating a JsonExpr: * Actual JSON path evaluation in ExecEvalJsonExpr(), because it invokes JsonPath*() family of functions defined in jsonpath_exec.c, which can possibly cause an error. Actually, I looked again today as to what goes on in them and it seems there is a throwErrors variable being used to catch and prevent any errors found by the JSON path machinery itself, and it has the same value as the throwErrors variable in ExecEvalJsonExpr(). Given that the latter is set per the ON ERROR specification (throw errors or return NULL / a default expression in lieu), maybe this part doesn't really need a sub-transaction. To check, I took off the sub-transaction around this part and can see that no tests fail. * Evaluating coercion expression in ExecEvalJsonExprCoercion(), which involves passing a user-specified expression through either EEOP_IOCOERCE or json_populate_type(), both of which can cause errors that are not suppressible as those in jsonpath_exec.c are. So, this part does need a sub-transaction to satisfy the ON ERROR behavior. To check, I took out the sub-transaction around the coercion evaluation, and some tests in jsob_sqljson do indeed fail, like this, for example: SELECT JSON_VALUE(jsonb '"aaa"', '$' RETURNING int); - json_value ------------- - -(1 row) - +ERROR: invalid input syntax for type integer: "aaa" Note that both the JSON path expression and the coercion would run as part of the one EEOP_JSONEXPR ExecEvalStep before this patch and thus would be wrapped under the same sub-transaction, even if only the latter apparently needs it. With this patch, I tried to keep that same behavior, but because the coercion evaluation has now been broken out into its own step, it must use another sub-transaction, given that the same sub-transaction can no longer wrap both things. But given my finding that the JSON path expression doesn't really need one, maybe the new EEOP_JSONEXPR_PATH step can run without one, while keeping it for the new EEOP_JSONEXPR_COERCION step. > > > > > I first was confused why the code tries to load the jump target > > > > > dynamically. But then I saw that the interpreted code sets it dynamically - > > > > > why? That's completely unnecessary overhead afaics? There's just two > > > > > possible jump targets, no? > > > > > > > > Hmm, I looked at the code for other expressions that jump, especially > > > > CASE WHEN, but they use ready-made EEOP_JUMP_IF_* steps, which can be > > > > added statically. I thought we can't use them in this case, because > > > > the conditions are very ad-hoc, like if the JSON path computation > > > > returned an "empty" item or if the "error" flag was set during that > > > > computation, etc. > > > > > > The minimal fix would be to return the jump target from the function, and then > > > jump to that. That at least avoids the roundtrip to memory you have right now. > > > > You mean like this: > > > > LLVMValueRef v_args[2]; > > LLVMValueRef v_ret; > > > > /* > > * Call ExecEvalJsonExprSkip() to decide if JSON path > > * evaluation can be skipped. This returns the step > > * address to jump to. > > */ > > v_args[0] = v_state; > > v_args[1] = l_ptr_const(op, l_ptr(StructExprEvalStep)); > > v_ret = LLVMBuildCall(b, > > llvm_pg_func(mod, > > "ExecEvalJsonExprSkip"), > > params, lengthof(params), ""); > > > > Actually, this is how I had started, but never figured out how to jump > > to the address in v_ret. As in, how to extract the plain C int32 > > value that is the jump address from v_ret, an LLVMValueRef, to use in > > the following statement: > > > > LLVMBuildBr(b, opblocks[<int32-in-v_ret>]); > > We could make that work, but even keeping it more similar to your current > code, you're already dealing with a variable jump target. Only that you load > it from a variable, instead of the function return type. So you could just > v_ret instead of v_jumpaddr, and your code would be simpler and faster. Ah, I see you mean to continue to use all the LLVMBuildCondBr()s as the code currently does, but use v_ret like in the code above, instead of v_jumpaddr, to access the jump address returned by the step-choosing function. I've done that in the updated patch. This also allows us to get rid of all the jumpdone fields in the ExprEvalStep. I've also moved the blocks of code in ExecInitExprRec() that initialize the state for JsonExpr and JsonItemCoercions into new functions. I've also moved EvalJsonPathVar() from execExprInterp.c to jsonpath_exec.c where it's used. -- Thanks, Amit Langote EDB: http://www.enterprisedb.com
Attachment
pgsql-hackers by date: