Re: PG 15 (and to a smaller degree 14) regression due to ExprEvalStep size - Mailing list pgsql-hackers
From | Andres Freund |
---|---|
Subject | Re: PG 15 (and to a smaller degree 14) regression due to ExprEvalStep size |
Date | |
Msg-id | 20220805203656.mdgmajvsahi5srcy@awork3.anarazel.de Whole thread Raw |
In response to | Re: PG 15 (and to a smaller degree 14) regression due to ExprEvalStep size (Amit Langote <amitlangote09@gmail.com>) |
Responses |
Re: PG 15 (and to a smaller degree 14) regression due to ExprEvalStep size
Re: PG 15 (and to a smaller degree 14) regression due to ExprEvalStep size |
List | pgsql-hackers |
Hi, I tried to look into some of the questions from Amit, but I have e.g. no idea what exactly the use of subtransactions tries to achieve - afaics 1a36bc9dba8 is the first patch to introduce needing to evaluate parts expressions in a subtransaction - but there's not a single comment explaining why. It's really hard to understand the new json code. It's a substantial amount of new infrastructure, without any design documentation that I can find. And it's not like it's just code that's equivalent to nearby stuff. To me this falls way below our standards and I think it's *months* of work to fix. Even just the expresion evaluation code: EvalJsonPathVar(), ExecEvalJsonConstructor(), ExecEvalJsonExpr(), ExecEvalJson(). There's one layer of subtransactions in one of the paths in ExecEvalJsonExpr(), another in ExecEvalJson(). Some paths of ExecEvalJsonExpr() go through subtransactions, others don't. It's one thing for a small set of changes to be of this quality. But this is pretty large - a quick summing of diffstat ends up with about 17k lines added, of which ~2.5k are docs, ~4.8k are tests. On 2022-08-04 17:01:48 +0900, Amit Langote wrote: > On Wed, Aug 3, 2022 at 12:00 AM Andres Freund <andres@anarazel.de> wrote: > > > > The whole coercion stuff just seems incredibly clunky (in a slightly > > > > different shape before this patch). ExecEvalJsonExprItemCoercion() calls > > > > ExecPrepareJsonItemCoercion(), which gets a pointer to one of the per-type > > > > elements in JsonItemCoercionsState, dispatching on the type of the json > > > > object. Then we later call ExecGetJsonItemCoercion() (via a convoluted > > > > path), which again will dispatch on the type (extracting the json object > > > > again afaics!), to then somehow eventually get the coerced value. > > > > > > I think it might be possible to make this a bit simpler, by not > > > leaving anything coercion-related in ExecEvalJsonExpr(). > > > > Honestly, this code seems like it should just be rewritten from scratch. > > Based on what I wrote above, please let me know if I've misunderstood > your concerns about over-allocation of coercion state. I can try to > rewrite one more time if I know what this should look like instead. I don't know. I don't understand the design of what needs to have error handling, and what not. > > > > I first was confused why the code tries to load the jump target > > > > dynamically. But then I saw that the interpreted code sets it dynamically - > > > > why? That's completely unnecessary overhead afaics? There's just two > > > > possible jump targets, no? > > > > > > Hmm, I looked at the code for other expressions that jump, especially > > > CASE WHEN, but they use ready-made EEOP_JUMP_IF_* steps, which can be > > > added statically. I thought we can't use them in this case, because > > > the conditions are very ad-hoc, like if the JSON path computation > > > returned an "empty" item or if the "error" flag was set during that > > > computation, etc. > > > > The minimal fix would be to return the jump target from the function, and then > > jump to that. That at least avoids the roundtrip to memory you have right now. > > You mean like this: > > LLVMValueRef v_args[2]; > LLVMValueRef v_ret; > > /* > * Call ExecEvalJsonExprSkip() to decide if JSON path > * evaluation can be skipped. This returns the step > * address to jump to. > */ > v_args[0] = v_state; > v_args[1] = l_ptr_const(op, l_ptr(StructExprEvalStep)); > v_ret = LLVMBuildCall(b, > llvm_pg_func(mod, > "ExecEvalJsonExprSkip"), > params, lengthof(params), ""); > > Actually, this is how I had started, but never figured out how to jump > to the address in v_ret. As in, how to extract the plain C int32 > value that is the jump address from v_ret, an LLVMValueRef, to use in > the following statement: > > LLVMBuildBr(b, opblocks[<int32-in-v_ret>]); We could make that work, but even keeping it more similar to your current code, you're already dealing with a variable jump target. Only that you load it from a variable, instead of the function return type. So you could just v_ret instead of v_jumpaddr, and your code would be simpler and faster. Greetings, Andres Freund
pgsql-hackers by date: