Re: PG 15 (and to a smaller degree 14) regression due to ExprEvalStep size - Mailing list pgsql-hackers

From Amit Langote
Subject Re: PG 15 (and to a smaller degree 14) regression due to ExprEvalStep size
Date
Msg-id CA+HiwqHJU5mC7wXfjoEJ=_Kdu_NotHzY4uLe57JEDrqysW_a8A@mail.gmail.com
Whole thread Raw
In response to Re: PG 15 (and to a smaller degree 14) regression due to ExprEvalStep size  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
Hi Andres,

On Sat, Aug 6, 2022 at 5:37 AM Andres Freund <andres@anarazel.de> wrote:
> On 2022-08-04 17:01:48 +0900, Amit Langote wrote:
> > On Wed, Aug 3, 2022 at 12:00 AM Andres Freund <andres@anarazel.de> wrote:
> > > Honestly, this code seems like it should just be rewritten from scratch.
> >
> > Based on what I wrote above, please let me know if I've misunderstood
> > your concerns about over-allocation of coercion state.  I can try to
> > rewrite one more time if I know what this should look like instead.
>
> I don't know. I don't understand the design of what needs to have error
> handling, and what not.

AFAIK, there are two things that the JSON path code considers can
cause an error when evaluating a JsonExpr:

* Actual JSON path evaluation in ExecEvalJsonExpr(), because it
invokes JsonPath*() family of functions defined in jsonpath_exec.c,
which can possibly cause an error.  Actually, I looked again today as
to what goes on in them and it seems there is a throwErrors variable
being used to catch and prevent any errors found by the JSON path
machinery itself, and it has the same value as the throwErrors
variable in ExecEvalJsonExpr().  Given that the latter is set per the
ON ERROR specification (throw errors or return NULL / a default
expression in lieu), maybe this part doesn't really need a
sub-transaction.  To check, I took off the sub-transaction around this
part and can see that no tests fail.

* Evaluating coercion expression in ExecEvalJsonExprCoercion(), which
involves passing a user-specified expression through either
EEOP_IOCOERCE or json_populate_type(), both of which can cause errors
that are not suppressible as those in jsonpath_exec.c are.  So, this
part does need a sub-transaction to satisfy the ON ERROR behavior.  To
check, I took out the sub-transaction around the coercion evaluation,
and some tests in jsob_sqljson do indeed fail, like this, for example:

 SELECT JSON_VALUE(jsonb '"aaa"', '$' RETURNING int);
- json_value
-------------
-
-(1 row)
-
+ERROR:  invalid input syntax for type integer: "aaa"

Note that both the JSON path expression and the coercion would run as
part of the one EEOP_JSONEXPR ExecEvalStep before this patch and thus
would be wrapped under the same sub-transaction, even if only the
latter apparently needs it.

With this patch, I tried to keep that same behavior, but because the
coercion evaluation has now been broken out into its own step, it must
use another sub-transaction, given that the same sub-transaction can
no longer wrap both things.  But given my finding that the JSON path
expression doesn't really need one, maybe the new EEOP_JSONEXPR_PATH
step can run without one, while keeping it for the new
EEOP_JSONEXPR_COERCION step.

> > > > >   I first was confused why the code tries to load the jump target
> > > > >   dynamically. But then I saw that the interpreted code sets it dynamically -
> > > > >   why? That's completely unnecessary overhead afaics? There's just two
> > > > >   possible jump targets, no?
> > > >
> > > > Hmm, I looked at the code for other expressions that jump, especially
> > > > CASE WHEN, but they use ready-made EEOP_JUMP_IF_* steps, which can be
> > > > added statically.  I thought we can't use them in this case, because
> > > > the conditions are very ad-hoc, like if the JSON path computation
> > > > returned an "empty" item or if the "error" flag was set during that
> > > > computation, etc.
> > >
> > > The minimal fix would be to return the jump target from the function, and then
> > > jump to that. That at least avoids the roundtrip to memory you have right now.
> >
> > You mean like this:
> >
> >                     LLVMValueRef v_args[2];
> >                     LLVMValueRef v_ret;
> >
> >                     /*
> >                      * Call ExecEvalJsonExprSkip() to decide if JSON path
> >                      * evaluation can be skipped.  This returns the step
> >                      * address to jump to.
> >                      */
> >                     v_args[0] = v_state;
> >                     v_args[1] = l_ptr_const(op, l_ptr(StructExprEvalStep));
> >                     v_ret = LLVMBuildCall(b,
> >                                           llvm_pg_func(mod,
> > "ExecEvalJsonExprSkip"),
> >                                           params, lengthof(params), "");
> >
> > Actually, this is how I had started, but never figured out how to jump
> > to the address in v_ret.  As in, how to extract the plain C int32
> > value that is the jump address from v_ret, an LLVMValueRef, to use in
> > the following statement:
> >
> >                     LLVMBuildBr(b, opblocks[<int32-in-v_ret>]);
>
> We could make that work, but even keeping it more similar to your current
> code, you're already dealing with a variable jump target. Only that you load
> it from a variable, instead of the function return type. So you could just
> v_ret instead of v_jumpaddr, and your code would be simpler and faster.

Ah, I see you mean to continue to use all the LLVMBuildCondBr()s as
the code currently does, but use v_ret like in the code above, instead
of v_jumpaddr, to access the jump address returned by the
step-choosing function.  I've done that in the updated patch.  This
also allows us to get rid of all the jumpdone fields in the
ExprEvalStep.

I've also moved the blocks of code in ExecInitExprRec() that
initialize the state for JsonExpr and JsonItemCoercions into new
functions.  I've also moved EvalJsonPathVar() from execExprInterp.c to
jsonpath_exec.c where it's used.

--
Thanks, Amit Langote
EDB: http://www.enterprisedb.com

Attachment

pgsql-hackers by date:

Previous
From: Bharath Rupireddy
Date:
Subject: Re: optimize lookups in snapshot [sub]xip arrays
Next
From: r.zharkov@postgrespro.ru
Date:
Subject: Re: Checking pgwin32_is_junction() errors