Thread: [HACKERS] Minor codegen silliness in ExecInterpExpr()
I noticed the following while poking around with perf: | fcinfo->isnull = false; |b5b: movb $0x0,0x1c(%rdx) | *op->resvalue = op->d.func.fn_addr(fcinfo); 0.02 | mov 0x8(%rbx),%rcx 1.19 | mov %rdx,%rdi 0.93| mov %rdx,(%rsp) | mov %rcx,0x8(%rsp) 0.01 | callq *0x28(%rbx) 2.17 | mov 0x8(%rsp),%rcx | mov %rax,(%rcx) | *op->resnull = fcinfo->isnull; 1.18 | mov (%rsp),%rdx 4.32 | mov 0x10(%rbx),%rax 0.06 | movzbl 0x1c(%rdx),%edx 9.14 | mov %dl,(%rax) It looks to me like gcc believes it is required to evaluate "op->resvalue" before invoking the called function, just in case the function somehow has access to *op and modifies that. We could save a pointless register spill and reload if there were a temporary variable in there, ie EEO_CASE(EEOP_FUNCEXPR) { FunctionCallInfo fcinfo = op->d.func.fcinfo_data; + Datum fvalue; fcinfo->isnull = false; - *op->resvalue = op->d.func.fn_addr(fcinfo); + fvalue = op->d.func.fn_addr(fcinfo); + *op->resvalue = fvalue; *op->resnull = fcinfo->isnull; EEO_NEXT(); } and likewise in the other FUNCEXPR cases. This is on a rather old gcc, haven't checked on bleeding-edge versions. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Hi, On 2017-09-28 16:21:34 -0400, Tom Lane wrote: > I noticed the following while poking around with perf: > > | fcinfo->isnull = false; > |b5b: movb $0x0,0x1c(%rdx) > | *op->resvalue = op->d.func.fn_addr(fcinfo); > 0.02 | mov 0x8(%rbx),%rcx > 1.19 | mov %rdx,%rdi > 0.93 | mov %rdx,(%rsp) > | mov %rcx,0x8(%rsp) > 0.01 | callq *0x28(%rbx) > 2.17 | mov 0x8(%rsp),%rcx > | mov %rax,(%rcx) > | *op->resnull = fcinfo->isnull; > 1.18 | mov (%rsp),%rdx > 4.32 | mov 0x10(%rbx),%rax > 0.06 | movzbl 0x1c(%rdx),%edx > 9.14 | mov %dl,(%rax) > > It looks to me like gcc believes it is required to evaluate "op->resvalue" > before invoking the called function, just in case the function somehow has > access to *op and modifies that. Yea, the compiler probably doesn't have much choice besides assuming that. Might be different if we'd annote function pointers as pure, and used strict aliasing + perhaps some restrict markers, but ... > We could save a pointless register spill > and reload if there were a temporary variable in there, ie > > EEO_CASE(EEOP_FUNCEXPR) > { > FunctionCallInfo fcinfo = op->d.func.fcinfo_data; > + Datum fvalue; > > fcinfo->isnull = false; > - *op->resvalue = op->d.func.fn_addr(fcinfo); > + fvalue = op->d.func.fn_addr(fcinfo); > + *op->resvalue = fvalue; > *op->resnull = fcinfo->isnull; > > EEO_NEXT(); > } > > and likewise in the other FUNCEXPR cases. > > This is on a rather old gcc, haven't checked on bleeding-edge versions. Makes sense. Do you want to make it so, or shall I? I'd personally be somewhat tempted to keep the branches in sync here... Greetings, Andres Freund -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Andres Freund <andres@anarazel.de> writes: > On 2017-09-28 16:21:34 -0400, Tom Lane wrote: >> We could save a pointless register spill >> and reload if there were a temporary variable in there, > Makes sense. Do you want to make it so, or shall I? I just finished testing a patch, as attached. On my machine (again, not latest gcc: 4.4.7 on RHEL6 x86_64), it reduces the code size of execExprInterp.o by a fraction of a percent, and it seems to offer a slight benefit in "pgbench -S" performance although I'd not put much stock in that being reproducible. > I'd personally be > somewhat tempted to keep the branches in sync here... I was tempted that way too, but it doesn't apply cleanly to v10, because of Peter's recent cleanup of function pointer invocation style. I don't think it's really worth worrying about. regards, tom lane diff --git a/src/backend/executor/execExprInterp.c b/src/backend/executor/execExprInterp.c index 09abd46..68a1f96 100644 --- a/src/backend/executor/execExprInterp.c +++ b/src/backend/executor/execExprInterp.c @@ -501,15 +501,17 @@ ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnull) EEO_CASE(EEOP_INNER_SYSVAR) { int attnum = op->d.var.attnum; + Datum d; /* these asserts must match defenses in slot_getattr */ Assert(innerslot->tts_tuple != NULL); Assert(innerslot->tts_tuple != &(innerslot->tts_minhdr)); - /* heap_getsysattr has sufficient defenses against bad attnums */ - *op->resvalue = heap_getsysattr(innerslot->tts_tuple, attnum, - innerslot->tts_tupleDescriptor, - op->resnull); + /* heap_getsysattr has sufficient defenses against bad attnums */ + d = heap_getsysattr(innerslot->tts_tuple, attnum, + innerslot->tts_tupleDescriptor, + op->resnull); + *op->resvalue = d; EEO_NEXT(); } @@ -517,15 +519,17 @@ ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnull) EEO_CASE(EEOP_OUTER_SYSVAR) { int attnum = op->d.var.attnum; + Datum d; /* these asserts must match defenses in slot_getattr */ Assert(outerslot->tts_tuple != NULL); Assert(outerslot->tts_tuple != &(outerslot->tts_minhdr)); /* heap_getsysattr has sufficient defenses against bad attnums */ - *op->resvalue = heap_getsysattr(outerslot->tts_tuple, attnum, - outerslot->tts_tupleDescriptor, - op->resnull); + d = heap_getsysattr(outerslot->tts_tuple, attnum, + outerslot->tts_tupleDescriptor, + op->resnull); + *op->resvalue = d; EEO_NEXT(); } @@ -533,15 +537,17 @@ ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnull) EEO_CASE(EEOP_SCAN_SYSVAR) { int attnum = op->d.var.attnum; + Datum d; /* these asserts must match defenses in slot_getattr */ Assert(scanslot->tts_tuple != NULL); Assert(scanslot->tts_tuple != &(scanslot->tts_minhdr)); - /* heap_getsysattr has sufficient defenses against bad attnums */ - *op->resvalue = heap_getsysattr(scanslot->tts_tuple, attnum, - scanslot->tts_tupleDescriptor, - op->resnull); + /* heap_getsysattr has sufficient defenses against bad attnums */ + d = heap_getsysattr(scanslot->tts_tuple, attnum, + scanslot->tts_tupleDescriptor, + op->resnull); + *op->resvalue = d; EEO_NEXT(); } @@ -641,13 +647,22 @@ ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnull) * As both STRICT checks and function-usage are noticeable performance * wise, and function calls are a very hot-path (they also back * operators!), it's worth having so many separate opcodes. + * + * Note: the reason for using a temporary variable "d", here and in + * other places, is that some compilers think "*op->resvalue = f();" + * requires them to evaluate op->resvalue into a register before + * calling f(), just in case f() is able to modify op->resvalue + * somehow. The extra line of code can save a useless register spill + * and reload, on architectures without many registers. */ EEO_CASE(EEOP_FUNCEXPR) { FunctionCallInfo fcinfo = op->d.func.fcinfo_data; + Datum d; fcinfo->isnull = false; - *op->resvalue = op->d.func.fn_addr(fcinfo); + d = op->d.func.fn_addr(fcinfo); + *op->resvalue = d; *op->resnull = fcinfo->isnull; EEO_NEXT(); @@ -658,6 +673,7 @@ ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnull) FunctionCallInfo fcinfo = op->d.func.fcinfo_data; bool *argnull = fcinfo->argnull; int argno; + Datum d; /* strict function, so check for NULL args */ for (argno = 0; argno < op->d.func.nargs; argno++) @@ -669,7 +685,8 @@ ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnull) } } fcinfo->isnull = false; - *op->resvalue = op->d.func.fn_addr(fcinfo); + d = op->d.func.fn_addr(fcinfo); + *op->resvalue = d; *op->resnull = fcinfo->isnull; strictfail: @@ -680,11 +697,13 @@ ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnull) { FunctionCallInfo fcinfo = op->d.func.fcinfo_data; PgStat_FunctionCallUsage fcusage; + Datum d; pgstat_init_function_usage(fcinfo, &fcusage); fcinfo->isnull = false; - *op->resvalue = op->d.func.fn_addr(fcinfo); + d = op->d.func.fn_addr(fcinfo); + *op->resvalue = d; *op->resnull = fcinfo->isnull; pgstat_end_function_usage(&fcusage, true); @@ -698,6 +717,7 @@ ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnull) PgStat_FunctionCallUsage fcusage; bool *argnull = fcinfo->argnull; int argno; + Datum d; /* strict function, so check for NULL args */ for (argno = 0; argno < op->d.func.nargs; argno++) @@ -712,7 +732,8 @@ ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnull) pgstat_init_function_usage(fcinfo, &fcusage); fcinfo->isnull = false; - *op->resvalue = op->d.func.fn_addr(fcinfo); + d = op->d.func.fn_addr(fcinfo); + *op->resvalue = d; *op->resnull = fcinfo->isnull; pgstat_end_function_usage(&fcusage, true); @@ -1113,6 +1134,7 @@ ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnull) if (!op->d.iocoerce.finfo_in->fn_strict || str != NULL) { FunctionCallInfo fcinfo_in; + Datum d; fcinfo_in = op->d.iocoerce.fcinfo_data_in; fcinfo_in->arg[0] = PointerGetDatum(str); @@ -1120,7 +1142,8 @@ ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnull) /* second and third arguments are already set up */ fcinfo_in->isnull = false; - *op->resvalue = FunctionCallInvoke(fcinfo_in); + d = FunctionCallInvoke(fcinfo_in); + *op->resvalue = d; /* Should get null result if and only if str is NULL */ if (str == NULL) @@ -1268,6 +1291,7 @@ ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnull) EEO_CASE(EEOP_ROWCOMPARE_STEP) { FunctionCallInfo fcinfo = op->d.rowcompare_step.fcinfo_data; + Datum d; /* force NULL result if strict fn and NULL input */ if (op->d.rowcompare_step.finfo->fn_strict && @@ -1279,7 +1303,8 @@ ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnull) /* Apply comparison function */ fcinfo->isnull = false; - *op->resvalue = op->d.rowcompare_step.fn_addr(fcinfo); + d = op->d.rowcompare_step.fn_addr(fcinfo); + *op->resvalue = d; /* force NULL result if NULL function result */ if (fcinfo->isnull) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
On 2017-09-28 18:39:03 -0400, Tom Lane wrote: > Andres Freund <andres@anarazel.de> writes: > > On 2017-09-28 16:21:34 -0400, Tom Lane wrote: > >> We could save a pointless register spill > >> and reload if there were a temporary variable in there, > > > Makes sense. Do you want to make it so, or shall I? > > I just finished testing a patch, as attached. On my machine (again, > not latest gcc: 4.4.7 on RHEL6 x86_64), it reduces the code size of > execExprInterp.o by a fraction of a percent, and it seems to offer > a slight benefit in "pgbench -S" performance although I'd not put > much stock in that being reproducible. Cool. > + * Note: the reason for using a temporary variable "d", here and in > + * other places, is that some compilers think "*op->resvalue = f();" > + * requires them to evaluate op->resvalue into a register before > + * calling f(), just in case f() is able to modify op->resvalue > + * somehow. The extra line of code can save a useless register spill > + * and reload, on architectures without many registers. I'd remove the "without many registers" bit - that's really more an functioncall ABI question (#caller vs #callee saved registers) than about the actual architecture. Greetings, Andres Freund -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Andres Freund <andres@anarazel.de> writes: > On 2017-09-28 18:39:03 -0400, Tom Lane wrote: >> + * Note: the reason for using a temporary variable "d", here and in >> + * other places, is that some compilers think "*op->resvalue = f();" >> + * requires them to evaluate op->resvalue into a register before >> + * calling f(), just in case f() is able to modify op->resvalue >> + * somehow. The extra line of code can save a useless register spill >> + * and reload, on architectures without many registers. > I'd remove the "without many registers" bit - that's really more an > functioncall ABI question (#caller vs #callee saved registers) than > about the actual architecture. Fair enough. I wondered how pervasive this behavior is. AFAICS it is *not* required by the C standard; C99 does not say that the left operand of assignment must be evaluated first, in fact it says that the order of evaluation is unspecified. But the latest gcc I have at hand (6.4.1 on Fedora 25) still does it this way. OTOH, Apple's latest edition of clang (LLVM version 9.0.0 (clang-900.0.37)) appears to be just fine with waiting till after the function call to load op->resvalue. So that's not many data points, but it does suggest that this is worth fixing, and is not just an artifact of an old compiler version. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
On 2017-09-28 20:01:37 -0400, Tom Lane wrote: > I wondered how pervasive this behavior is. AFAICS it is *not* required > by the C standard; C99 does not say that the left operand of assignment > must be evaluated first, in fact it says that the order of evaluation is > unspecified. FWIW, it's being specificied in C++17 ([1]), which seems to make it not unlikely to end up in C as well. > but it does suggest that this is worth fixing, and is not just an artifact > of an old compiler version. +1, I saw the same in a bleeding edge c++ version. Greetings, Andres Freund [1] http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/p0145r3.pdf -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
On 2017-09-28 20:56:57 -0700, Andres Freund wrote: > +1, I saw the same in a bleeding edge c++ version. err, gcc version. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers