Andres Freund <andres@anarazel.de> writes:
> On March 25, 2017 4:56:11 PM PDT, Ants Aasma <ants.aasma@eesti.ee> wrote:
>> I haven't had the time to research this properly, but initial tests
>> show that with GCC 6.2 adding
>>
>> #pragma GCC optimize ("no-crossjumping")
>>
>> fixes merging of the op tail jumps.
>>
>> Some quick and dirty benchmarking suggests that the benefit for the
>> interpreter is about 15% (5% speedup on a workload that spends 1/3 in
>> ExecInterpExpr). My idea of prefetching op->resnull/resvalue to local
>> vars before the indirect jump is somewhere between a tiny benefit and
>> no effect, certainly not worth introducing extra complexity. Clang 3.8
>> does the correct thing out of the box and is a couple of percent
>> faster than GCC with the pragma.
> That's large enough to be worth doing (although I recall you seeing all jumps commonalized). We should probably do
thison a per function basis however (either using pragma push option, or function attributes).
Seems like it would be fine to do it on a per-file basis. If you're
worried about pessimizing the out-of-line subroutines, we could move
those to a different file --- it's pretty questionable that they're
in execExprInterp.c in the first place, considering they're meant to be
used by more than just that execution method.
regards, tom lane