On Sat, Feb 10, 2024 at 03:52:38PM -0800, Noah Misch wrote:
> On Fri, Feb 09, 2024 at 08:33:23PM -0800, Andres Freund wrote:
>> My understanding is that the ifunc mechanism just avoids the need for repeated
>> indirect calls/jumps to implement a single function call, not the use of
>> indirect function calls at all. Calls into shared libraries, like libc, are
>> indirected via the GOT / PLT, i.e. an indirect function call/jump. Without
>> ifuncs, the target of the function call would then have to dispatch to the
>> resolved function. Ifuncs avoid this repeated dispatch by moving the
>> dispatch to the dynamic linker stage, modifying the contents of the GOT/PLT to
>> point to the right function. Thus ifuncs are an optimization when calling a
>> function in a shared library that's then dispatched depending on the cpu
>> capabilities.
>>
>> However, in our case, where the code is in the same binary, calls to functions
>> implemented in the main binary directly (possibly via a static library) don't
>> go through GOT/PLT. In such a case, use of ifuncs turns a normal direct
>> function call into one going through the GOT/PLT, i.e. makes it indirect. The
>> same is true for calls within a shared library if either explicit symbol
>> visibility is used, or -symbolic, -Wl,-Bsymbolic or such is used. Therefore
>> there's no efficiency gain of ifuncs over a call via function pointer.
>>
>>
>> This isn't because ifunc is implemented badly or something - the reason for
>> this is that dynamic relocations aren't typically implemented by patching all
>> callsites (".text relocations"), which is what you would need to avoid the
>> need for an indirect call to something that fundamentally cannot be a constant
>> address at link time. The reason text relocations are disfavored is that
>> they can make program startup quite slow, that they require allowing
>> modifications to executable pages which are disliked due to the security
>> implications, and that they make the code non-shareable, as the in-memory
>> executable code has to differ from the on-disk code.
>>
>>
>> I actually think ifuncs within the same binary are a tad *slower* than plain
>> function pointer calls, unless -fno-plt is used. Without -fno-plt, an ifunc is
>> called by 1) a direct call into the PLT, 2) loading the target address from
>> the GOT, 3) making an indirect jump to that address. Whereas a "plain
>> indirect function call" is just 1) load target address from variable 2) making
>> an indirect jump to that address. With -fno-plt the callsites themselves load
>> the address from the GOT.
>
> That sounds more accurate than what I wrote. Thanks.

+1, thanks for the detailed explanation, Andres.
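
For anyone following along, here is a rough sketch of the two dispatch styles
being compared. It is an illustration under stated assumptions, not code from
any patch: the names checksum, checksum_generic, checksum_unrolled,
checksum_choose and cpu_has_fast_path are hypothetical, the "unrolled" variant
merely stands in for a CPU-specific implementation, and DEMO_USE_IFUNC is a
made-up macro that keeps the ifunc variant out of the default build since it
needs GCC or clang on an ELF platform with ifunc support.

```c
#include <stddef.h>
#include <stdint.h>

/* Portable fallback (hypothetical name). */
static uint32_t checksum_generic(const unsigned char *p, size_t n)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += p[i];
    return sum;
}

/* Stand-in for a CPU-specific variant; it computes the same result. */
static uint32_t checksum_unrolled(const unsigned char *p, size_t n)
{
    uint32_t sum = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4)
        sum += (uint32_t) p[i] + p[i + 1] + p[i + 2] + p[i + 3];
    for (; i < n; i++)
        sum += p[i];
    return sum;
}

/* Runtime CPU check; assumed to be "sse4.2" purely for illustration. */
static int cpu_has_fast_path(void)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    return __builtin_cpu_supports("sse4.2") != 0;
#else
    return 0;
#endif
}

/*
 * Style 1: plain function pointer. The pointer starts out aimed at a
 * chooser that replaces it on the first call. Each call site loads the
 * pointer from a variable and makes an indirect call.
 */
static uint32_t checksum_choose(const unsigned char *p, size_t n);

uint32_t (*checksum)(const unsigned char *, size_t) = checksum_choose;

static uint32_t checksum_choose(const unsigned char *p, size_t n)
{
    checksum = cpu_has_fast_path() ? checksum_unrolled : checksum_generic;
    return checksum(p, n);
}

#ifdef DEMO_USE_IFUNC
/*
 * Style 2: ifunc. The resolver runs during dynamic relocation and its
 * result is stored in the GOT, so calls still go through the GOT/PLT
 * even though everything lives in the same binary.
 */
static uint32_t (*resolve_checksum(void))(const unsigned char *, size_t)
{
    __builtin_cpu_init();   /* resolvers can run before constructors */
    return cpu_has_fast_path() ? checksum_unrolled : checksum_generic;
}

uint32_t checksum_ifunc(const unsigned char *p, size_t n)
    __attribute__((ifunc("resolve_checksum")));
#endif
```

Comparing the disassembly of a call to checksum() with one to checksum_ifunc(),
built with and without -fno-plt, should show the call sequences described
above.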
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com