Hi,

On 2019-07-16 15:20:33 -0700, Jeff Davis wrote:
> The attached small patch adds new entry points to simplehash.h that
> allow the caller to pass in the already-calculated hash value, so that
> simplehash doesn't need to recalculate it.
>
> This is helpful for Memory-Bounded Hash Aggregation[1], which uses the
> hash value for multiple purposes. For instance, if the hash table is
> full and the group is not already present in the hash table, it needs
> to spill the tuple to disk. In that case, it would use the hash value
> for the initial lookup, then to select the right spill partition.
> Later, when it processes the batch, it will again need the same hash
> value to perform a lookup. By separating the hash value calculation
> from where it's used, we can avoid needlessly recalculating it for each
> of these steps.

Makes sense to me.

> In theory, this could add overhead for "SH_SCOPE extern" for callers
> not specifying their own hash value, because it adds an extra external
> function call. I looked at the generated LLVM and it's a simple tail
> call, and I looked at the generated assembly and it's just an extra
> jmp.

How does it look for gcc? And was that with LTO enabled or not?

Is that still true when the hashtable is defined in a shared library, or
when you compile postgres as a PIE executable? I'm not sure that
compilers can optimize the external function call, at least in the
former case, because the typical function resolution rules IIRC mean
that references to extern functions could be resolved to definitions in
other translation units, *even if* there's a definition in the same TU.

ISTM that it'd be best to just have a static inline helper function
employed by both the hash-passing and the "traditional" insertion
routines? Then that problem ought to not exist anymore.

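A minimal sketch of that layout, assuming a toy fixed-size
linear-probing table rather than the real simplehash.h template. Every
demo_* name below is hypothetical and only illustrates the shape: both
extern entry points are thin wrappers around one static inline helper,
so neither has to go through an extern call to reach the shared logic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define DEMO_NBUCKETS 64        /* power of two; toy table never grows */

typedef struct demo_entry
{
    uint32_t    key;
    uint32_t    hash;
    bool        used;
} demo_entry;

typedef struct demo_hash
{
    demo_entry  data[DEMO_NBUCKETS];
    uint32_t    members;
} demo_hash;

/* extern entry points (hypothetical names) */
demo_entry *demo_insert(demo_hash *tb, uint32_t key, bool *found);
demo_entry *demo_insert_hash(demo_hash *tb, uint32_t key, uint32_t hash,
                             bool *found);

/* Hypothetical hash function; stands in for whatever the caller uses. */
static inline uint32_t
demo_hash_key(uint32_t key)
{
    key ^= key >> 16;
    key *= 0x85ebca6bU;
    key ^= key >> 13;
    return key;
}

/*
 * Shared implementation. Being static inline, it is always resolved
 * within this translation unit, so the wrappers below do not depend on
 * the compiler turning an extern call into a tail call.
 */
static inline demo_entry *
demo_insert_hash_internal(demo_hash *tb, uint32_t key, uint32_t hash,
                          bool *found)
{
    uint32_t    bucket = hash & (DEMO_NBUCKETS - 1);

    for (;;)
    {
        demo_entry *entry = &tb->data[bucket];

        if (!entry->used)
        {
            /* keep one slot free so the probe loop always terminates */
            assert(tb->members < DEMO_NBUCKETS - 1);
            entry->used = true;
            entry->key = key;
            entry->hash = hash;
            tb->members++;
            *found = false;
            return entry;
        }
        if (entry->hash == hash && entry->key == key)
        {
            *found = true;
            return entry;
        }
        bucket = (bucket + 1) & (DEMO_NBUCKETS - 1);   /* linear probe */
    }
}

/* "Traditional" entry point: computes the hash itself. */
demo_entry *
demo_insert(demo_hash *tb, uint32_t key, bool *found)
{
    return demo_insert_hash_internal(tb, key, demo_hash_key(key), found);
}

/* Hash-passing entry point: caller supplies the precomputed hash. */
demo_entry *
demo_insert_hash(demo_hash *tb, uint32_t key, uint32_t hash, bool *found)
{
    return demo_insert_hash_internal(tb, key, hash, found);
}
```

A caller that needs the hash for several purposes would compute it once
with demo_hash_key() and pass it to demo_insert_hash(); other callers
keep using demo_insert() unchanged.
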
Greetings,
Andres Freund