
Re: Allow simplehash to use already-calculated hash values - Mailing list pgsql-hackers

From	Andres Freund
Subject	Re: Allow simplehash to use already-calculated hash values
Date	July 17, 2019 01:46:49
Msg-id	20190716224649.vqmaubp375cnnhwn@alap3.anarazel.de
In response to	Allow simplehash to use already-calculated hash values (Jeff Davis <pgsql@j-davis.com>)
Responses	Re: Allow simplehash to use already-calculated hash values
List	pgsql-hackers


Hi,

On 2019-07-16 15:20:33 -0700, Jeff Davis wrote:
> The attached small patch adds new entry points to simplehash.h that
> allow the caller to pass in the already-calculated hash value, so that
> simplehash doesn't need to recalculate it.
> 
> This is helpful for Memory-Bounded Hash Aggregation[1], which uses the
> hash value for multiple purposes. For instance, if the hash table is
> full and the group is not already present in the hash table, it needs
> to spill the tuple to disk. In that case, it would use the hash value
> for the initial lookup, then to select the right spill partition.
> Later, when it processes the batch, it will again need the same hash
> value to perform a lookup. By separating the hash value calculation
> from where it's used, we can avoid needlessly recalculating it for each
> of these steps.

Makes sense to me.
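
To illustrate the reuse described above, here is a minimal sketch in which one hash value is computed once and then used both for the bucket lookup and for choosing a spill partition. All names (demo_hash_u32, demo_bucket, demo_spill_partition) are hypothetical, and taking the low bits for the bucket and the high bits for the partition is an assumption made for this example, not something taken from the patch.

```c
#include <stdint.h>

/* Hypothetical 32-bit mixer standing in for the real hash function. */
static uint32_t
demo_hash_u32(uint32_t key)
{
    key ^= key >> 16;
    key *= 0x85ebca6bU;
    key ^= key >> 13;
    key *= 0xc2b2ae35U;
    key ^= key >> 16;
    return key;
}

/* Bucket index from the low bits; nbuckets must be a power of two. */
static uint32_t
demo_bucket(uint32_t hash, uint32_t nbuckets)
{
    return hash & (nbuckets - 1);
}

/*
 * Spill partition from the high bits (assumption for this sketch), so that
 * the partition choice is not tied to the bucket choice.
 * npartitions_log2 must be in the range 0..31.
 */
static uint32_t
demo_spill_partition(uint32_t hash, uint32_t npartitions_log2)
{
    if (npartitions_log2 == 0)
        return 0;               /* avoid an undefined shift by 32 */
    return hash >> (32 - npartitions_log2);
}
```

A caller would compute demo_hash_u32(key) once, pass the result to the lookup, and hand the same value to demo_spill_partition() only if the tuple has to be spilled.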

> In theory, this could add overhead for "SH_SCOPE extern" for callers
> not specifying their own hash value, because it adds an extra external
> function call. I looked at the generated LLVM and it's a simple tail
> call, and I looked at the generated assembly and it's just an extra
> jmp.

How does it look for gcc? And was that with LTO enabled or not?

Is that still true when the hashtable is defined in a shared library, or
when you compile postgres as a PIE executable? I'm not sure that
compilers can optimize the external function call at least in the former
case, because the typical function resolution rules IIRC mean that
references to extern functions could be resolved to definitions in other
translation units, *even if* there's a definition in the same TU.

ISTM that it'd be best to just have a static inline helper function
employed by both the hash-passing and the "traditional" insertion routines?
Then that problem ought to not exist anymore.
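
Roughly what I have in mind, as a standalone sketch rather than actual simplehash.h code. The demo_* names, the fixed-size table and the plain linear probing are all assumptions made to keep the example short. The point is only that both externally visible entry points call the same static inline helper, so neither has to go through an extern call to reach the other.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_NBUCKETS 64        /* power of two; fixed size for the sketch */

typedef struct demo_entry
{
    uint32_t    key;
    uint32_t    hash;
    bool        used;
} demo_entry;

typedef struct demo_hash
{
    demo_entry  buckets[DEMO_NBUCKETS];
    size_t      members;
} demo_hash;

/* Hypothetical hash function for uint32 keys. */
static inline uint32_t
demo_hash_key(uint32_t key)
{
    key ^= key >> 16;
    key *= 0x85ebca6bU;
    key ^= key >> 13;
    key *= 0xc2b2ae35U;
    key ^= key >> 16;
    return key;
}

/*
 * Shared helper: static inline, so it is private to this translation unit
 * and a call to it cannot be redirected to a definition elsewhere.
 * Returns NULL if the table is full.
 */
static inline demo_entry *
demo_insert_hash_internal(demo_hash *tb, uint32_t key, uint32_t hash,
                          bool *found)
{
    uint32_t    mask = DEMO_NBUCKETS - 1;
    uint32_t    pos = hash & mask;

    for (uint32_t i = 0; i < DEMO_NBUCKETS; i++)
    {
        demo_entry *e = &tb->buckets[pos];

        if (!e->used)
        {
            /* empty slot: insert here */
            e->used = true;
            e->key = key;
            e->hash = hash;
            tb->members++;
            *found = false;
            return e;
        }
        if (e->hash == hash && e->key == key)
        {
            *found = true;
            return e;
        }
        pos = (pos + 1) & mask; /* linear probing */
    }
    *found = false;
    return NULL;
}

/* "Traditional" entry point: computes the hash itself. */
demo_entry *
demo_insert(demo_hash *tb, uint32_t key, bool *found)
{
    return demo_insert_hash_internal(tb, key, demo_hash_key(key), found);
}

/* Hash-passing entry point: the caller supplies the hash. */
demo_entry *
demo_insert_hash(demo_hash *tb, uint32_t key, uint32_t hash, bool *found)
{
    return demo_insert_hash_internal(tb, key, hash, found);
}
```

With this shape the only extern symbols are the two thin wrappers, and the compiler is free to inline the helper into each of them.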

Greetings,

Andres Freund
