From: Tomas Vondra
Subject: Re: Comment fix and question about dshash.c
Msg-id: 6b4d2537-a991-c2cd-1716-cc90c93d3b57@2ndquadrant.com
In response to: Re: Comment fix and question about dshash.c (Antonin Houska <ah@cybertec.at>)
Responses: Re: Comment fix and question about dshash.c (Thomas Munro <thomas.munro@enterprisedb.com>)
List: pgsql-hackers

On 10/27/2018 01:51 PM, Antonin Houska wrote:
> Antonin Houska <ah@cybertec.at> wrote:
> 
>> Thomas Munro <thomas.munro@enterprisedb.com> wrote:
>>
>>> On Sat, Oct 27, 2018 at 12:30 AM Antonin Houska <ah@cybertec.at> wrote:
>>>
>>> Are you saying there is a bug in this logic (which is nbuckets * 0.75
>>> expressed as integer maths), or saying that 0.75 is not a good maximum
>>> load factor?  I looked around at a couple of general purpose hash
>>> tables and saw that some used 0.75 and some used 1.0, as a wasted
>>> space-vs-collision trade-off.  If I have my maths right[1], with 0.75
>>> you expect to have 75 entries in ~53 buckets, but with 1.0 you expect
>>> to have 100 entries in ~64 buckets.
>>
>> I don't know how exactly you apply the [1] formula (what is "n" and what is
>> "N" in your case?), but my consideration was much simpler: For example, if
>> BUCKETS_PER_PARTITION returns 8 (power of 2 is expected here and also more
>> convenient), then MAX_COUNT_PER_PARTITION returns 8 / 2 + 8 / 4 = 6. Thus the
>> hashtable gets resized if we're going to add the 7th entry to the partition,
>> i.e. when the number of entries in the partition is lower than the number
>> of buckets. Is that o.k.?
> 
> Well, it may be o.k. I've just checked what the fill factor means in hash
> index and it's also the number of entries divided by the number of buckets.
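
For context, I believe the ~53 and ~64 bucket counts above come from
the usual occupancy estimate (my reading of Thomas's [1]): inserting n
entries into N buckets uniformly at random leaves about

    N * (1 - (1 - 1/N)^n)

buckets non-empty. With N = 100 that gives 100 * (1 - 0.99^75) ~= 53
occupied buckets at load factor 0.75, and 100 * (1 - 0.99^100) ~= 64
at load factor 1.0.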
> 

Using a load factor of ~0.75 is definitely the right thing to do. One
way to interpret it is "average chain length" (or whatever the
equivalent metric is in a particular hash table implementation), and
one of the main points of hash tables is to eliminate linear scans. We
could pick a value closer to 1.0, but experience says it's not worth
it - it's easy to get annoyingly long chains due to hash collisions,
in exchange for fairly minimal space savings.
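
To make the integer maths from the quoted part concrete, here's a
minimal standalone sketch (not the actual dshash.c code; the function
name is just a stand-in for the MAX_COUNT_PER_PARTITION macro) of the
n/2 + n/4 threshold, which is 0.75 * n in integer arithmetic:

#include <stdio.h>

static int
max_count_per_partition(int buckets_per_partition)
{
    /* integer maths for 0.75 * nbuckets */
    return buckets_per_partition / 2 + buckets_per_partition / 4;
}

int
main(void)
{
    int     buckets = 8;
    int     threshold = max_count_per_partition(buckets);

    /*
     * With 8 buckets this prints a threshold of 6, i.e. adding the
     * 7th entry to the partition triggers a resize, exactly as
     * Antonin describes above.
     */
    printf("buckets = %d, threshold = %d\n", buckets, threshold);
    return 0;
}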

That being said, I wonder if we should tweak NTUP_PER_BUCKET=1 in hash
joins to a lower value. IIRC we ended up with 1.0 because originally it
was set to 10.0, and we reduced it to 1.0 in 9.5, which gave us a nice
speedup. But I don't recall if we tried using an even lower value
(probably not). Maybe we don't need to, because we only build the hash
table at the very end, when we know exactly how many entries it will
contain, so we don't need to do lookups and inserts at the same time,
and we don't need to grow the hash table (at least in the non-parallel
case). And we end up with a 0.75 load factor on average, due to the
doubling (the load factors are essentially uniformly distributed
between 0.5+epsilon and 1.0-epsilon).
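
(Quick arithmetic behind that last claim: if the instantaneous load
factor is uniformly distributed over (0.5, 1.0), then its mean is
(0.5 + 1.0) / 2 = 0.75.)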

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

