Antonin Houska <ah@cybertec.at> wrote:
> Thomas Munro <thomas.munro@enterprisedb.com> wrote:
>
> > On Sat, Oct 27, 2018 at 12:30 AM Antonin Houska <ah@cybertec.at> wrote:
> >
> > Are you saying there is a bug in this logic (which is nbuckets * 0.75
> > expressed as integer maths), or saying that 0.75 is not a good maximum
> > load factor? I looked around at a couple of general purpose hash
> > tables and saw that some used 0.75 and some used 1.0, as a wasted
> > space-vs-collision trade-off. If I have my maths right[1], with 0.75
> > you expect to have 75 entries in ~53 buckets, but with 1.0 you expect
> > to have 100 entries in ~64 buckets.
>
> I don't know how exactly you apply the [1] formula (what is "n" and what is
> "N" in your case?), but my consideration was much simpler: For example, if
> BUCKETS_PER_PARTITION returns 8 (power of 2 is expected here and also more
> convenient), then MAX_COUNT_PER_PARTITION returns 8 / 2 + 8 / 4 = 6. Thus the
> hashtable gets resized if we're going to add the 7th entry to the partition,
> i.e. we the number of entries in the partition is lower than the number of
> buckets. Is that o.k.?
Well, it may be o.k. I've just checked what the fill factor means in hash
index and it's also the number of entries divided by the number of buckets.
--
Antonin Houska
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26, A-2700 Wiener Neustadt
Web: https://www.cybertec-postgresql.com