Andres Freund <andres@anarazel.de> writes:
> On 2017-12-06 16:38:18 -0500, Todd A. Cook wrote:
>> I found this problem when I dropped 10.1 into a test environment to see
>> what would happen. There was no deliberate attempt to break anything.
> Read Tomas' message at:
> http://archives.postgresql.org/message-id/263b03b1-3e1c-49ca-165a-8ac6751419c4%402ndquadrant.com
I'm confused by Tomas' claim that
>> (essentially hashint8 only ever produces 60% of
>> values from [0,1000000], which likely increases collision rate).
This is directly contradicted by the simple experiments I've done, e.g.:

regression=# select count(distinct hashint8(v)) from generate_series(0,1000000::int8) v;
 count
--------
 999879
(1 row)

regression=# select count(distinct hashint8(v) & (1024*1024-1)) from generate_series(0,1000000::int8) v;
 count
--------
 644157
(1 row)

regression=# select count(distinct hashint8(v) & (1024*1024-1)) from generate_series(0,10000000::int8) v;
  count
---------
 1048514
(1 row)

It's certainly not perfect, but I'm not observing any major failure to
span the output space.
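
As a rough cross-check of those counts (a back-of-envelope sketch, not
something measured above): if one assumes an idealized hash that assigns
each of n inputs independently and uniformly to one of m possible outputs,
the expected number of distinct outputs is m * (1 - (1 - 1/m)^n). The
Python below evaluates that for the three cases queried. The function name
expected_distinct is made up for illustration, and the uniformity
assumption is a model, not a statement about how hashint8 actually behaves.

```python
import math


def expected_distinct(n: int, m: int) -> float:
    """Expected number of distinct outputs when n inputs are hashed
    independently and uniformly into m possible values (idealized model).

    Computes m * (1 - (1 - 1/m)**n) via log1p/expm1 so the result stays
    accurate when m is large (e.g. 2**32) and 1/m is tiny.
    """
    return m * -math.expm1(n * math.log1p(-1.0 / m))


if __name__ == "__main__":
    cases = [
        ("full 32-bit hash, 0..1000000", 1_000_001, 2**32),
        ("low 20 bits,      0..1000000", 1_000_001, 2**20),
        ("low 20 bits,      0..10000000", 10_000_001, 2**20),
    ]
    for label, n, m in cases:
        print(f"{label}: ~{expected_distinct(n, m):.0f} distinct expected")
```

Under that model, roughly 61% of the 2^20 masked values would be expected
to appear for the 0..1000000 input range, so a count near 644157 is about
what an ideal hash would give, rather than evidence of a gap in the output
space.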
regards, tom lane