Re: General purpose hashing func in pgbench - Mailing list pgsql-hackers

From Ildar Musin
Subject Re: General purpose hashing func in pgbench
Date
Msg-id 699225ee-c30c-b22e-4028-415e9eab26b2@postgrespro.ru
Whole thread Raw
In response to Re: General purpose hashing func in pgbench  (Fabien COELHO <coelho@cri.ensmp.fr>)
Responses Re: General purpose hashing func in pgbench
Re: General purpose hashing func in pgbench
List pgsql-hackers
Hello Fabien,


20/12/2017 10:36, Fabien COELHO пишет:

> As there may be several hash functions included in the long run. I'd
> suggest that the hash function should be named more precisely, eg
> "hash_fnv1a".
Done. Added "hash_murmur2" too, see below.
> The image looks like the distribution is more regularly scattered than
> actually randomized... Maybe this is because the first highest 256
> values are really scattered by the process multiply/modulo process. Or
> maybe this is an optical effect?
>
After your comment I searched the internet for different hashing
algorithms comparison wrt randomness and found an interesting post at
stackexchange [1]. According to author's research the murmur2 algorithm
has the best randomness rate (among those he tested). So I implemented
it (using original code by Austin Appleby as a reference [2]) and
conducted few experiments. Results are in attachement. Indeed, comparing
to murmur2 the FNV distribution seems pretty regular.
> ISTM that there are undesired utf8 chars in a comment. Should be kept
> ASCII.
Oops, I copy-pasted the algorithm name from wikipedia, didn't notice
there were some fancy unicode hyphens.
> I would put the actual hash computation in a separate function rather
> than inlined in the evaluator.
Done.
> Add the submission to the next CF?
I think it is not commitfest ready yet -- I need to add some
documentation and tests first.


[1]
https://softwareengineering.stackexchange.com/questions/49550/which-hashing-algorithm-is-best-for-uniqueness-and-speed
[2] https://github.com/aappleby/smhasher/blob/master/src/MurmurHash2.cpp

--
Ildar Musin
Postgres Professional:
http://www.postgrespro.com
Russian Postgres Company

Attachment

pgsql-hackers by date:

Previous
From: David Rowley
Date:
Subject: Re: [HACKERS] path toward faster partition pruning
Next
From: Thomas Munro
Date:
Subject: Re: condition variable cleanup and subtransactions