Re: General purpose hashing func in pgbench - Mailing list pgsql-hackers

From Fabien COELHO
Subject Re: General purpose hashing func in pgbench
Date
Msg-id alpine.DEB.2.20.1801180710400.11884@lancre
Whole thread Raw
In response to Re: General purpose hashing func in pgbench  (Ildar Musin <i.musin@postgrespro.ru>)
Responses Re: General purpose hashing func in pgbench
List pgsql-hackers
Hello Ildar,

Patch v8 applies cleanly and compiles. Global and local "make check ok".
Doc build ok.

>> For me "random seed" is what is passed to srandom, so the variable
>> should rather be named hash_seed because there could also be a random
>> seed (actually, there is in another parallel patch:-). Moreover, this
>> seed may or may not be random if set, so calling it "random_seed" is
>> not desirable.
>
> My intention was to introduce seed variable which potentially could be
> used in different contexts, not only for hash functions.

Yes, good point. I'd need it for the pseudo-random permutation function.

> I renamed it to 'hash_seed' for now. But what do you think about naming 
> it simply 'seed' or choosing some other general name?

ISTM "seed" that is prone to being used for something else in a script. 
What about ":default_seed", which says it all?


Some minor suggestions:

In "make_func", I'd assert that nargs is positive in the default branch.

The default seed may be created with just one assignment instead of 
repeated |= ops. Or not:-)

In evalStandardFunc, I'd suggest to avoid the ? construction and use an 
if and a direct setIntValue in both case, removing the "result" 
variable, as done in other branches (eg bitwise operators...). Maybe 
something like:

   if (func == murmur2)
     setIntValue(retval, getHashXXX(val, seed));
   else if (...)
     ...
   else
     Assert(0);

so that it is structurally ready for other hash functions if needed.

Documentation:

In the table, use official names in the description, and add wikipedia 
links, something like:

FNV hash ->
   <ulink url="https://en.wikipedia.org/wiki/Fowler%E2%80%93Noll%E2%80%93Vo_hash_function">FNV-1a hash</ulink>

murmur2 hash ->
   <ulink url="https://en.wikipedia.org/wiki/MurmurHash">MurmurHash2 hash</ulink>


In the text:

"Hash functions accepts" -> "Hash functions 
<literal>hash</literal>, <......> and <....> accept*NO S*"

"... provided hash_seed value is used" => "... provided the value of 
<literal>:hash_seed</literal> is used, which is initialized randomly 
unless set by the command-line <literal>-D</literal> option."

ISTM that the Automatic Variable table should be in alphabetical order.

-- 
Fabien.


pgsql-hackers by date:

Previous
From: Konstantin Knizhnik
Date:
Subject: Re: [HACKERS] Surjective functional indexes
Next
From: Amit Langote
Date:
Subject: Re: [Sender Address Forgery]Re: pg_(total_)relation_size andpartitioned tables