Re: General purpose hashing func in pgbench - Mailing list pgsql-hackers

From	Fabien COELHO
Subject	Re: General purpose hashing func in pgbench
Date	January 12, 2018 18:03:00
Msg-id	alpine.DEB.2.20.1801121555530.13422@lancre
In response to	Re: General purpose hashing func in pgbench (Ildar Musin <i.musin@postgrespro.ru>)
List	pgsql-hackers

Hello Ildar,

>> Hmm. I do not think that we should want a shared seed value. The seed
>> should be different for each call so as to avoid undesired
>> correlations. If wanted, correlation could be obtained by using an
>> explicit identical seed.
>
> Probably I'm missing something but I cannot see the point. If we change
> the seed on every invocation then we get a uniform-like distribution (see
> attached image). And we don't get the same hash value for the same input
> which is the whole point of hash functions. Maybe I didn't understand
> you correctly.

I suggest fixing the seed when the script is parsed, so that a given hash
call uses the same seed throughout one pgbench invocation, but the seed
changes when pgbench is re-invoked, so that the results differ from one
run to the next.

Also, if hash(:i) and hash(:j) appear in two distinct scripts, ISTM that
we do not necessarily want the same seed: otherwise i == j would imply
hash(i) == hash(j), which may not be a desirable property for some use
cases.

Maybe it would be desirable for other use cases, though.
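
To make the proposal concrete, here is a rough sketch in Python, not
pgbench code. It assumes a stand-in FNV-1a style hash with the seed mixed
into the initial state; the names seeded_hash, HashCall and parse_scripts
are hypothetical and only serve the illustration. Each hash call gets its
seed once, at "parse" time, unless an explicit seed is supplied.

```python
import random

MASK64 = (1 << 64) - 1
FNV_OFFSET = 0xCBF29CE484222325  # standard 64-bit FNV offset basis
FNV_PRIME = 0x100000001B3        # standard 64-bit FNV prime


def seeded_hash(value, seed):
    """Stand-in hash (assumption): FNV-1a over the 8 little-endian bytes
    of value, with the seed XORed into the initial state."""
    h = (FNV_OFFSET ^ seed) & MASK64
    for byte in (value & MASK64).to_bytes(8, "little"):
        h ^= byte
        h = (h * FNV_PRIME) & MASK64
    return h


class HashCall:
    """One hash(...) call site in a script (hypothetical model).

    The seed is drawn once when the call is created, i.e. at parse time,
    unless the caller passes an explicit seed."""

    def __init__(self, rng, seed=None):
        self.seed = rng.getrandbits(64) if seed is None else seed

    def __call__(self, value):
        return seeded_hash(value, self.seed)


def parse_scripts(run_rng):
    """Pretend to parse two scripts, each containing one hash call."""
    return HashCall(run_rng), HashCall(run_rng)


# One "invocation". The generator is seeded with a constant only to keep
# the demo reproducible; a real run would seed it from entropy.
hash_a, hash_b = parse_scripts(random.Random(1))

# Within the run, each call site is a proper function of its input.
same_input_same_output = hash_a(42) == hash_a(42)

# Across call sites the seeds differ, so i == j no longer forces equal hashes.
sites_decorrelated = hash_a(42) != hash_b(42)

# Correlation on demand: give both call sites the same explicit seed.
corr_a = HashCall(random.Random(), seed=5432)
corr_b = HashCall(random.Random(), seed=5432)
explicitly_correlated = corr_a(42) == corr_b(42)
```

With this arrangement a re-invocation (a differently seeded generator
passed to parse_scripts) yields new seeds and hence different results,
while the explicit-seed variant covers the use cases where correlation
between scripts is wanted.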


> Anyway I've attached a new version with some tests and docs added.

-- 
Fabien.
