Re: General purpose hashing func in pgbench - Mailing list pgsql-hackers

From Teodor Sigaev
Subject Re: General purpose hashing func in pgbench
Date
Msg-id 1ad42902-ef1f-a715-7010-1288fb8aae89@sigaev.ru
In response to Re: General purpose hashing func in pgbench  (Ildar Musin <i.musin@postgrespro.ru>)
Responses Re: General purpose hashing func in pgbench  (Ildar Musin <i.musin@postgrespro.ru>)
List pgsql-hackers
>> Patch applies, compiles, pgbench & global "make check" ok, doc built ok.

Agree.

If I understand the upthread discussion correctly, the implementation of the
Murmur hash algorithm is based on Austin Appleby's work:
https://github.com/aappleby/smhasher/blob/master/src/MurmurHash2.cpp

If so, I have one remark and two objections:

1) It seems like a good idea to credit Austin Appleby in the code comments.

2) The reference implementation says so directly (link above):
// 2. It will not produce the same results on little-endian and big-endian
//    machines.

I don't think that is a good thing for testing and benchmarking, for several
reasons: the same script could produce a different data set, different
selects, and a different distribution depending on the machine's byte order.
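
To illustrate one possible way around this: the sketch below is not taken
from the patch. It only shows the general technique of assembling a 32-bit
word from individual bytes in a fixed order, so the value fed to the hash is
the same on little-endian and big-endian machines. The helper name load_le32
is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from the patch): interpret 4 bytes as a
 * little-endian 32-bit value.  The bytes are combined with shifts, so the
 * result does not depend on the byte order of the host.
 */
static uint32_t
load_le32(const unsigned char *p)
{
    assert(p != NULL);

    return (uint32_t) p[0]
        | ((uint32_t) p[1] << 8)
        | ((uint32_t) p[2] << 16)
        | ((uint32_t) p[3] << 24);
}
```

Because it reads one byte at a time, such a helper also makes no assumption
about the alignment of the input.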

3) Again, from the comments of the reference implementation:
// Note - This code makes a few assumptions about how your machine behaves -
// 1. We can read a 4-byte value from any address without crashing

That's not true for all supported platforms. Any box with strict alignment
requirements will raise SIGBUS here.
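
For illustration only, a common way to avoid the unaligned read is to copy
the bytes into a local variable with memcpy instead of dereferencing a cast
pointer. This is a minimal sketch and not code from the patch; the helper
name load_u32_unaligned is hypothetical. Note that it still returns the
value in host byte order, so it addresses only the alignment problem, not
the endianness one.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not from the patch): read a 32-bit value from an
 * address that may not be 4-byte aligned.  memcpy has no alignment
 * requirement on its arguments, unlike *(const uint32_t *) p.
 */
static uint32_t
load_u32_unaligned(const void *p)
{
    uint32_t    v;

    assert(p != NULL);
    memcpy(&v, p, sizeof(v));
    return v;
}
```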



-- 
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
                                                    WWW: http://www.sigaev.ru/

