Re: [HACKERS] Hash Functions - Mailing list pgsql-hackers

From Robert Haas
Subject Re: [HACKERS] Hash Functions
Msg-id CA+TgmoYe3VpvuyMF3JLHUQvyA38h2arQz-pSVn8DuWY6dwbctg@mail.gmail.com
In response to Re: [HACKERS] Hash Functions  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: [HACKERS] Hash Functions  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: [HACKERS] Hash Functions  (Andres Freund <andres@anarazel.de>)
Re: [HACKERS] Hash Functions  (Greg Stark <stark@mit.edu>)
List pgsql-hackers
On Sat, May 13, 2017 at 12:52 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> Can we think of defining separate portable hash functions which can be
> used for the purpose of hash partitioning?

I think that would be a good idea, and I don't think it would even be
that hard.  By data type:

- Integers.  We'd need to make sure that we get the same results for
the same value on big-endian and little-endian hardware, and that
performance is good on both systems.  That seems doable.

- Floats.  There may be different representations in use on different
hardware, which could be a problem.  Tom didn't answer my question
about whether any even-vaguely-modern hardware is still using non-IEEE
floats, which I suspect means that the answer is "no".  If every bit
of hardware we are likely to find uses basically the same
representation of the same float value, then this shouldn't be hard.
(Also, even if this turns out to be hard for floats, using a float as
a partitioning key would be a surprising choice because the default
output representation isn't even unambiguous; you need
extra_float_digits for that.)

- Strings.  There's basically only one representation for a string.
If we assume that the hash code only needs to be portable across
hardware and not across encodings, a position for which I already
argued upthread, then I think this should be manageable.

- Everything Else.  As far as I can tell, everything else is just a
composite of the types above.
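
To make the list above concrete, here is a rough Python sketch of what
"portable" could mean for each case.  Everything in it is an assumption
made for illustration: the function names are hypothetical, 64-bit FNV-1a
is only a stand-in for whatever hash would really be chosen, and
little-endian is an arbitrary choice of canonical byte order.  It is not
PostgreSQL code.  The idea is simply to serialize each value into a fixed
byte layout before hashing, so that the host's byte order never enters
into the result.

```python
import math
import struct

FNV_OFFSET = 0xCBF29CE484222325   # 64-bit FNV-1a offset basis
FNV_PRIME = 0x100000001B3         # 64-bit FNV prime
MASK64 = 0xFFFFFFFFFFFFFFFF


def hash_bytes(data: bytes, seed: int = FNV_OFFSET) -> int:
    """64-bit FNV-1a over a byte string (a stand-in, not a recommendation)."""
    h = seed
    for b in data:
        h ^= b
        h = (h * FNV_PRIME) & MASK64
    return h


def hash_int64(value: int) -> int:
    # "<q" always produces 8 little-endian bytes, whatever the host is.
    return hash_bytes(struct.pack("<q", value))


def hash_float64(value: float) -> int:
    # Assumes IEEE 754 binary64.  Values that compare equal should hash
    # equal, so fold -0.0 into 0.0 and every NaN into one bit pattern.
    if math.isnan(value):
        return hash_bytes(struct.pack("<Q", 0x7FF8000000000000))
    if value == 0.0:
        value = 0.0
    return hash_bytes(struct.pack("<d", value))


def hash_text(value: str, encoding: str = "utf-8") -> int:
    # Portable across hardware only; a different encoding yields
    # different bytes and therefore a different hash.
    return hash_bytes(value.encode(encoding))


def hash_composite(field_hashes) -> int:
    # Fold each field's hash into the running state, in field order.
    h = FNV_OFFSET
    for fh in field_hashes:
        h = hash_bytes(struct.pack("<Q", fh), seed=h)
    return h


# Hypothetical usage: pick one of four partitions for an integer key.
partition = hash_int64(42) % 4
```

On a little-endian machine a real implementation would presumably skip
the explicit serialization and only byte-swap on big-endian hardware,
which is where the performance question for integers comes in.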

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


