Re: Question regarding fast-hashing in PGSQL - Mailing list pgsql-performance

From: Tom Lane
Subject: Re: Question regarding fast-hashing in PGSQL
Msg-id: 9434.1568839177@sss.pgh.pa.us
In response to: Question regarding fast-hashing in PGSQL (Stephen Conley <cheetah@tanabi.org>)
List: pgsql-performance

Stephen Conley <cheetah@tanabi.org> writes:
> My idea was to hash the string to a bigint, because the likelihood of all 3
> columns colliding is almost 0, and if a duplicate does crop up, it isn't
> the end of the world.

> However, Postgresql doesn't seem to have any 'native' hashing calls that
> result in a bigint.

regression=# \df hashtext*
                               List of functions
   Schema   |       Name       | Result data type | Argument data types | Type
------------+------------------+------------------+---------------------+------
 pg_catalog | hashtext         | integer          | text                | func
 pg_catalog | hashtextextended | bigint           | text, bigint        | func
(2 rows)

The "extended" hash API has only been there since v11, so you
couldn't rely on it if you need portability to old servers.
But otherwise it seems to respond precisely to your question.
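For instance, a minimal sketch (the column and table names here
are made up):

    -- second argument is a 64-bit seed; 0 is the conventional default,
    -- and with seed 0 the low 32 bits should match plain hashtext()
    SELECT hashtextextended(str_col, 0) AS str_hash
      FROM my_table;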

If you do need portability ... does the text string's part of the
hash *really* have to be 64 bits wide?  Why not just concatenate
it with a 32-bit hash of the other fields?
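Something like this, say (an untested sketch; the column names are
invented, and col_b/col_c stand for your other two fields):

    -- 32-bit hash of the string in the high half of the bigint,
    -- 32-bit hash of the remaining columns in the low half
    SELECT (hashtext(str_col)::bigint << 32)
         | (hashtext(col_b::text || ',' || col_c::text)::bigint
            & 4294967295) AS combined_hash
      FROM my_table;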

            regards, tom lane


