Re: Question regarding fast-hashing in PGSQL - Mailing list pgsql-performance

From: Tom Lane
Subject: Re: Question regarding fast-hashing in PGSQL
Msg-id: 9434.1568839177@sss.pgh.pa.us
In response to: Question regarding fast-hashing in PGSQL (Stephen Conley <cheetah@tanabi.org>)
List: pgsql-performance

Stephen Conley <cheetah@tanabi.org> writes:
> My idea was to hash the string to a bigint, because the likelihood of all 3
> columns colliding is almost 0, and if a duplicate does crop up, it isn't
> the end of the world.

> However, Postgresql doesn't seem to have any 'native' hashing calls that
> result in a bigint.

regression=# \df hashtext*
                               List of functions
   Schema   |       Name       | Result data type | Argument data types | Type
------------+------------------+------------------+---------------------+------
 pg_catalog | hashtext         | integer          | text                | func
 pg_catalog | hashtextextended | bigint           | text, bigint        | func
(2 rows)

The "extended" hash API has only been there since v11, so you
couldn't rely on it if you need portability to old servers.
But otherwise it seems to respond precisely to your question.
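For instance, a minimal sketch (the column and table names here
are made up):

    -- second argument is a 64-bit seed; 0 is the conventional default,
    -- and with seed 0 the low 32 bits should match plain hashtext()
    SELECT hashtextextended(str_col, 0) AS str_hash
      FROM my_table;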

If you do need portability ... does the text string's part of the
hash *really* have to be 64 bits wide?  Why not just concatenate
it with a 32-bit hash of the other fields?
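Something like this, say (an untested sketch; the column names are
invented, and col_b/col_c stand for your other two fields):

    -- 32-bit hash of the string in the high half of the bigint,
    -- 32-bit hash of the remaining columns in the low half
    SELECT (hashtext(str_col)::bigint << 32)
         | (hashtext(col_b::text || ',' || col_c::text)::bigint
            & 4294967295) AS combined_hash
      FROM my_table;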

            regards, tom lane


