Enforcing unique column with triggers and hash - Mailing list pgsql-general

From Data Growth Pty Ltd
Subject Enforcing unique column with triggers and hash
Date
Msg-id AANLkTinkROFxSB5nWSlX8dLfpNStI8pwgm-ftnQebehL@mail.gmail.com
Whole thread Raw
List pgsql-general
I have a large table (200 million rows) with a column ( 'url' character varying(255)) that I need to be unique.

Currently I do this via a UNIQUE btree index on (lower(url::text))

The index is huge, and I would like to make it much smaller.  Accesses to the table via this key are a tiny portion of the total (<1%), mainly being INSERTs which are not time critical.  SELECTs very rarely use this column in WHERE, and never for time-critical queries.

I would also like to partition the table, but this column and index is not a good choice for the partitioning criteria.

I thought I might be able to create another column ('url_hash'), being a say 64-bit hash of lower(url::text) with a UNIQUE constraint.  64 bits should give me 36 bits of randomness over my 2^28 rows, making the probability of a false match sufficiently rare (probably much rarer than the risk of a programming bug causing a false match).

Alternatively, I could use an even shorter hash (say 32 bits), and allow for the possibility of hash collisions.

Does anybody know of any reference to using a hash in postgresql as a substitute for a unique btree?  I would like to avoid re-inventing the wheel if possible.

Stephen

pgsql-general by date:

Previous
From: Alban Hertroys
Date:
Subject: Re: conditional rules VS 1 unconditional rule with multiple commands?
Next
From: Nilesh Govindarajan
Date:
Subject: Re: No lidbl.so in libpq.so (postgresql 8.4.4)