store narrow values in hash indexes? - Mailing list pgsql-hackers

From Robert Haas
Subject store narrow values in hash indexes?
Date
Msg-id CA+TgmoZ1HBMc4eppP3LwmztAL3kMsStqBcjobqW2U3Gf9S1S3A@mail.gmail.com
Responses Re: store narrow values in hash indexes?  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: store narrow values in hash indexes?  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
Currently, hash indexes always store the hash code in the index, but
not the actual Datum.  It's recently been noted that this can make a
hash index smaller than the corresponding btree index would be if the
column is wide.  However, if the index is being built on a fixed-width
column with a typlen <= sizeof(Datum), we could store the original
value in the hash index rather than the hash code without using any
more space.  That would complicate the code, but I bet it would be
faster: we wouldn't need to set xs_recheck, we could rule out hash
collisions without visiting the heap, and we could support index-only
scans in such cases.
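
To make the eligibility condition concrete, here is a minimal sketch in C of the test described above. It is an illustration only, not code from the PostgreSQL tree: the helper name hash_can_store_value is hypothetical, and Datum is approximated by a pointer-sized integer so the example compiles on its own. It relies on the pg_type convention that typlen is positive for fixed-width types, -1 for varlena types, and -2 for C strings.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for PostgreSQL's Datum, assumed here to be pointer-sized. */
typedef uintptr_t Datum;

/*
 * Hypothetical helper: could a column of this type be stored in the hash
 * index in place of the hash code without using more space?
 *
 * typlen follows the pg_type convention: > 0 for fixed-width types,
 * -1 for varlena types, -2 for null-terminated C strings.
 */
static bool
hash_can_store_value(int16_t typlen)
{
    /* Reject values that pg_type does not use. */
    assert(typlen > 0 || typlen == -1 || typlen == -2);

    /* Variable-width types never qualify. */
    if (typlen <= 0)
        return false;

    /* Fixed-width types qualify only if they fit in one Datum. */
    return (size_t) typlen <= sizeof(Datum);
}
```

Under these assumptions a 4-byte integer column qualifies on any platform, an 8-byte column qualifies only where Datum is 8 bytes wide, and text (typlen -1) never does.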

Another thought is that hash codes are 32 bits, but a Datum is 64 bits
wide on most current platforms.  So we're wasting 4 bytes per index
tuple storing nothing.  If we generated 64-bit hash codes we could
store as many bits of it as a Datum will hold and reduce hash
collisions.  Alternatively, we could try to stick some other useful
information in those bytes, like an abbreviated abbreviated key.
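
As a rough sketch of the wider-hash idea, the following C fragment computes a 64-bit hash and keeps as many bits of it as a Datum will hold. The names fnv1a_64 and hash_to_datum are hypothetical, Datum is again approximated by a pointer-sized integer, and FNV-1a is used only as a convenient, well-known 64-bit hash for illustration. It is not the hash function PostgreSQL uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for PostgreSQL's Datum, assumed here to be pointer-sized. */
typedef uintptr_t Datum;

/* 64-bit FNV-1a, used purely as an illustrative 64-bit hash function. */
static uint64_t
fnv1a_64(const void *key, size_t len)
{
    const unsigned char *p = key;
    uint64_t    h = UINT64_C(0xcbf29ce484222325);   /* FNV offset basis */

    assert(key != NULL || len == 0);

    for (size_t i = 0; i < len; i++)
    {
        h ^= p[i];
        h *= UINT64_C(0x100000001b3);               /* FNV prime */
    }
    return h;
}

/*
 * Hypothetical helper: keep as many bits of the hash as a Datum holds.
 * With an 8-byte Datum all 64 bits survive; with a 4-byte Datum the cast
 * keeps only the low 32 bits, which is no worse than today's 32-bit code.
 */
static Datum
hash_to_datum(uint64_t h)
{
    return (Datum) h;
}
```

On a platform with an 8-byte Datum, two keys would then need to agree on all 64 bits to collide in the index, rather than on 32.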

Not sure if these are good ideas.  They're just ideas.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


