Re: Next Steps with Hash Indexes - Mailing list pgsql-hackers

From	Simon Riggs
Subject	Re: Next Steps with Hash Indexes
Date	October 17, 2021 11:00:24
Msg-id	CANbhV-FB418MJ+1UC=sr7XhWvhz=CnVrvqMg7eAWGTGBkM6pFQ@mail.gmail.com
In response to	Re: Next Steps with Hash Indexes (Peter Geoghegan <pg@bowt.ie>)
Responses	Re: Next Steps with Hash Indexes
List	pgsql-hackers

On Thu, 14 Oct 2021 at 16:09, Peter Geoghegan <pg@bowt.ie> wrote:
>
> On Thu, Oct 14, 2021 at 12:48 AM Simon Riggs
> <simon.riggs@enterprisedb.com> wrote:
> > The hash index tuples are 20-bytes each. If that were rounded up to
> > 8-byte alignment, then that would be 24 bytes.
> >
> > Using pageinspect, the max(live_items) on any data page (bucket or
> > overflow) is 407 items, so they can't be 24 bytes long.
>
> That's the same as an nbtree page, which confirms my suspicion. The 20
> bytes consists of a 16 byte tuple, plus a 4 byte line pointer. The
> tuple-level alignment overhead gets you from 12 bytes to 16 bytes with
> a single int4 column. So the padding is there for the taking.

Thank you for nudging me to review the tuple length.

Since hash indexes never store NULLs, and the hash value is always
fixed length, ISTM that we can compress the hash index entries down to
ItemPointerData (6 bytes) plus the hash value(s).

That doesn't change any arguments about size differences between
approaches, but we can significantly reduce index size (by up to 50%).
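
As a rough check on the arithmetic in this exchange, here is a minimal
sketch of the per-page item counts. It assumes the default 8192-byte
page, a 24-byte page header, 16 bytes of hash special space, 4-byte
line pointers and 8-byte MAXALIGN. The constant and function names are
illustrative only, and the "compact" layout is a hypothetical reading
of the proposal (6-byte ItemPointerData plus one 4-byte hash), not an
existing on-disk format.

```python
# Back-of-the-envelope page arithmetic; all sizes are assumptions
# stated in the text above, not values read from a running server.
PAGE_SIZE = 8192      # default block size
PAGE_HEADER = 24      # page header, assumed 24 bytes
HASH_SPECIAL = 16     # hash page special space, assumed 16 bytes
LINE_POINTER = 4      # one line pointer per item
MAXALIGN = 8          # tuple alignment on typical 64-bit builds


def maxalign(nbytes: int) -> int:
    """Round nbytes up to the next multiple of MAXALIGN."""
    return (nbytes + MAXALIGN - 1) // MAXALIGN * MAXALIGN


def items_per_page(entry_bytes: int, line_pointer: int = LINE_POINTER) -> int:
    """Upper bound on items that fit in the usable part of one page."""
    usable = PAGE_SIZE - PAGE_HEADER - HASH_SPECIAL
    return usable // (entry_bytes + line_pointer)


# Current layout: 8-byte index tuple header + 4-byte hash, padded to 16.
current_tuple = maxalign(8 + 4)

# Hypothetical compact layout: 6-byte ItemPointerData + 4-byte hash,
# stored without alignment padding.
compact_entry = 6 + 4

if __name__ == "__main__":
    print("current, 16 + 4 bytes:   ", items_per_page(current_tuple))
    print("if padded to 24 bytes:   ", items_per_page(24 - LINE_POINTER))
    print("compact, 10 + 4 bytes:   ", items_per_page(compact_entry))
    print("compact, no line pointer:", items_per_page(compact_entry, 0))
```

Under these assumptions the current layout gives 407 items per page,
matching the pageinspect figure, while a 24-byte item would give only
339. The hypothetical compact entry gives 582 items per page if line
pointers are kept, and 815 if fixed-size entries allowed them to be
dropped, which is where a reduction approaching 50% would come from.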

-- 
Simon Riggs                http://www.EnterpriseDB.com/
