Thread: BUG #15679: Partial HASH index takes too much space

BUG #15679: Partial HASH index takes too much space

From
PG Bug reporting form
Date:
The following bug has been logged on the website:

Bug reference:      15679
Logged by:          Ortwin Gentz
Email address:      gentz@futuretap.com
PostgreSQL version: 10.6
Operating system:   Amazon RDS
Description:

I created a partial HASH index for a sparsely populated column:

CREATE INDEX partial_hash ON mytable USING HASH(my_id) WHERE my_ID IS NOT
NULL;

Even though the my_id VARCHAR(255) column is populated (NON NULL) only for a
few hundred records, the index takes 256 MB of space (for a table with > 10m
records). Also, it doesn't make a difference if the index is created as a
partial index ("WHERE my_ID IS NOT NULL") or as a full index.

In contrast to that, a BTREE index differs considerably in space for full
and partial:

CREATE INDEX full_btree ON mytable (my_id); # 543 MB
CREATE INDEX partial_btree ON mytable (my_id) WHERE my_ID IS NOT NULL; # 16
KB

See also this StackExchange conversation where user jjanes considers the
behavior a bug in the hash index code:
https://dba.stackexchange.com/a/231660/25337


Re: BUG #15679: Partial HASH index takes too much space

From
David Rowley
Date:
On Sat, 9 Mar 2019 at 09:38, PG Bug reporting form
<noreply@postgresql.org> wrote:
> I created a partial HASH index for a sparsely populated column:
>
> CREATE INDEX partial_hash ON mytable USING HASH(my_id) WHERE my_ID IS NOT
> NULL;
>
> Even though the my_id VARCHAR(255) column is populated (NON NULL) only for a
> few hundred records, the index takes 256 MB of space (for a table with > 10m
> records). Also, it doesn't make a difference if the index is created as a
> partial index ("WHERE my_ID IS NOT NULL") or as a full index.
>
> In contrast to that, a BTREE index differs considerably in space for full
> and partial:
>
> CREATE INDEX full_btree ON mytable (my_id); # 543 MB
> CREATE INDEX partial_btree ON mytable (my_id) WHERE my_ID IS NOT NULL; # 16
> KB

Thanks for the report. This issue is being discussed over in
https://www.postgresql.org/message-id/flat/CAMkU%3D1x0k%2BdRQHDUgp4BjFeSgxyLBBXyKNY5Pt1Yu6YHB0mhKA%40mail.gmail.com

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services