Please improve "Limitations" section for GIN indexes - Mailing list pgsql-docs

From Marcin Barczynski
Subject Please improve "Limitations" section for GIN indexes
Date
Msg-id CAOhG4we40pPe-53n9LW8cYY31Ng7XSiLde_nZjbxRj0aQ_EX9w@mail.gmail.com
Whole thread Raw
List pgsql-docs
We've learned the hard way that keys are never removed in GIN indexes, leading to infinite index size growth in certain scenarios. The PostgreSQL documentation doesn't mention it even in the "Limitations" section for GIN indexes (https://www.postgresql.org/docs/15/gin-limit.html).

The only place I found info about the unusual behavior is README file in the source code: https://github.com/postgres/postgres/blob/master/src/backend/access/gin/README#L391.


We keep a tree-like structure in the database and use GIN indexes to find descendants of a node quickly. The simplified table looks as follows:

node_id | ancestor_ids
--------|-------------
 1      | []
 2      | [1]
 15     | [1, 2]

node_id is a sequence, and nodes are often added and removed from the table: today node_ids are in the range from 1 to 100K but next month it will be 500K-600K. Because GIN index never removes entry keys, it will contain keys from 1 to 600K. In our case the GIN index became 2.5 times larger than the table - that's the hard-way part.

How about sparing others the hard-way part and explicitly mentioning the unusual behavior in the "Limitations" section for GIN indexes?

Best regards,
Marcin Barczyński

pgsql-docs by date:

Previous
From: Stephen Frost
Date:
Subject: Re: Make SSPI documentation clearer
Next
From: Tomas Pospisek
Date:
Subject: Re: Make SSPI documentation clearer