Thread: How to detect if a postgresql gin index is bloated

How to detect if a postgresql gin index is bloated

From
"Wong, Kam Fook (TR Technology)"
Date:

I have a GIN index.  Is there a way to detect if a postgres GIN index is bloated or not?  The regular select * from pgstatindex(‘indexname’); does not work because the GIN index is not btree index.

 

Thank you

This e-mail is for the sole use of the intended recipient and contains information that may be privileged and/or confidential. If you are not an intended recipient, please notify the sender by return e-mail and delete this e-mail and any attachments. Certain required legal entity disclosures can be accessed on our website: https://www.thomsonreuters.com/en/resources/disclosures.html

Re: How to detect if a postgresql gin index is bloated

From
khan Affan
Date:
Hello Kam Fook
You can use pgstattuple extension, & you can use pg_trgm or gin_index_stats functions from the pgstattuple extension to check for bloat in GIN indexes.
Thank & Regards

Muhammad Affan (아판)

PostgreSQL Technical Support Engineer / Pakistan R&D


On Thu, Jul 25, 2024 at 2:07 AM Wong, Kam Fook (TR Technology) <kamfook.wong@thomsonreuters.com> wrote:

I have a GIN index.  Is there a way to detect if a postgres GIN index is bloated or not?  The regular select * from pgstatindex(‘indexname’); does not work because the GIN index is not btree index.

 

Thank you

This e-mail is for the sole use of the intended recipient and contains information that may be privileged and/or confidential. If you are not an intended recipient, please notify the sender by return e-mail and delete this e-mail and any attachments. Certain required legal entity disclosures can be accessed on our website: https://www.thomsonreuters.com/en/resources/disclosures.html

Re: How to detect if a postgresql gin index is bloated

From
Keith Fiske
Date:


On Thu, Jul 25, 2024 at 8:59 AM khan Affan <bawag773@gmail.com> wrote:
Hello Kam Fook
You can use pgstattuple extension, & you can use pg_trgm or gin_index_stats functions from the pgstattuple extension to check for bloat in GIN indexes.
Thank & Regards

Muhammad Affan (아판)

PostgreSQL Technical Support Engineer / Pakistan R&D


On Thu, Jul 25, 2024 at 2:07 AM Wong, Kam Fook (TR Technology) <kamfook.wong@thomsonreuters.com> wrote:

I have a GIN index.  Is there a way to detect if a postgres GIN index is bloated or not?  The regular select * from pgstatindex(‘indexname’); does not work because the GIN index is not btree index.

 

Thank you

This e-mail is for the sole use of the intended recipient and contains information that may be privileged and/or confidential. If you are not an intended recipient, please notify the sender by return e-mail and delete this e-mail and any attachments. Certain required legal entity disclosures can be accessed on our website: https://www.thomsonreuters.com/en/resources/disclosures.html


Any more insight on how to use those two options to actually calculate GIN bloat? From what I could tell pgstattuple didn't provide anything that could be used. I hadn't looked further into the other yet, but if you know how to do that already that info would be great.

--
Keith Fiske
Senior Database Engineer
Crunchy Data - http://crunchydata.com

Re: How to detect if a postgresql gin index is bloated

From
Laurenz Albe
Date:
On Thu, 2024-07-25 at 23:01 -0400, Keith Fiske wrote:
> Any more insight on how to use those two options to actually calculate GIN bloat?
> From what I could tell pgstattuple didn't provide anything that could be used.
> I hadn't looked further into the other yet, but if you know how to do that already
> that info would be great.

I don't think that there is anything smarter than to create a new index with the
same definition and see if it is much smaller than the original index...

Yours,
Laurenz Albe



Re: How to detect if a postgresql gin index is bloated

From
Muhammad Ikram
Date:
Hi Wong,

Could you check the size of the table and Index both and see whether index size is disproportionately larger than the size of the table ? If yes then adopt the smarter way suggested in above response by Laurenz Albe

Regards,
Ikram


On Fri, Jul 26, 2024 at 9:58 AM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
On Thu, 2024-07-25 at 23:01 -0400, Keith Fiske wrote:
> Any more insight on how to use those two options to actually calculate GIN bloat?
> From what I could tell pgstattuple didn't provide anything that could be used.
> I hadn't looked further into the other yet, but if you know how to do that already
> that info would be great.

I don't think that there is anything smarter than to create a new index with the
same definition and see if it is much smaller than the original index...

Yours,
Laurenz Albe




--
Muhammad Ikram

Re: How to detect if a postgresql gin index is bloated

From
khan Affan
Date:

The pgstattuple function provides the following useful columns:

  • table_len: The length of the table.
  • tuple_count: The number of tuples.
  • tuple_len: The length of the tuples.
  • dead_tuple_count: The number of dead tuples.
  • dead_tuple_len: The length of dead tuples.
  • free_space: The amount of free space.

Example

How you might use this function:


SELECT * FROM pgstattuple('gin_index_name');

Interpreting Results

  • If the dead_tuple_count is high relative to the tuple_count, this indicates bloat.
  • If free_space is a significant portion of table_len, the index may be bloated.

Using pg_trgm for Trigram Indexes

If you're using a GIN index with the pg_trgm extension (for trigram indexes), you might also consider analyzing the table and index sizes directly:


SELECT pg_size_pretty(pg_relation_size('table_name')) AS table_size, pg_size_pretty(pg_total_relation_size('table_name')) AS total_size, pg_size_pretty(pg_relation_size('gin_index_name')) AS index_size;

Vacuum and Reindex

If you determine that your GIN index is bloated, you can use VACUUM and REINDEX to reclaim space:

VACUUM ANALYZE your_table_name; REINDEX INDEX your_gin_index_name;

By following these steps, you should be able to detect and address bloat in your PostgreSQL GIN indexes effectively.


On Fri, Jul 26, 2024 at 8:01 AM Keith Fiske <keith.fiske@crunchydata.com> wrote:


On Thu, Jul 25, 2024 at 8:59 AM khan Affan <bawag773@gmail.com> wrote:
Hello Kam Fook
You can use pgstattuple extension, & you can use pg_trgm or gin_index_stats functions from the pgstattuple extension to check for bloat in GIN indexes.
Thank & Regards

Muhammad Affan (아판)

PostgreSQL Technical Support Engineer / Pakistan R&D


On Thu, Jul 25, 2024 at 2:07 AM Wong, Kam Fook (TR Technology) <kamfook.wong@thomsonreuters.com> wrote:

I have a GIN index.  Is there a way to detect if a postgres GIN index is bloated or not?  The regular select * from pgstatindex(‘indexname’); does not work because the GIN index is not btree index.

 

Thank you

This e-mail is for the sole use of the intended recipient and contains information that may be privileged and/or confidential. If you are not an intended recipient, please notify the sender by return e-mail and delete this e-mail and any attachments. Certain required legal entity disclosures can be accessed on our website: https://www.thomsonreuters.com/en/resources/disclosures.html


Any more insight on how to use those two options to actually calculate GIN bloat? From what I could tell pgstattuple didn't provide anything that could be used. I hadn't looked further into the other yet, but if you know how to do that already that info would be great.

--
Keith Fiske
Senior Database Engineer
Crunchy Data - http://crunchydata.com

Re: How to detect if a postgresql gin index is bloated

From
Keith Fiske
Date:


On Fri, Jul 26, 2024 at 6:49 AM khan Affan <bawag773@gmail.com> wrote:

The pgstattuple function provides the following useful columns:

  • table_len: The length of the table.
  • tuple_count: The number of tuples.
  • tuple_len: The length of the tuples.
  • dead_tuple_count: The number of dead tuples.
  • dead_tuple_len: The length of dead tuples.
  • free_space: The amount of free space.

Example

How you might use this function:


SELECT * FROM pgstattuple('gin_index_name');

Interpreting Results

  • If the dead_tuple_count is high relative to the tuple_count, this indicates bloat.
  • If free_space is a significant portion of table_len, the index may be bloated.

Using pg_trgm for Trigram Indexes

If you're using a GIN index with the pg_trgm extension (for trigram indexes), you might also consider analyzing the table and index sizes directly:


SELECT pg_size_pretty(pg_relation_size('table_name')) AS table_size, pg_size_pretty(pg_total_relation_size('table_name')) AS total_size, pg_size_pretty(pg_relation_size('gin_index_name')) AS index_size;

Vacuum and Reindex

If you determine that your GIN index is bloated, you can use VACUUM and REINDEX to reclaim space:

VACUUM ANALYZE your_table_name; REINDEX INDEX your_gin_index_name;

By following these steps, you should be able to detect and address bloat in your PostgreSQL GIN indexes effectively.




Have you tried using that pgstattuple function with a GIN index?

postgres=# select * from pgstattuple('employees_systems_access');
ERROR:  "employees_systems_access" (gin index) is not supported 

There is a function specifically for them and it does not return the same info

postgres=# select * from pgstatginindex('employees_systems_access');
-[ RECORD 1 ]--+--
version        | 2
pending_pages  | 0
pending_tuples | 0

--
Keith Fiske
Senior Database Engineer
Crunchy Data - http://crunchydata.com

Re: How to detect if a postgresql gin index is bloated

From
khan Affan
Date:
Hi

Let me clarify this for you. Although pgstattuple is a widely used extension included in PostgreSQL’s source code,

It is not precompiled with the database installation. To use it, you first need to build it from the source code in the 'contrib' directory. 

After building the extension, start your PostgreSQL server and connect to the database using psql.

For your case:postgres=# select * from pgstattuple('employees_systems_access');
ERROR:  "employees_systems_access" (gin index) is not supported; I misprinted table_name with gindex_name.

this is true pgstattuple  support table and standard indexes 

I consider analyzing the table and index sizes directly,
If the dead_tuple_count is high relative to the tuple_count, this indicates bloat.
If free_space is a significant portion of table_len, the index may be bloated.

And used pgstattuple for tables and standard indexes.
pgstatginindex specifically for GIN indexes to get relevant statistics and detect bloat.

Then, if needed, use re-index or vaccum as per requirement.

 Make sure the pgstattuple extension is installed and properly configured in your PostgreSQL database before using these functions.  

I recently build the source code from 17 beta. And attaching the extension version 1.5 file as well.


 image.png

On Fri, Jul 26, 2024 at 6:11 PM Keith Fiske <keith.fiske@crunchydata.com> wrote:


On Fri, Jul 26, 2024 at 6:49 AM khan Affan <bawag773@gmail.com> wrote:

The pgstattuple function provides the following useful columns:

  • table_len: The length of the table.
  • tuple_count: The number of tuples.
  • tuple_len: The length of the tuples.
  • dead_tuple_count: The number of dead tuples.
  • dead_tuple_len: The length of dead tuples.
  • free_space: The amount of free space.

Example

How you might use this function:


SELECT * FROM pgstattuple('gin_index_name');

Interpreting Results

  • If the dead_tuple_count is high relative to the tuple_count, this indicates bloat.
  • If free_space is a significant portion of table_len, the index may be bloated.

Using pg_trgm for Trigram Indexes

If you're using a GIN index with the pg_trgm extension (for trigram indexes), you might also consider analyzing the table and index sizes directly:


SELECT pg_size_pretty(pg_relation_size('table_name')) AS table_size, pg_size_pretty(pg_total_relation_size('table_name')) AS total_size, pg_size_pretty(pg_relation_size('gin_index_name')) AS index_size;

Vacuum and Reindex

If you determine that your GIN index is bloated, you can use VACUUM and REINDEX to reclaim space:

VACUUM ANALYZE your_table_name; REINDEX INDEX your_gin_index_name;

By following these steps, you should be able to detect and address bloat in your PostgreSQL GIN indexes effectively.




Have you tried using that pgstattuple function with a GIN index?

postgres=# select * from pgstattuple('employees_systems_access');
ERROR:  "employees_systems_access" (gin index) is not supported 

There is a function specifically for them and it does not return the same info

postgres=# select * from pgstatginindex('employees_systems_access');
-[ RECORD 1 ]--+--
version        | 2
pending_pages  | 0
pending_tuples | 0

--
Keith Fiske
Senior Database Engineer
Crunchy Data - http://crunchydata.com
Attachment

RE: [EXT] Re: How to detect if a postgresql gin index is bloated

From
"Wong, Kam Fook (TR Technology)"
Date:

Thank you Affan, Keith, Laurenze, Ikram. 

 

From: khan Affan <bawag773@gmail.com>
Sent: Friday, July 26, 2024 9:47 AM
To: Keith Fiske <keith.fiske@crunchydata.com>
Cc: Wong, Kam Fook (TR Technology) <kamfook.wong@thomsonreuters.com>; Pgsql-admin <pgsql-admin@lists.postgresql.org>
Subject: [EXT] Re: How to detect if a postgresql gin index is bloated

 

External Email: Use caution with links and attachments.

 

Hi

Let me clarify this for you. Although pgstattuple is a widely used extension included in PostgreSQL’s source code,

It is not precompiled with the database installation. To use it, you first need to build it from the source code in the 'contrib' directory. 

After building the extension, start your PostgreSQL server and connect to the database using psql.

For your case:

postgres=# select * from pgstattuple('employees_systems_access');
ERROR:  "employees_systems_access" (gin index) is not supported; I misprinted table_name with gindex_name.

this is true pgstattuple  support table and standard indexes 

I consider analyzing the table and index sizes directly,
If the dead_tuple_count is high relative to the tuple_count, this indicates bloat.
If free_space is a significant portion of table_len, the index may be bloated.

And used pgstattuple for tables and standard indexes.
pgstatginindex specifically for GIN indexes to get relevant statistics and detect bloat.

Then, if needed, use re-index or vaccum as per requirement.

 Make sure the pgstattuple extension is installed and properly configured in your PostgreSQL database before using these functions.  

I recently build the source code from 17 beta. And attaching the extension version 1.5 file as well.

 

 

On Fri, Jul 26, 2024 at 6:11PM Keith Fiske <keith.fiske@crunchydata.com> wrote:

 

 

On Fri, Jul 26, 2024 at 6:49AM khan Affan <bawag773@gmail.com> wrote:

The pgstattuple function provides the following useful columns:

  • table_len: The length of the table.
  • tuple_count: The number of tuples.
  • tuple_len: The length of the tuples.
  • dead_tuple_count: The number of dead tuples.
  • dead_tuple_len: The length of dead tuples.
  • free_space: The amount of free space.

Example

How you might use this function:

 
SELECT * FROM pgstattuple('gin_index_name');

Interpreting Results

  • If the dead_tuple_count is high relative to the tuple_count, this indicates bloat.
  • If free_space is a significant portion of table_len, the index may be bloated.

Using pg_trgm for Trigram Indexes

If you're using a GIN index with the pg_trgm extension (for trigram indexes), you might also consider analyzing the table and index sizes directly:

 
SELECT
  pg_size_pretty(pg_relation_size('table_name')) AS table_size,
  pg_size_pretty(pg_total_relation_size('table_name')) AS total_size,
  pg_size_pretty(pg_relation_size('gin_index_name')) AS index_size;

Vacuum and Reindex

If you determine that your GIN index is bloated, you can use VACUUM and REINDEX to reclaim space:

VACUUM ANALYZE your_table_name;
REINDEX INDEX your_gin_index_name;

By following these steps, you should be able to detect and address bloat in your PostgreSQL GIN indexes effectively.

 

 

 

Have you tried using that pgstattuple function with a GIN index?

postgres=# select * from pgstattuple('employees_systems_access');
ERROR:  "employees_systems_access" (gin index) is not supported 

 

There is a function specifically for them and it does not return the same info

 

postgres=# select * from pgstatginindex('employees_systems_access');
-[ RECORD 1 ]--+--
version        | 2
pending_pages  | 0
pending_tuples | 0

 

--

Keith Fiske
Senior Database Engineer
Crunchy Data - http://crunchydata.com

Attachment