Re: Index bloat and REINDEX/VACUUM optimization for partial index - Mailing list pgsql-performance

From jayaprabhakar k
Subject Re: Index bloat and REINDEX/VACUUM optimization for partial index
Date
Msg-id CA+t=SiJGPasfKRxt8v=P5c4X+JuiWMjZcWWjy4p31CV8OhT+WQ@mail.gmail.com
Whole thread Raw
In response to Re: Index bloat and REINDEX/VACUUM optimization for partial index  (Peter Geoghegan <pg@bowt.ie>)
List pgsql-performance
Thanks Peter. It is 14.4, But on AWS RDS Aurora instance. I am trying to read the links you shared - B-Tree Deletion and deduplication, etc. I still don't fully understand what I need to do. In the BTree documentation,

The average and worst-case number of versions per logical row can be kept low purely through targeted incremental deletion passes. It's quite possible that the on-disk size of certain indexes will never increase by even one single page/block despite constant version churn from UPDATEs.

In our case, almost all the tuples stop being covered by the index as they fail the predicate, and only a tiny 1000s of rows pass the index predicate at any point in time. But, we still see the index size continue to increase, index lookups become slow over time, and  vacuum (non full) doesn't reduce the index size much.

Do we need to do anything specific to better utilize the targeted incremental deletion passes?


SELECT VERSION();
                                             version                                            
-------------------------------------------------------------------------------------------------
 PostgreSQL 14.4 on x86_64-pc-linux-gnu, compiled by x86_64-pc-linux-gnu-gcc (GCC) 7.4.0, 64-bit
(1 row)







On Mon, 28 Aug 2023 at 18:49, Peter Geoghegan <pg@bowt.ie> wrote:
On Mon, Aug 28, 2023 at 5:33 PM jayaprabhakar k <jayaprabhakar@gmail.com> wrote:
> REINDEX requires a full table scan
>
> Roughly create a new index, rename index, drop old index.
> REINDEX is not incremental. running reindex frequently does not reduce the future reindex time.

You didn't say which Postgres version you're on. Note that Postgres 14
can deal with index bloat a lot better than earlier versions could.
This is known to work well with partial indexes. See:

https://www.postgresql.org/message-id/flat/CAL9smLAjt9mZC2%3DqBeJwuNPq7KMAYGTWWQw_hvA-Lfo0b3ycow%40mail.gmail.com

--
Peter Geoghegan

pgsql-performance by date:

Previous
From: David Rowley
Date:
Subject: Re: Range partitioning query performance with date_trunc (vs timescaledb)
Next
From: Jeff Janes
Date:
Subject: Re: Queries containing ORDER BY and LIMIT started to work slowly