Re: Frequetly updated partial index leads to bloat on index for Postresql 11 - Mailing list pgsql-general

From Tom Dearman
Subject Re: Frequetly updated partial index leads to bloat on index for Postresql 11
Date
Msg-id 8563E084-A6B1-41E6-BCAB-B7D31E75C981@gmail.com
Whole thread Raw
In response to Re: Frequetly updated partial index leads to bloat on index for Postresql 11  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Frequetly updated partial index leads to bloat on index for Postresql 11  (Peter Geoghegan <pg@bowt.ie>)
List pgsql-general
Other indexes do bloat, but the percentage bloat is a lot less, presumably because this is a partial index where the
partialcolumn has a high degree of changes ie maybe 100 genuinely ‘live’ rows in a table of 300 million where every row
hasgone through a state where it would have been in the index.  In some of our partitions we might have 2000 old rows
thatdo hang around for a long time and another 100 or so ‘real’ partial index entries so 2200 in total but the number
ofrows would be 300 million so it is a lot less than 1%. 


> On 16 Jul 2021, at 16:43, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Tom Dearman <tom.dearman@gmail.com> writes:
>> We have change autovacuum so that it runs more frequently autovacuum_vacuum_scale_factor=0.035, the reason we have a
partialindex on the status is that in a table of 300 million entries, only about 100 or so would have
status=‘IN_PROGRESS’so we think this should be a nice small index and many of our queries want to look up with a where
clausestatus=‘IN_PROGRESS’.  In theory it works well, but we get a lot of index bloat as there is a lot of churn on the
statusvalue, ie each row starts as IN_PROGRESS and then goes to one of 4 possible completed statuses.  
>
> Is it really the case that only this index is bloating?  In principle, an
> update on a row of the table should result in new entries in every index
> of the table.  A partial index, due to the filter applied to possibly not
> store any index entry, should in theory have less bloat than other
> indexes.
>
> If that's not what you're seeing, there must be something about the data
> being stored in that index (not the partial-index filter condition) that
> results in a lot of low-occupancy index pages over time.  You didn't say
> anything about what the data payload is.  But we've seen bloat problems in
> indexes where, say, every tenth or hundredth value in the index ordering
> would persist for a long time while the ones in between get deleted
> quickly.  That leads to low-density indexes that VACUUM can't do anything
> about.
>
>             regards, tom lane




pgsql-general by date:

Previous
From: Francisco Olarte
Date:
Subject: Re: Frequetly updated partial index leads to bloat on index for Postresql 11
Next
From: David Gauthier
Date:
Subject: Getting "ERROR: upper bound of FOR loop cannot be null" when trying to use a stored proc in a check constraint def.