Thread: Reindex doesn't eliminate bloat

Reindex doesn't eliminate bloat

From: Ron Johnson

v8.4.12

According to this (https://pastebin.com/TJB32n5M) query, which I thought I 
got from https://wiki.postgresql.org/wiki/Index_Maintenance, a list of 
indexes and their bloat is generated.

After reindexing a table with a large amount of reported bloat (column 
bloat_pct says 29%), re-running the query shows no change in the amount of 
bloat.  This is a historical table, and VACUUM VERBOSE shows that there's 
nothing to free up.

Is this something that I must live with, or am I misinterpreting the query?

Thanks,

-- 
Angular momentum makes the world go 'round.


Re: Reindex doesn't eliminate bloat

From: Nikolay Samokhvalov

On Tue, Mar 13, 2018 at 1:05 AM, Ron Johnson <ron.l.johnson@cox.net> wrote:
> v8.4.12

This is a *very* old version that has not been supported by the community for many years. Check https://www.postgresql.org/ to see the currently supported versions.
You need to upgrade it.

Re: Reindex doesn't eliminate bloat

From: Ron Johnson

On 03/12/2018 05:20 PM, Nikolay Samokhvalov wrote:
> On Tue, Mar 13, 2018 at 1:05 AM, Ron Johnson <ron.l.johnson@cox.net> wrote:
>> v8.4.12
>
> This is a *very* old version that has not been supported by the community for many years. Check https://www.postgresql.org/ to see the currently supported versions.
> You need to upgrade it.

Don't even think I'm in control of when -- or even if -- the customer decides to upgrade.

That being the case, do you have an answer to the question?


--
Angular momentum makes the world go 'round.

Re: Reindex doesn't eliminate bloat

From: Adrian Klaver

On 03/12/2018 03:05 PM, Ron Johnson wrote:
> v8.4.12
> 
> According to this (https://pastebin.com/TJB32n5M) query, which I thought 
> I got from https://wiki.postgresql.org/wiki/Index_Maintenance, a list of 
> indexes and their bloat is generated.
> 
> After reindexing a table with a large amount of reported bloat (column 
> bloat_pct says 29%), re-running the query shows no change in the amount 

First I am not seeing a column bloat_pct in the query you linked to, so 
are you sure that is the actual query you used?

> of bloat.  This is a historical table, and VACUUM VERBOSE shows that 
> there's nothing to free up.
> 
> Is this something that I must live with, or am I misinterpreting the query?

Honestly, I have not worked my way through the query you show in depth, 
though I did notice it uses pg_stats. What happens if you run ANALYZE 
(https://www.postgresql.org/docs/8.4/static/sql-analyze.html) to update 
the stats?

> 
> Thanks,
> 


-- 
Adrian Klaver
adrian.klaver@aklaver.com


Re: Reindex doesn't eliminate bloat

From: Ron Johnson

On 03/12/2018 05:55 PM, Adrian Klaver wrote:
> On 03/12/2018 03:05 PM, Ron Johnson wrote:
>> v8.4.12
>>
>> According to this (https://pastebin.com/TJB32n5M) query, which I thought 
>> I got from https://wiki.postgresql.org/wiki/Index_Maintenance, a list of 
>> indexes and their bloat is generated.
>>
>> After reindexing a table with a large amount of reported bloat (column 
>> bloat_pct says 29%), re-running the query shows no change in the amount 
>
> First I am not seeing a column bloat_pct in the query you linked to, so 
> are you sure that is the actual query you used?

Sorry.  bloat_pct is renamed bloat_ratio.

>
>> of bloat.  This is a historical table, and VACUUM VERBOSE shows that 
>> there's nothing to free up.
>>
>> Is this something that I must live with, or am I misinterpreting the query?
>
> Honestly, I have not worked my way through the query you show in depth, 
> though I did notice it uses pg_stats. What happens if you run ANALYZE 
> (https://www.postgresql.org/docs/8.4/static/sql-analyze.html) to update 
> the stats?


I did ANALYZE VERBOSE on the underlying table.  No change.


-- 
Angular momentum makes the world go 'round.


Re: Reindex doesn't eliminate bloat

From: Nikolay Samokhvalov

On Tue, Mar 13, 2018 at 1:28 AM, Ron Johnson <ron.l.johnson@cox.net> wrote:
> On 03/12/2018 05:20 PM, Nikolay Samokhvalov wrote:
>> On Tue, Mar 13, 2018 at 1:05 AM, Ron Johnson <ron.l.johnson@cox.net> wrote:
>>> v8.4.12
>>
>> This is a *very* old version that has not been supported by the community for many years. Check https://www.postgresql.org/ to see the currently supported versions.
>> You need to upgrade it.
>
> Don't even think I'm in control of when -- or even if -- the customer decides to upgrade.
>
> That being the case, do you have an answer to the question?

Those wiki queries for table and index bloat are estimates only. In many cases they show very wrong results. A better (yet not ideal) approach is to use the pgstattuple extension (though I'm not sure it existed back in 2009).

Can you provide the table and index definitions and, if you can, some sample data?
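
As a rough illustration of the pgstattuple route: the module's pgstatindex() function reports avg_leaf_density for a B-tree index, which is measured from the index pages rather than estimated from pg_stats. The Python sketch below only builds the query and does the arithmetic on one returned value. The helper names (pgstatindex_query, free_space_pct, excess_over_fillfactor) are made up for this example, the 90 default is the standard B-tree fillfactor, and it assumes the module has already been installed from contrib on the 8.4 server.

```python
# Hypothetical helpers for illustration. Only pgstatindex() and its
# leaf_pages / avg_leaf_density / leaf_fragmentation columns come from
# the pgstattuple module itself.


def pgstatindex_query(index_name):
    """Return (sql, params) for a DB-API driver that uses %s placeholders."""
    sql = (
        "SELECT leaf_pages, avg_leaf_density, leaf_fragmentation "
        "FROM pgstatindex(%s)"
    )
    return sql, (index_name,)


def free_space_pct(avg_leaf_density):
    """Percentage of leaf-page space that is not occupied by index entries."""
    return 100.0 - avg_leaf_density


def excess_over_fillfactor(avg_leaf_density, fillfactor=90):
    """Free space beyond what the fillfactor deliberately leaves empty.

    A freshly built index is expected to sit near the fillfactor, so only
    the shortfall below it is treated as candidate bloat in this sketch.
    """
    return max(0.0, fillfactor - avg_leaf_density)


if __name__ == "__main__":
    sql, params = pgstatindex_query(
        "idx_item_mapping_rp7_y2016m03itemmapping_custom_userfield_801"
    )
    print(sql, params)
    # Made-up density value, purely to show the arithmetic.
    print(free_space_pct(88.5), excess_over_fillfactor(88.5))
```

If the measured density after a REINDEX is close to the fillfactor while the wiki query still reports a large bloat ratio, that would point at the estimate rather than the index.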

Re: Reindex doesn't eliminate bloat

From: Ron Johnson

On 03/12/2018 10:48 PM, Nikolay Samokhvalov wrote:
> On Tue, Mar 13, 2018 at 1:28 AM, Ron Johnson <ron.l.johnson@cox.net> wrote:
>> On 03/12/2018 05:20 PM, Nikolay Samokhvalov wrote:
>>> On Tue, Mar 13, 2018 at 1:05 AM, Ron Johnson <ron.l.johnson@cox.net> wrote:
>>>> v8.4.12
>>>
>>> This is a *very* old version that has not been supported by the community for many years. Check https://www.postgresql.org/ to see the currently supported versions.
>>> You need to upgrade it.
>>
>> Don't even think I'm in control of when -- or even if -- the customer decides to upgrade.
>>
>> That being the case, do you have an answer to the question?
>
> Those wiki queries for table and index bloat are estimates only. In many cases they show very wrong results. A better (yet not ideal) approach is to use the pgstattuple extension (though I'm not sure it existed back in 2009).
>
> Can you provide the table and index definitions and, if you can, some sample data?

Sadly, no sample data.  (It's all PCI controlled.)

Index idx_item_mapping_rp7_y2016m03itemmapping_custom_userfield_801 has 40% bloat.

Thanks.

--
Angular momentum makes the world go 'round.

Re: Reindex doesn't eliminate bloat

From: Joe Conway

On 03/12/2018 09:16 PM, Ron Johnson wrote:
> On 03/12/2018 10:48 PM, Nikolay Samokhvalov wrote:
>> Those wiki queries for table and index bloat are estimates only. In
>> many cases they show very wrong results. A better (yet not ideal)
>> approach is to use the pgstattuple extension (though I'm not sure it
>> existed back in 2009).
>>
>> Can you provide the table and index definitions and, if you can, some
>> sample data?
>
> Sadly, no sample data.  (It's all PCI controlled.)
>
> Index idx_item_mapping_rp7_y2016m03itemmapping_custom_userfield_801 has
> 40% bloat.

Assuming the data in the indexed column(s) is not highly correlated with
the physical table order (i.e. it is roughly random), about 50% density
is theoretically expected. In fact, in some empirical testing, I have
seen a long term steady state value of closer to 44% if I remember
correctly (but perhaps that was related to the way I was testing). For a
discussion on why this is the case, see for example:


https://www.postgresql.org/message-id/flat/87oa4xmss7.fsf%40news-spur.riddles.org.uk#87oa4xmss7.fsf@news-spur.riddles.org.uk

So what is being reported as 40% bloat is probably not really bloat.

HTH,

Joe

--
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development



Re: Reindex doesn't eliminate bloat

From: Ron Johnson

On 03/13/2018 06:10 PM, Joe Conway wrote:
> On 03/12/2018 09:16 PM, Ron Johnson wrote:
>> On 03/12/2018 10:48 PM, Nikolay Samokhvalov wrote:
>>> Those wiki queries for table and index bloat are estimates only. In
>>> many cases they show very wrong results. A better (yet not ideal)
>>> approach is to use the pgstattuple extension (though I'm not sure it
>>> existed back in 2009).
>>>
>>> Can you provide the table and index definitions and, if you can, some
>>> sample data?
>> Sadly, no sample data.  (It's all PCI controlled.)
>>
>> Index idx_item_mapping_rp7_y2016m03itemmapping_custom_userfield_801 has
>> 40% bloat.
> Assuming the data in the indexed column(s) is not highly correlated with
> the physical table order (i.e. it is roughly random), about 50% density
> is theoretically expected.

What does physical table order have to do with b-tree organization, 
especially in a freshly reindexed table using the default 90% fill factor?

>   In fact, in some empirical testing, I have
> seen a long term steady state value of closer to 44% if I remember
> correctly (but perhaps that was related to the way I was testing). For a
> discussion on why this is the case, see for example:
>
>
> https://www.postgresql.org/message-id/flat/87oa4xmss7.fsf%40news-spur.riddles.org.uk#87oa4xmss7.fsf@news-spur.riddles.org.uk
>
> So what is being reported as 40% bloat is probably not really bloat.


-- 
Angular momentum makes the world go 'round.