Re: Performance degradation, index bloat and planner estimates - Mailing list pgsql-performance

From Craig Ringer
Subject Re: Performance degradation, index bloat and planner estimates
Date
Msg-id 4C998184.2070402@postnewspapers.com.au
Whole thread Raw
In response to Performance degradation, index bloat and planner estimates  (Daniele Varrazzo <daniele.varrazzo@gmail.com>)
Responses Re: Performance degradation, index bloat and planner estimates  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-performance
On 20/09/2010 7:59 PM, Daniele Varrazzo wrote:

> Does anybody have some information about where the bloat is coming
> from and what is the best way to get rid of it? Would a vacuum full
> fix this kind of problem? Is there a way to fix it without taking the
> system offline?

It's hard to know where the index bloat comes from. The usual cause I
see reported here is with regular VACUUM FULL use, which doesn't seem to
be a factor in your case.

A VACUUM FULL will not address index bloat; it's more likely to add to
it. You'd want to use CLUSTER instead, but that'll still require an
exclusive lock that takes the table offline for some time. Your current
solution - a concurrent reindex - is your best bet for a workaround
until you find out what's causing the bloat.

If the bloat issue were with relations rather than indexes I'd suspect
free space map issues as you're on 8.3.

http://www.postgresql.org/docs/8.3/interactive/runtime-config-resource.html

My (poor) understanding is that index-only bloat probably won't be an
FSM issue.

> The indexed condition is a state of the evolution of the records in
> the table: many records assume that state for some time, then move to
> a different state no more indexed. Is the continuous addition/deletion
> of records to the index causing the bloat (which can be then
> considered limited to the indexes with a similar usage pattern)?

Personally I don't know enough to answer that. I would've expected that
proper VACUUMing would address any resulting index bloat, but

> Any idea of where the 20M record estimate is coming from? Isn't the
> size of the partial index taken into account in the estimate?

I'd really help to have EXPLAIN ANALYZE output here.

--
Craig Ringer

Tech-related writing at http://soapyfrogs.blogspot.com/

pgsql-performance by date:

Previous
From: "Joshua D. Drake"
Date:
Subject: Re: Query much faster with enable_seqscan=0
Next
From: Tom Lane
Date:
Subject: Re: Performance degradation, index bloat and planner estimates