Re: How often do I need to reindex tables? - Mailing list pgsql-general

From Bill Moran
Subject Re: How often do I need to reindex tables?
Date
Msg-id 20070419093314.434c2f08.wmoran@collaborativefusion.com
Whole thread Raw
In response to Re: How often do I need to reindex tables?  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-general
In response to Tom Lane <tgl@sss.pgh.pa.us>:

> Bill Moran <wmoran@collaborativefusion.com> writes:
> > Just an FYI ... I remembered what prompted the cron job.
>
> > We were seeing significant performance degradation.  I never did actual
> > measurements, but it was on the order of "Bill, why is restoring taking
> > such a long time?" from other systems people.  At the time, I poked around
> > and tried some stuff here and there and found that reindex restored
> > performance.  I didn't look at actual size at that time.
>
> A reindex might improve performance for reasons other than bloat --- to
> wit, that a freshly-built index is in perfect physical order, which
> tends to get degraded over time by page splits.  How important that is
> depends on your usage patterns.  If this is what the story is for your
> situation, then what might fix it (in 8.2) is to create the index with
> FILLFACTOR 50 or so, so that it's already at the steady state density
> and won't need many page splits.
>
> > Anyway, I'll report back in a few weeks as to what the numbers look like.
>
> Yeah, please for the moment just watch what happens with the default
> behavior.

Remember this discussion?

To recap, I had scheduled a weekly reindex of this database because I
was seeing performance issues otherwise.  In order to see if this was
actually helping, I disabled the redindex job, ran a few timing
experiments, then scheduled a job to email me the size of the indexes
in the database on a daily basis.

At this point, I have daily records of index size since March 6th.

The behaviour is like this:  A freshly created index is about 21,000
pages in size.  Under normal usage, the index size balloons to about
38,000 pages immediately after the first backup job is run.  From there
it grows slowly (but fairly consistently) by about 100 pages each day.
As of today, it is 44304 pages.

When I first brought up this discussion, the table contained 8068956
rows.  It now has 7451381, which means it's dropped by 7%

The important part is that I can't reproduce the performance problems
that I originally thought were the result of this.  It's entirely
possible that something else was changed since then that actually
fixed the problem, and that the index bloat was a red herring.

Not sure what (if any) conclusions can be drawn from this.  Is there
any other data I should gather?  Have I just proved my previous
rantings about the necessity of reindexing to be wrong?

--
Bill Moran
Collaborative Fusion Inc.
http://people.collaborativefusion.com/~wmoran/

wmoran@collaborativefusion.com
Phone: 412-422-3463x4023

pgsql-general by date:

Previous
From: Dave Page
Date:
Subject: Re: Auditing a database
Next
From: "Martin Gainty"
Date:
Subject: Re: How often do I need to reindex tables?