Thread: Indexes and Tables: Growth and Treatment (Modified by Thomas F. O'Connell)

Indexes and Tables: Growth and Treatment (Modified by Thomas F. O'Connell)

From
Thomas F.O'Connell
Date:
I'm helping manage a postgres installation that continually consumes a
considerable amount of disk space, and I'm hoping to learn a bit more
about both treating the symptoms and addressing the causes.

Here are the basics:

It's a pg 7.4.1 installation on a Debian stable GNU/Linux 2.6.2 box
with 4GB RAM with 4 2.4 GHz processors and 36 GB of disk space.

There are thousands of tables, many of which are object-relational
(I.e., many are subclasses of sets of top-level tables). There are
indexes in place for joins that apply to many of the columns in the
subclassed tables.

It's a high turnover database, in that the applications that use it
perform thousands of inserts, updates, and deletes on a daily basis.

We're seeing about 5-10 GB of increased disk space used on a daily
basis if a vacuum (full) or reindexdb is not performed. We were doing
one vacuum analyze full a week with nightly vacuum analyzes. We began
manually reindexing the worst offenders once we passed 50% disk usage
regularly.

So here are my questions:

1. Is adding reindexdb to cron to reindex the entire database nightly
overkill?

2. If we turn on pg_autovacuum and leave in place one weekly vacuum
full, is that a reasonable strategy?

3. Otherwise, is it better in general to vacuum prior to reindexing?

4. What are the best places to look for causes of the velocity of
growth?

Thanks!

-tfo


"Thomas F.O'Connell" <tfo@sitening.com> writes:
> It's a high turnover database, in that the applications that use it
> perform thousands of inserts, updates, and deletes on a daily basis.

> We're seeing about 5-10 GB of increased disk space used on a daily
> basis if a vacuum (full) or reindexdb is not performed. We were doing
> one vacuum analyze full a week with nightly vacuum analyzes.

Try hourly vacuums.  If that doesn't stem the tide, make it more often
(or try autovacuum).  Also make sure that your FSM settings are large
enough; if they're not then no amount of plain vacuuming will keep you
out of trouble.

With sufficiently frequent plain vacuums you really shouldn't need
vacuum full at all.

I can't recommend an analyze frequency on what you've told us.

            regards, tom lane

Re: Indexes and Tables: Growth and Treatment

From
Thomas F.O'Connell
Date:
On Jul 13, 2004, at 6:58 PM, Tom Lane wrote:

> Try hourly vacuums.  If that doesn't stem the tide, make it more often
> (or try autovacuum).

I will try autovacuum.

> Also make sure that your FSM settings are large
> enough; if they're not then no amount of plain vacuuming will keep you
> out of trouble.

I was just reading up on FSM settings today. In fact, here's the output
of a recent VACUUM VERBOSE:

INFO:  free space map: 1000 relations, 11599 pages stored; 100064 total
pages needed
DETAIL:  Allocated FSM size: 1000 relations + 20000 pages = 178 kB
shared memory.

So clearly we need to increase max_fsm_pages. How is this related to
vacuuming? And is it related at all to index growth?

> With sufficiently frequent plain vacuums you really shouldn't need
> vacuum full at all.

So is the only benefit to that the extreme optimizations of disk space
it undertakes? Is there any point at which the extra compacting
actually results in a performance enhancement?

> I can't recommend an analyze frequency on what you've told us.

What more information would you need to make a recommendation?

Thanks for all the tips!

-tfo


Re: Indexes and Tables: Growth and Treatment

From
Thomas F.O'Connell
Date:
Tom,

If I've got the RAM, should I have max_fsm_relations be large enough to
cover _all_ user tables and indexes?

Thanks!

-tfo

On Jul 13, 2004, at 6:58 PM, Tom Lane wrote:

> Try hourly vacuums.  If that doesn't stem the tide, make it more often
> (or try autovacuum).  Also make sure that your FSM settings are large
> enough; if they're not then no amount of plain vacuuming will keep you
> out of trouble.