Postgres performance slowly gets worse over a month - Mailing list pgsql-admin

Back on July 23, I posted on our performance problem.  At that time, I
got several suggestions about what to do to try to fix it.  Well, it's
happening again...

To recap, we have a web based application that utilizes a postgres
backend for handling orders and scheduling.  Gradually, the average load
on the system climbs until we dump the database, drop it and reload it
from the dumps.  This occurs slowly (weeks, rather than hours) and will
eventually get so bad that no work can get done.  Another point is that
the progression of performance appears to be geometric, rather than
linear.  I also notice that the size of the data directory starts at
about 4.5 Gig and climbs throughout this process.  It's currently at
6.3Gig.

We do a full vacuum every night.  We have adjusted max_fsm_pages to
1000000 and max_fsm_relations to 10000 (we were seeing deleted tuples in
the 50K range on some of our tables).  We are using ADODB 2.12 in PHP on
Apache 1.3.26 to access the database.  We've tried rebuilding all of the
indexes and that didn't help.  We're going to try it again because we
wound up corrupting our indexes in the system tables while trying to
drop and recreate a table that got partially created before a system
crash.  We suspect that we may have had other problems at the time.  We
have a contractor that did the reindex for us so I'm not sure of the
process at this point.  I suspect that it's something like;
1. shut down postgres
2. run 'postgres -O -P' to start a single user instantiation of the
engine
3. type 'reindex' to get it to do it
4. exit postgres and restart the DB with pg_ctl

Does that sound about right?

Any other suggestions?  We're looking to do something soon before the
load gets out of hand.  We have done full DB dumps and restores to fix
this in the past and it takes 4-5 hours in the middle of the night since
we can't take the system down during the day 'cuz we have about 1100
people sitting on their hands when we do it.  One of the suggestions was
to use 'pgmonitor' to keep watch over what's happening but transaction
go through too quickly and nothing seems to tie the system up for any
period of time.  We also notice that while the load keeps increasing,
the actual CPU time is very small.  We'll see loads above 3 with each
CPU (we have two) sitting with 80%+ idle time.

As a recap, this is happening on a Compaq Proliant 3500 system with a
five disk raid5 in hardware.

--
Robert M. Meyer
Sr. Network Administrator
DigiVision Satellite Services
14 Lafayette Sq, Ste 410
Buffalo, NY 14203-1904
(716)332-1451


pgsql-admin by date:

Previous
From: Andrew Sullivan
Date:
Subject: Re: Problems with migration
Next
From: Tim Ellis
Date:
Subject: Re: Postgres performance slowly gets worse over a month