Upgrade Woes - Mailing list pgsql-performance

From aturner@neteconomist.com
Subject Upgrade Woes
Date
Msg-id 20030910175340.GC5506@neteconomist.com
Responses Re: Upgrade Woes
List pgsql-performance
Hi,

My name is Alex Turner and I work for a small Tech company in Pottstown PA.  We run Postgresql on a number of systems
for a variety of different applications, and it has been a joy to deal with all around, working fast and reliably for
over 2 years.

We recently upgraded from RedHat 7.2 to RedHat 9.0, and we are running Postgres 7.3.2 on our Proliant ML370 (Raid 1
2x18 10k, and Raid 5 3x36 10k, 2x866 PIII, 2GB RAM).

We seem to have had a serious performance drop after the upgrade.  The database is a database of properties that is
updated on a daily basis, and when I say updated I mean that I insert/update the whole data download because the data
provider doesn't tell us what changed, just gives us a complete dump.  The integrity of the dump isn't great so I can't
process it as a COPY or a block transaction because some of the data is often bad.  Each and every row is a separate
insert or update.
 
Data insert performance used to degrade in a linear fashion as time progressed, I'm guessing as the transaction logs
filled up.  About once every six weeks I would dump the database, destroy and recreate the db and reload the dump.
This 'reset' the whole thing, and brought insert/vacuum times back down.  Since the upgrade, performance has degraded
very rapidly over the first week, and then more slowly later, but enough that we now have to reload the db every 2-3
weeks.  The insert procedure triggers a stored procedure that updates a timestamp on the record so that we can figure
out which records have been touched and which have not, so that we can determine which properties have been removed
from the feed, as the record was not touched in the last two days.
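The per-row insert/update with a touched timestamp can be sketched like this. This is a minimal, hypothetical illustration using Python's sqlite3 in place of PostgreSQL; the table, column names, and values are all invented:

```python
import sqlite3
from datetime import datetime, timedelta

# Stand-in for the PostgreSQL database; the schema here is hypothetical.
db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE properties (
    property_id INTEGER PRIMARY KEY,
    price       INTEGER,
    last_seen   TEXT      -- touched on every insert/update, like the trigger
)""")

def load_feed(rows, now):
    """Insert/update each row separately so one bad row can't abort the rest."""
    for property_id, price in rows:
        try:
            cur = db.execute(
                "UPDATE properties SET price = ?, last_seen = ? WHERE property_id = ?",
                (price, now.isoformat(), property_id))
            if cur.rowcount == 0:   # not present yet: insert instead
                db.execute(
                    "INSERT INTO properties (property_id, price, last_seen) VALUES (?, ?, ?)",
                    (property_id, price, now.isoformat()))
            db.commit()
        except sqlite3.Error:
            db.rollback()           # skip the bad row, keep going

def stale(now):
    """Properties not touched in the last two days were dropped from the feed."""
    cutoff = (now - timedelta(days=2)).isoformat()
    return [r[0] for r in db.execute(
        "SELECT property_id FROM properties WHERE last_seen < ?", (cutoff,))]

load_feed([(1, 100000), (2, 150000)], datetime(2003, 9, 8))
load_feed([(1, 105000)], datetime(2003, 9, 11))   # property 2 missing from feed
print(stale(datetime(2003, 9, 11)))               # → [2]
```

The cost of this pattern is one transaction per row, which is exactly why the bulk load is so much slower than a single COPY would be.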

I have noticed that whilst inserts seem to be slower than before, the vacuum full doesn't seem to take as long overall.

postgresql.conf is pretty virgin, and we run postmaster with -B512 -N256 -i.  /var/lib/pgsql/data is a symlink to
/eda/data, /eda being the mount point for the Raid 5 array.
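For reference, those postmaster switches map onto postgresql.conf settings; a sketch of what an equivalent 7.3-era config could look like, with a few commonly raised knobs (the specific numbers are untested suggestions, not values from this box):

```
# Equivalent of -B512 -N256 -i (7.3 syntax)
tcpip_socket = true
max_connections = 256
shared_buffers = 512          # 8KB pages, i.e. 4MB; with 2GB RAM this could go much higher

# Possible knobs for a heavy nightly insert/vacuum load (assumed values)
sort_mem = 8192               # KB per sort operation
vacuum_mem = 32768            # KB, lets VACUUM remember more dead tuples per pass
checkpoint_segments = 8       # fewer checkpoint pauses during the bulk load
effective_cache_size = 65536  # 8KB pages, roughly 512MB of OS cache
```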

the database isn't huge, storing about 30000 properties, and the largest table is 2.1 million rows for property
features.  The dump file is only 221MB.  Alas, I did not design the schema, but I have made several 'tweaks' to it to
greatly improve read performance, allowing us to be the fastest provider in the Tristate area.  Unfortunately the job
starts at 01:05 (that's the earliest the dump is available) and runs until completion, finishing with a vacuum full.
The vacuum full locks areas of the database long enough that our service is temporarily down.  At the worst point, the
vacuum full was starting after 09:00, which our customers didn't appreciate.

I'm wondering if there is anything I can do with postgres to alleviate this problem.  Either upgrading to 7.3.4
(although I tried 7.3.3 for another app, and we had to roll back to 7.3.2 because of performance problems), or working
with the postgresql.conf to enhance performance.  I really don't want to roll back the OS version if possible, but I'm
not ruling it out at this point, as that seems to be the biggest thing that has changed.  All the drive lights are
showing green, so I don't believe the array is running in degraded mode.  I keep logs of all the insert jobs, and
plotting average insert times on a graph revealed that this started at the time of the upgrade.
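Extracting that trend from the job logs only takes a few lines; a sketch with an invented one-line-per-job log format (the real log format here is unknown):

```python
from datetime import datetime

# Hypothetical log format: "YYYY-MM-DD job_seconds=N", one line per nightly job.
log_lines = [
    "2003-08-01 job_seconds=3100",
    "2003-08-15 job_seconds=3400",
    "2003-09-01 job_seconds=5200",
    "2003-09-10 job_seconds=7900",
]

def job_durations(lines):
    """Parse each line into a (date, seconds) pair for plotting or averaging."""
    out = []
    for line in lines:
        date_s, kv = line.split()
        out.append((datetime.strptime(date_s, "%Y-%m-%d"),
                    int(kv.split("=")[1])))
    return out

def average(lines, since):
    """Mean job duration for runs on or after the given date."""
    vals = [secs for day, secs in job_durations(lines) if day >= since]
    return sum(vals) / len(vals)

print(average(log_lines, datetime(2003, 9, 1)))   # → 6550.0
```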

Any help/suggestions would be greatly appreciated,

Thanks,

Alex Turner
NetEconomist

P.S. Sorry this is so long, but I wanted to include as much info as possible.
