Maintenance question / DB size anomaly... - Mailing list pgsql-performance

From Kurt Overberg
Subject Maintenance question / DB size anomaly...
Msg-id A6D35557-74A3-47CF-B426-143C8CE01A13@hotdogrecords.com
Responses Re: Maintenance question / DB size anomaly...  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: Maintenance question / DB size anomaly...  (Richard Huxton <dev@archonet.com>)
List pgsql-performance
Gang,

Hoping you all can help me with a rather bizarre issue that I've run
across.  I don't really need a solution; I think I have one, but I'd
like to run it by everyone in case I'm headed in the wrong direction.

I'm running a small Slony (v1.1.5)/postgresql 8.0.4 cluster (on
RedHat) that contains one master database, and two slaves.  The db1
(the master) has been up for about 1.5 years, db2 (slave 1) for about
9 months, and db3 (second slave) for about two months.  I do a VACUUM
ANALYZE every morning on all three databases.  However, the vacuum on
db1 takes approximately 4.5 hours, and on the slaves it takes about
1/2 hour.  As far as I can tell, my FSM settings are correct.  This
is concerning because the vacuum on db1 is starting to run into
production hours.  The master receives all inserts, updates and
deletes (as well as a fair number of selects).  The slaves are select-
only.
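
For reference, this is how I've been sanity-checking the FSM
settings (these are the 8.0-era GUCs; a database-wide VACUUM VERBOSE
reports at the end how many free-space-map page slots are actually
needed, which can be compared against the configured limits):

```sql
-- Current FSM limits on this server (8.0-era settings):
SHOW max_fsm_pages;
SHOW max_fsm_relations;

-- A database-wide VACUUM VERBOSE prints the free space map usage
-- at the end of its output; compare that against the values above.
VACUUM VERBOSE;
```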

In my investigation of this anomaly, I noticed that the data/ dir on
db1 (the master) is around 60 Gigs.  The data directory on the slaves
is around 25Gb.  After about 3 months of head scratching, someone on
the irc channel suggested that it may be due to index bloat.
However, from the research I've done, it seems those problems were
largely resolved around 7.4, and index bloat alone wouldn't account
for one database being 2.5x bigger.  Another unknown is the Slony
overhead (both in size and vacuum times).
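
One way I've been trying to see where the extra ~35GB lives is to
compare the on-disk size of the biggest relations on db1 against the
slaves.  8.0 doesn't have pg_total_relation_size() (that arrived in
8.1), but pg_class.relpages, which VACUUM/ANALYZE keeps reasonably
current, gives a decent approximation:

```sql
-- Top 20 relations (tables and indexes) by on-disk size, estimated
-- from relpages (8 kB pages).  relpages is only as fresh as the
-- last VACUUM or ANALYZE on each relation.
SELECT c.relname,
       c.relkind,                          -- 'r' = table, 'i' = index
       c.relpages,
       c.relpages::bigint * 8 / 1024 AS approx_mb
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE n.nspname NOT IN ('pg_catalog', 'information_schema')
ORDER BY c.relpages DESC
LIMIT 20;
```

Running this on db1 and on a slave and diffing the results should
show whether the extra space is in tables, in indexes, or spread
evenly.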

The ONLY thing I can think of is that I DROPped a large number of
tables from db1 a few months ago (they weren't getting replicated).
This is on the order of 1700+ fairly largeish (50,000+ row) tables.
I do not remember doing a vacuum full after dropping them, so perhaps
that's my problem.  I'm planning on doing some maintenance this
weekend, during which I will take the whole system down, then on db1,
run a VACUUM FULL ANALYZE on the whole database, then a REINDEX on my
very large tables.  I may drop and recreate the indexes on my big
tables, as I hear that may be faster than a REINDEX.  I will probably
run a VACUUM FULL ANALYZE on the slaves as well.
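
Roughly, this is what I have in mind for the weekend (the table and
index names here are made up; the real ones would be my big tables):

```sql
-- Reclaim space across the whole database.  VACUUM FULL takes an
-- exclusive lock on each table as it processes it, so nothing else
-- (including Slony) should be touching the tables while it runs.
VACUUM FULL ANALYZE;

-- Dropping and recreating an index can be faster than REINDEX,
-- since it builds the index from scratch.  Note that on 8.0,
-- CREATE INDEX blocks writes to the table while it runs
-- (CONCURRENTLY didn't appear until 8.2).
DROP INDEX mytable_idx;                  -- hypothetical index name
CREATE INDEX mytable_idx ON mytable (some_column);

-- Alternatively, per-table:
REINDEX TABLE mytable;                   -- hypothetical table name
```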

Thoughts?  Suggestions?  Anyone think this will actually help my
problem of size and vacuum times?  Do I need to take Slony down while
I do this?  Will the VACUUM FULL table locking interfere with Slony?

Thanks for any light you all can shed on these issues...

/kurt

