Thread: data directory growing huge

data directory growing huge

From: "Esger Abbink"
Hello,

It is quite possible that this is a well-described problem, but I have
not been able to find the solution.

On two production servers (7.2rc2) of ours, the data directory is growing
very large, while the data actually in the DBs 1. isn't that large and
2. isn't growing.

The databases see fairly limited/constant use at the moment. The data
they contain is close to real-time information, and as such they are
continuously updated with new data. The number of queries executed is
low or very low (this will change in the future).

To prevent the database from growing to an unmanageable size, it's being
emptied and vacuumed twice a day. This is only a "vacuum", btw, not a
"vacuum analyze". I did an analyze by hand to get the planner to use a
plan suited for tables of 100k+ rows, as before that performance was
horrid.
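
For reference, the maintenance amounts to roughly this (the table name
is made up):

    DELETE FROM realtime_data;  -- twice-daily "empty" step (made-up table name)
    VACUUM;                     -- plain VACUUM: marks dead rows reusable but
                                -- doesn't usually shrink the files on disk
    ANALYZE;                    -- the one-off statistics refresh mentioned above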

The problem is that the data directories have reached sizes of 4.1 and
4.5 GB respectively. Recreating the 4.1 GB DB from a pg_dump on a
different system yields a total of about 460 MB across ALL databases on
that system; the dump file itself is 10 MB. Clearly the 4 GB is a bit
wasteful...

In addition to the daily automatic vacuums, I have done a "vacuum full"
by hand (this took a few hours), but this has not resulted in any
reduction in file sizes.

The bulk of the space is occupied by a dozen huge files in base/16559,
all with suitably meaningless names: 72646, 72653.1, etc.

As vacuuming doesn't help, what's up? Could it be that there are lots of
stuck transactions keeping huge amounts of old & removed rows on disk
through MVCC, or something?

thanks for any help,

Esger

--
NeoMail - Webmail that doesn't suck... as much.
http://neomail.sourceforge.net

Re: data directory growing huge

From: Shridhar Daithankar
Esger Abbink wrote:

> The databases see fairly limited/constant use at the moment. The data
> they contain is close to real-time information, and as such they are
> continuously updated with new data. The number of queries executed is
> low or very low (this will change in the future).

No wonder the data is growing so fast. If it is updated in real time, it
is going to generate dead tuples like crazy.

> The problem is that the data directories have reached sizes of 4.1 and
> 4.5 GB respectively. Recreating the 4.1 GB DB from a pg_dump on a
> different system yields a total of about 460 MB across ALL databases on
> that system; the dump file itself is 10 MB. Clearly the 4 GB is a bit
> wasteful...
>
> In addition to the daily automatic vacuums, I have done a "vacuum full"
> by hand (this took a few hours), but this has not resulted in any
> reduction in file sizes.

Hmm... VACUUM FULL should have done the trick.

> The bulk of the space is occupied by a dozen huge files in base/16559,
> all with suitably meaningless names: 72646, 72653.1, etc.

You can find the corresponding table names by querying pg_class.
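
For example, using one of the filenode numbers from your post:

    -- map an on-disk file name back to a relation
    -- (relkind 'r' = table, 'i' = index)
    SELECT relname, relkind
      FROM pg_class
     WHERE relfilenode = 72646;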

> As vacuuming doesn't help, what's up? Could it be that there are lots of
> stuck transactions keeping huge amounts of old & removed rows on disk
> through MVCC, or something?

If you can take down the database for some time, you can dump the entire
database using pg_dump, drop the database, and recreate it from the dump.
Surely that would take a lot less than a few hours.
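
Roughly like this, with a made-up database name (pg_dump and psql run
from the shell; the DROP/CREATE must be issued while connected to another
database such as template1):

    -- shell: pg_dump mydb > mydb.dump
    DROP DATABASE mydb;
    CREATE DATABASE mydb;
    -- shell: psql mydb < mydb.dump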

And I think you desperately need pg_autovacuum, a daemon that vacuums in
real time according to need. It's in CVS HEAD and works with 7.3/7.4. Up
your max FSM settings appropriately too; that should help keep your
database size in check.
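
Something like this shows where those FSM settings stand now (both live
in postgresql.conf and take effect only after a restart):

    SHOW max_fsm_pages;      -- pages of free space tracked cluster-wide
    SHOW max_fsm_relations;  -- number of relations the free space map covers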

  HTH

  Shridhar



Re: data directory growing huge

From: Doug McNaught
"Esger Abbink" <pggeneral@bumblebeast.com> writes:

> On two production servers (7.2rc2) of ours, the data directory is growing
> very large, while the data actually in the DBs 1. isn't that large and
> 2. isn't growing.

You're running a release candidate of 7.2?  That's pretty old and has
known bugs...

Your problem, however, is probably index bloat, which is not fixed by
VACUUM in the version you're running.  If you REINDEX, you should see
storage usage go back to normal.
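
A minimal sketch, with a hypothetical table name (note that REINDEX holds
an exclusive lock on the table while it runs):

    REINDEX TABLE realtime_data;        -- rebuilds all indexes on the table
    -- or one index at a time:
    REINDEX INDEX realtime_data_pkey;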

The upcoming 7.4 should suffer much less from the index bloat problem.

-Doug

Re: data directory growing huge

From: Tom Lane
"Esger Abbink" <pggeneral@bumblebeast.com> writes:
> The bulk of the space is occupied by a dozen huge files in base/16559,
> all with suitably meaningless names: 72646, 72653.1, etc.

You could find out what they are by consulting pg_class.relfilenode
(or see the contrib/oid2name utility).  But my bet is that they are
indexes and you need to REINDEX them to recover the space.
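
For instance, listing the biggest relations by page count (relpages is
in 8 kB pages and is only as fresh as the last VACUUM/ANALYZE):

    SELECT relname, relkind, relpages
      FROM pg_class
     ORDER BY relpages DESC
     LIMIT 10;   -- relkind 'i' rows at the top would confirm index bloat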

> On two production servers (7.2rc2) of ours

[ blanch ]  You're running a production server on 7.2rc2?  Still?
Get thee to 7.2.4, at least.  We do not make dot-releases for idle
amusement.

            regards, tom lane

Re: data directory growing huge

From: "Esger Abbink"
Hi,

thanks for the tip(s); REINDEX did the trick wonderfully.


>
> > On two production servers (7.2rc2) of ours
>
> [ blanch ]  You're running a production server on 7.2rc2?  Still?
> Get thee to 7.2.4, at least.  We do not make dot-releases for idle
> amusement.
>

Not sure what "blanch" means, but I can imagine ;)

It's not so bad, really. Although these are production servers, they don't
actually *DO* that much, in the sense that anyone is actually looking at
the data they hold :) So it's not really a priority for us, and they have
been giving us no grief at all, except for this.

The usage will change soon though, and at that point I will probably be
upgrading them to the latest stable version.

It certainly looks like you guys haven't been idling while I was away not
reading this list :)

thanks,

Esger

--
NeoMail - Webmail that doesn't suck... as much.
http://neomail.sourceforge.net