Re: Catalog bloat (again) - Mailing list pgsql-general

From Bill Moran
Subject Re: Catalog bloat (again)
Date
Msg-id 20160127181339.3baa63511b7c82b649b3b7b1@potentialtech.com
Whole thread Raw
In response to Catalog bloat (again)  (Ivan Voras <ivoras@gmail.com>)
Responses Re: Catalog bloat (again)  (Ivan Voras <ivoras@gmail.com>)
Re: Catalog bloat (again)  (Ivan Voras <ivoras@gmail.com>)
List pgsql-general
On Wed, 27 Jan 2016 23:54:37 +0100
Ivan Voras <ivoras@gmail.com> wrote:
>
> I've done my Googling, and it looks like this is a fairly common problem.
> In my case, there's a collection of hundreds of databases (10 GB+) with
> apps which are pretty much designed (a long time ago) with heavy use of
> temp tables - so a non-trivial system.
>
> The databases are vacuumed (not-full) daily, from cron

Vacuuming once a day is seldom often enough, except on very quiet
databases.

> (autovacuum was
> turned off some time ago for performance reasons), and still their size
> increases unexpectedly. By using some of the queries floating around on the
> wiki and stackoverflow[*], I've discovered that the bloat is not, as was
> assumed, in the user tables, but in the system tables, mostly in
> pg_attributes and pg_class.

The size increase isn't really unexpected. If you're only vacuuming once
per day, it's very easy for activity to cause active tables to bloat quite
a bit.

> This is becoming a serious problem, as I've seen instances of these tables
> grow to 6 GB+ (on a 15 GB total database), while still effectively
> containing on the order of 10.000 records or so. This is quite abnormal.
>
> For blocking reasons, we'd like to avoid vacuum fulls on these tables (as
> it seems like touching them will lock up everything else).

It will. But to get them back down to a reasonable size, you're going to
have to do a VACUUM FULL at least _once_. If you retune things correctly,
you shouldn't need any more FULLs after that 1 time.

> So, question #1: WTF? How could this happen, on a regularly vacuumed
> system? Shouldn't the space be reused, at least after a VACUUM? The issue
> here is not the absolute existence of the bloat space, it's that it's
> constantly growing for *system* tables.

With a lot of activity, once a day probably isn't regular enough.

> Question #2: What can be done about it?

I highly recommend turning autovacuum back on, then tweaking the autovacuum
parameters to prevent any preformance issues.

However, if you're dead set against autovacuum, find out (using the queries
that are available all over the internet) which tables are bloating the
worst, and schedule additional vacuums via cron that vacuum _only_ the
problem tables. How often is something that will require some guesswork
and/or experimenting, but I would recommend at least once per hour. Since
you're only vacuuming selected tables, the performance impact should be
minimal.

You'll have to do a VACUUM FULL on the bloated tables _once_ to get the size
back down, but if you pick a good schedule or use autovacuum with appropriate
settings, they shouldn't need a VACUUM FULL again after that.

--
Bill Moran


pgsql-general by date:

Previous
From: John R Pierce
Date:
Subject: Re: Catalog bloat (again)
Next
From: Dane Foster
Date:
Subject: Re: A contradiction in 13.2.1