Thread: Vacuum Full is *hideously* slow!

Vacuum Full is *hideously* slow!

From
"Phoenix Kiula"
Date:
Hi.

Per this thread:
http://archives.postgresql.org/pgsql-general/2008-11/msg00608.php -- I
think I understood that the time had come for my db to have a VACUUM
FULL. (Regular autovacuum etc is working)

I know a full vacuum is slow. A simple google search had suggested so.

But I had no idea it would take HOURS! I started the process against a
5GB database (about 8.5 million rows in all) and it went on for more
than an hour. I had to kill the process.

I am now reindexing just to be sure.

Is this normal? If a vacuum full takes hours or even days then what's
the point?

I read here - http://archives.postgresql.org/pgsql-performance/2005-07/msg00375.php
- that it's better to drop the indices, then vacuum, and then recreate
the indices. Is this true? This is also a bad decision for production
servers, but would this be better?

Thanks!

Re: Vacuum Full is *hideously* slow!

From
"Scott Marlowe"
Date:
On Sat, Nov 15, 2008 at 2:36 PM, Phoenix Kiula <phoenix.kiula@gmail.com> wrote:
> Hi.
>
> Per this thread:
> http://archives.postgresql.org/pgsql-general/2008-11/msg00608.php -- I
> think I understood that the time had come for my db to have a VACUUM
> FULL. (Regular autovacuum etc is working)
>
> I know a full vacuum is slow. A simple google search had suggested so.
>
> But I had no idea it would take HOURS! I started the process against a
> 5GB database (about 8.5 million rows in all) and it went on for more
> than an hour. I had to kill the process.

Your machine must have a pretty slow I/O subsystem.  Recently our main
production database, which is in the 25Gig range, had a few rogue
queries (updates without where clauses, gotta love'em!) that bloated a
few tables from 300,000 rows to 300,000 live and about 10,000,000 dead
rows.  We took our system down late at night for maintenance and
vacuum fulled then reindexed the whole thing, and it took
approximately 20 minutes to run.

Note that you really should take your app offline if you're vacuuming
full any tables, since it locks them anyway, and there's no reason for
users to sit and stare at an hour glass waiting for a response that
isn't coming for quite some time, as well as producing more load on a
db that's already working pretty hard.

> I am now reindexing just to be sure.

Always a good idea after a vacuum full, as a vacuum full can bloat indexes.

> Is this normal? If a vacuum full takes hours or even days then what's
> the point?

Define Normal (sorry, that's a favorite book of mine, I really couldn't resist).

Normal depends a lot on your server and the size of the tables being
vacuumed.  Since my main db servers have 12 spindles under the DB and
8 cores to run on, I can run vacuum fulls on several tables at once
and get the time down to about 5 to 8 minutes for the whole db.  If
I'm in a hurry.  I'm usually not.

If you're running on a single 7200RPM SATA drive don't expect it to
complete a vacuum on a good sized dataset as fast as a lot of drives
under a fast RAID controller.  Vacuum is VERY I/O dependent.

> I read here - http://archives.postgresql.org/pgsql-performance/2005-07/msg00375.php
> - that it's better to drop the indices, then vacuum, and then recreate
> the indices. Is this true? This is also a bad decision for production
> servers, but would this be better?

Yep, it is.  Since the indexes are likely to get bloated anyway, you
might as well just drop them and recreate them. OR cluster your tables
on your favorite index, which will all free all the dead space plus it
makes the table faster for things ordered by that index.

The real question is what led to you needing vacuum full instead of
regular vacuum, and what are you doing to prevent it in the future.

Re: Vacuum Full is *hideously* slow!

From
"Phoenix Kiula"
Date:
Thanks Scott. Could be my I/O as I am on 15k Raptor SATA drives with
RAID 1 only.

Anyway, reindexing happened fast (12 minutes) and things are now humming along.

How long should cluster take on a db that's about 5.5GB in size? Is it
worth doing on a production db? The db is running fast now anyway, so
the only reason I am even thinking about this is because of the
results of a VACUUM ANALYZE, which give me this:

---
INFO:  "links": found 5427 removable, 8876599 nonremovable row
versions in 483102 pages
DETAIL:  96 dead row versions cannot be removed yet.
There were 2135991 unused item pointers.
235 pages contain useful free space.
---


My semi-intelligent question: in the output above, that figure of
"483,102" pages. Should this in any way affect my setting for
max_fsm_pages variable in pg.conf?

Re: Vacuum Full is *hideously* slow!

From
Martijn van Oosterhout
Date:
On Sun, Nov 16, 2008 at 06:05:40AM +0800, Phoenix Kiula wrote:
> How long should cluster take on a db that's about 5.5GB in size? Is it
> worth doing on a production db? The db is running fast now anyway, so
> the only reason I am even thinking about this is because of the
> results of a VACUUM ANALYZE, which give me this:

How long it takes to cluster depends on how many rows there are in the
table and how ordered they are. How long a VACUUM FULL takes depends on
how big the tables is and how many gaps there are to fill. If you have
a big table (in actual space) but it only has a few rows (it's very
bloated) a cluter will will all the time.

> ---
> INFO:  "links": found 5427 removable, 8876599 nonremovable row
> versions in 483102 pages
> DETAIL:  96 dead row versions cannot be removed yet.
> There were 2135991 unused item pointers.
> 235 pages contain useful free space.
> ---

This table looks fine. Only 235 pages have useful free space. The
483102 pages just means the table is 483102*8K = 3.9GB in size. I
imagine you max_fsm_pages is somewhat larger than 235 pages.

If yo uwant proper estimates for max_fsm_pages you should vacuum the
entire DB, then it'll tell you.

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Please line up in a tree and maintain the heap invariant while
> boarding. Thank you for flying nlogn airlines.

Attachment

disable/enable trigger and transaction

From
Peter Billen
Date:
Hi all,

Is the instruction 'ALTER TABLE x ENABLE/DISABLE TRIGGER y' part of the
current transaction? That is, say that there are two ongoing
transactions, T1 and T2. Is the trigger y still enabled in transaction
T2 if it's disabled in transaction T1?

Thanks in advance. Kind regards,

Peter

Re: disable/enable trigger and transaction

From
"Grzegorz Jaśkiewicz"
Date:
On Sun, Nov 16, 2008 at 12:39 PM, Peter Billen <peter@clueless.be> wrote:
Hi all,

Is the instruction 'ALTER TABLE x ENABLE/DISABLE TRIGGER y' part of the current transaction? That is, say that there are two ongoing transactions, T1 and T2. Is the trigger y still enabled in transaction T2 if it's disabled in transaction T1?
yes, but this will lock table operations in other transactions that would otherwise use that trigger. So be careful with it.