Thread: Performance of autovacuum and full vacuum of database

Performance of autovacuum and full vacuum of database

From
"Carlos Oliva"
Date:

Hi Forum,

Should autovacuum reclaim most of the free space of a database?  We are trying to configure our database and running pg_autovacuum to streamline our database.  We have increased the max_fsm_pages to a value larger than the total pages needed (see the output from a full vacuum bellow “LAST FEW LINES OF FULL VACUUM”) and turned on pg_autovacuum.

 

Nevertheless, it seems that a full vacuum that we run at night finds a lot of free space (see “EXCERPT FROM THE FULL VACUUM TO SHOW THE VACUUM OF ONE TABLE”).  I would have expected that with the configuration of our database and with autovacuum working during the day, the amount of space that a full vacuum would find would be minimal.

 

We are running pg_autovacuum with its defaults parameters.  I can see that autovacuum is working because the CPU utilization for the autovacuum PID goes up every five minutes or so and then it goes down to almost nothing.

 

LAST FEW LINES OF FULL VACUUM

INFO:  free space map: 483 relations, 219546 pages stored; 153104 total pages needed

DETAIL:  Allocated FSM size: 1000 relations + 170000 pages = 1057 kB shared memory.

 

 

EXCERPT FROM THE FULL VACUUM TO SHOW THE VACUUM OF ONE TABLE

INFO:  "en0029": found 66035 removable, 1310162 nonremovable row versions in 417

87 pages

DETAIL:  0 dead row versions cannot be removed yet.

Nonremovable row versions range from 233 to 1165 bytes long.

There were 1746 unused item pointers.

Total free space (including removable row versions) is 20825932 bytes.

1453 pages are or will become empty, including 0 at the end of the table.

2345 pages containing 16260040 free bytes are potential move destinations.

CPU 2.20s/0.22u sec elapsed 62.59 sec.

Re: Performance of autovacuum and full vacuum of database

From
"Matthew T. O'Connor"
Date:
Couple of thing here:
1) Just because autovacuum is running, doesn't mean that it has actually
tried to vacuum a table. 5 minutes is the time that it sleeps in between
investigating activity to see if a vacuum is needed. If you want to see
if pg_autovacuum has actually tried to do anything you should up the the
logging with a -d2 switch on the pg_autovacuum command line.

2) pg_autovacuum only performs normal VACUUM commands, not VACUUM FULL.
As such, there will usually be some freespace left in the table. This is
not a bad thing as normal table activity will usually result in that
free space being reused, and it can be faster when it can use existing
free space rather than having to add space to the end off a table.

Matt


Carlos Oliva wrote:
>
> Hi Forum,
>
> Should autovacuum reclaim most of the free space of a database? We are
> trying to configure our database and running pg_autovacuum to
> streamline our database. We have increased the max_fsm_pages to a
> value larger than the total pages needed (see the output from a full
> vacuum bellow “*_LAST FEW LINES OF FULL VACUUM_*”) and turned on
> pg_autovacuum.
>
> Nevertheless, it seems that a full vacuum that we run at night finds a
> lot of free space (see “*_EXCERPT FROM THE FULL VACUUM TO SHOW THE
> VACUUM OF ONE TABLE_*”). I would have expected that with the
> configuration of our database and with autovacuum working during the
> day, the amount of space that a full vacuum would find would be
> minimal.*__*
>
> We are running pg_autovacuum with its defaults parameters. I can see
> that autovacuum is working because the CPU utilization for the
> autovacuum PID goes up every five minutes or so and then it goes down
> to almost nothing.
>
> *_LAST FEW LINES OF FULL VACUUM_*
>
> INFO: free space map: 483 relations, 219546 pages stored; 153104 total
> pages needed
>
> DETAIL: Allocated FSM size: 1000 relations + 170000 pages = 1057 kB
> shared memory.
>
> *_EXCERPT FROM THE FULL VACUUM TO SHOW THE VACUUM OF ONE TABLE_*
>
> INFO: "en0029": found 66035 removable, 1310162 nonremovable row
> versions in 417
>
> 87 pages
>
> DETAIL: 0 dead row versions cannot be removed yet.
>
> Nonremovable row versions range from 233 to 1165 bytes long.
>
> There were 1746 unused item pointers.
>
> Total free space (including removable row versions) is 20825932 bytes.
>
> 1453 pages are or will become empty, including 0 at the end of the table.
>
> 2345 pages containing 16260040 free bytes are potential move destinations.
>
> CPU 2.20s/0.22u sec elapsed 62.59 sec.
>


Re: Performance of autovacuum and full vacuum of database

From
"Carlos Oliva"
Date:
Thank you for your response Matthew.  Currently I run pg_autovacuum with the
following scripts.
su -l postgres -c "pg_autovacuum -D -U postgres > /dev/null 2>&1"&

Do you suggest that I could change it to something like the following:
su -l postgres -c "pg_autovacuum -d2 -D -U postgres > /tmp/vacuum.log 2>&1"&

Thank you in advance for your response.

-----Original Message-----
From: pgsql-general-owner@postgresql.org
[mailto:pgsql-general-owner@postgresql.org] On Behalf Of Matthew T. O'Connor
Sent: Thursday, November 10, 2005 2:46 PM
To: Carlos Oliva
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Performance of autovacuum and full vacuum of database

Couple of thing here:
1) Just because autovacuum is running, doesn't mean that it has actually
tried to vacuum a table. 5 minutes is the time that it sleeps in between
investigating activity to see if a vacuum is needed. If you want to see
if pg_autovacuum has actually tried to do anything you should up the the
logging with a -d2 switch on the pg_autovacuum command line.

2) pg_autovacuum only performs normal VACUUM commands, not VACUUM FULL.
As such, there will usually be some freespace left in the table. This is
not a bad thing as normal table activity will usually result in that
free space being reused, and it can be faster when it can use existing
free space rather than having to add space to the end off a table.

Matt


Carlos Oliva wrote:
>
> Hi Forum,
>
> Should autovacuum reclaim most of the free space of a database? We are
> trying to configure our database and running pg_autovacuum to
> streamline our database. We have increased the max_fsm_pages to a
> value larger than the total pages needed (see the output from a full
> vacuum bellow "*_LAST FEW LINES OF FULL VACUUM_*") and turned on
> pg_autovacuum.
>
> Nevertheless, it seems that a full vacuum that we run at night finds a
> lot of free space (see "*_EXCERPT FROM THE FULL VACUUM TO SHOW THE
> VACUUM OF ONE TABLE_*"). I would have expected that with the
> configuration of our database and with autovacuum working during the
> day, the amount of space that a full vacuum would find would be
> minimal.*__*
>
> We are running pg_autovacuum with its defaults parameters. I can see
> that autovacuum is working because the CPU utilization for the
> autovacuum PID goes up every five minutes or so and then it goes down
> to almost nothing.
>
> *_LAST FEW LINES OF FULL VACUUM_*
>
> INFO: free space map: 483 relations, 219546 pages stored; 153104 total
> pages needed
>
> DETAIL: Allocated FSM size: 1000 relations + 170000 pages = 1057 kB
> shared memory.
>
> *_EXCERPT FROM THE FULL VACUUM TO SHOW THE VACUUM OF ONE TABLE_*
>
> INFO: "en0029": found 66035 removable, 1310162 nonremovable row
> versions in 417
>
> 87 pages
>
> DETAIL: 0 dead row versions cannot be removed yet.
>
> Nonremovable row versions range from 233 to 1165 bytes long.
>
> There were 1746 unused item pointers.
>
> Total free space (including removable row versions) is 20825932 bytes.
>
> 1453 pages are or will become empty, including 0 at the end of the table.
>
> 2345 pages containing 16260040 free bytes are potential move destinations.
>
> CPU 2.20s/0.22u sec elapsed 62.59 sec.
>


---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings



Re: Performance of autovacuum and full vacuum of database

From
"Matthew T. O'Connor"
Date:
Carlos Oliva wrote:
> Thank you for your response Matthew.  Currently I run pg_autovacuum with the
> following scripts.
> su -l postgres -c "pg_autovacuum -D -U postgres > /dev/null 2>&1"&
>
> Do you suggest that I could change it to something like the following:
> su -l postgres -c "pg_autovacuum -d2 -D -U postgres > /tmp/vacuum.log 2>&1"&

Yes exactly, and if you find that pg_autovacuum is never or not often
enough firing off vacuum comands, then you will need to play with the
threshold settings.  The default thresholds for pg_autovacuum are too
conservative for most people, so you may very well have to do this.


Re: Performance of autovacuum and full vacuum of database

From
Vivek Khera
Date:
On Nov 10, 2005, at 3:43 PM, Matthew T. O'Connor wrote:

> Yes exactly, and if you find that pg_autovacuum is never or not
> often enough firing off vacuum comands, then you will need to play
> with the threshold settings.  The default thresholds for
> pg_autovacuum are too conservative for most people, so you may very
> well have to do this.

Another issue with autovacuum (haven't investigated the 8.1 version
yet) is that you can't make different threshhold settings for
different tables.  For example, I have some tables that are a handful
of rows but are updated bazillions of times per day, and other tables
with 10's of millions of rows updated perhaps 100k times per day.
autovacuum would go several days without vacuuming some of the
tables, which was just bad...

I don't think it is suitable for all situations.


Re: Performance of autovacuum and full vacuum of database

From
"Matthew T. O'Connor"
Date:
Vivek Khera wrote:
> Another issue with autovacuum (haven't investigated the 8.1 version
> yet) is that you can't make different threshhold settings for
> different tables.  For example, I have some tables that are a handful
> of rows but are updated bazillions of times per day, and other tables
> with 10's of millions of rows updated perhaps 100k times per day.
> autovacuum would go several days without vacuuming some of the tables,
> which was just bad...   I don't think it is suitable for all situations.

With the release of 8.1 and it's integrated version of autovacuum, you
can now set per table settings for for vauum and analyze thresholds,
vacuum cost delay, and table enable / disable.  This addresses what was
probably the largest deficiency with the old contrib version.


Re: Performance of autovacuum and full vacuum of database

From
Csaba Nagy
Date:
This is also true in my situation, where I have some medium sized
tables, which have a always just a handful of rows heavily updated. The
amount of updates is not too big related to the size of the table, but
the repeated update of the same row will cause problems before
autovacuum will kick in, as I understood. In this case I would need a
far smaller threshold for this table as for others.

I wonder how hard would it be to implement a rule system for autovacuum,
maybe kept in it's dedicated "pg_" table ? I would even volunteer to
write this if somebody would point me in the right direction, even if
I'm not a C guy and not too familiar with postgres internals.

Another possibility would be to integrate the vacuum rules into the
creation of the table. The info would then go to the meta-data of the
table. It would also allow to dynamically change it from a SQL prompt
versus fiddling with config files.

Does this sound reasonable ? As I said, it's important enough for me to
allocate some time to implement it, but I need assistance, as I'm a Java
programmer and have no experience with C or postgres code (other than
occasionally hacking some already existing code to make it work as I
want, postgres included - for a long time we've run a patched version
where the foreign key checks did not lock the parent row exclusively,
now I can throw away that patch thanks to 8.1).

Cheers,
Csaba.

On Fri, 2005-11-11 at 17:04, Vivek Khera wrote:
> On Nov 10, 2005, at 3:43 PM, Matthew T. O'Connor wrote:
>
> > Yes exactly, and if you find that pg_autovacuum is never or not
> > often enough firing off vacuum comands, then you will need to play
> > with the threshold settings.  The default thresholds for
> > pg_autovacuum are too conservative for most people, so you may very
> > well have to do this.
>
> Another issue with autovacuum (haven't investigated the 8.1 version
> yet) is that you can't make different threshhold settings for
> different tables.  For example, I have some tables that are a handful
> of rows but are updated bazillions of times per day, and other tables
> with 10's of millions of rows updated perhaps 100k times per day.
> autovacuum would go several days without vacuuming some of the
> tables, which was just bad...
>
> I don't think it is suitable for all situations.
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: explain analyze is your friend


Re: Performance of autovacuum and full vacuum of database

From
Csaba Nagy
Date:
[snip]
> With the release of 8.1 and it's integrated version of autovacuum, you
> can now set per table settings for for vauum and analyze thresholds,
> vacuum cost delay, and table enable / disable.  This addresses what was
> probably the largest deficiency with the old contrib version.

Cool ! I didn't know that... I just wanted to volunteer to implement
that :-D

But I can't see where can you set the per table settings ? In the config
file I didn't find any reference to this...

Cheers,
Csaba.




Re: Performance of autovacuum and full vacuum of database

From
Csaba Nagy
Date:
OK, I've found it:
http://www.postgresql.org/docs/8.1/interactive/catalog-pg-autovacuum.html

One more incentive to upgrade as quickly as possible.

Cheers,
Csaba.

On Fri, 2005-11-11 at 17:40, Csaba Nagy wrote:
> [snip]
> > With the release of 8.1 and it's integrated version of autovacuum, you
> > can now set per table settings for for vauum and analyze thresholds,
> > vacuum cost delay, and table enable / disable.  This addresses what was
> > probably the largest deficiency with the old contrib version.
>
> Cool ! I didn't know that... I just wanted to volunteer to implement
> that :-D
>
> But I can't see where can you set the per table settings ? In the config
> file I didn't find any reference to this...
>
> Cheers,
> Csaba.
>
>
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 3: Have you checked our extensive FAQ?
>
>                http://www.postgresql.org/docs/faq


Re: Performance of autovacuum and full vacuum of database

From
"Carlos Oliva"
Date:
Hi Matthew,
Would rebooting the server interfere with the work of pg_autovacuum?  I
imagine that pg_autovacuum would loose the information that it gathered
prior to the reboot.

Would not pg_autovacuum need to gather data about the database again before
it can judge if it should analyze/vacuum a table of the database?

-----Original Message-----
From: pgsql-general-owner@postgresql.org
[mailto:pgsql-general-owner@postgresql.org] On Behalf Of Matthew T. O'Connor
Sent: Thursday, November 10, 2005 3:44 PM
To: Carlos Oliva
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Performance of autovacuum and full vacuum of database

Carlos Oliva wrote:
> Thank you for your response Matthew.  Currently I run pg_autovacuum with
the
> following scripts.
> su -l postgres -c "pg_autovacuum -D -U postgres > /dev/null 2>&1"&
>
> Do you suggest that I could change it to something like the following:
> su -l postgres -c "pg_autovacuum -d2 -D -U postgres > /tmp/vacuum.log
2>&1"&

Yes exactly, and if you find that pg_autovacuum is never or not often
enough firing off vacuum comands, then you will need to play with the
threshold settings.  The default thresholds for pg_autovacuum are too
conservative for most people, so you may very well have to do this.


---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
       choose an index scan if your joining column's datatypes do not
       match



Re: Performance of autovacuum and full vacuum of database

From
Tom Lane
Date:
"Carlos Oliva" <carlos@pbsinet.com> writes:
> Would rebooting the server interfere with the work of pg_autovacuum?  I
> imagine that pg_autovacuum would loose the information that it gathered
> prior to the reboot.

The only long-term state used by autovacuum is the contents of the
statistics views, which will be saved across reboot if you have the
relevant postgresql.conf parameter set properly.  (As of 8.1 it is set
to save by default, but prior releases had the opposite default :-()

At least this is true for the 8.1 integrated autovacuum, I'm less
certain about the contrib version in prior releases...

            regards, tom lane