Re: Running vacuumdb -a taking too long - Mailing list pgsql-general

From Bill Moran
Subject Re: Running vacuumdb -a taking too long
Date
Msg-id 20090727154115.3744df77.wmoran@potentialtech.com
Whole thread Raw
In response to Running vacuumdb -a taking too long  (Keaton Adams <kadams@mxlogic.com>)
Responses Re: Running vacuumdb -a taking too long
List pgsql-general
In response to Keaton Adams <kadams@mxlogic.com>:

> We are to a point in size and utilization of a set of our Postgres 8.1.17 databases that a vacuumdb -a -z -U postgres
isstill running after 50 hours and we have to kill it off because the additional i/o is causing queries to stack up.
Wehave archived off as much data as possible (100+ GB) and we are looking at moving to a better hardware configuration
tosupport the increased system activity, but for now I need to know if there is a way to avoid "transaction ID
wraparound"if we don't issue a vacuumdb -a to update the datfrozenxid value. 
>
> This is a logging type database where data is initially loaded and then "rolled up" into weekly/monthly tables.  Many
ofthe tables in the DB don't last for more than seven days, and after a week the majority of the historical tables are
staticand stay around until they are finally dropped off (based on a retention period), so re-running a vacuum on these
olderstatic tables doesn't really gain much since the data never changes. 
>
> I have read (and re-read) the documentation and am  a bit confused on exactly what needs to happen here:
>
> Since periodic VACUUM runs are needed anyway for the reasons described earlier, it's unlikely that any table would
notbe vacuumed for as long as a billion transactions. But to help administrators ensure this constraint is met, VACUUM
storestransaction ID statistics in the system table pg_database. In particular, the datfrozenxid column of a database's
pg_databaserow is updated at the completion of any database-wide VACUUM operation (i.e., VACUUM that does not name a
specifictable). The value stored in this field is the freeze cutoff XID that was used by that VACUUM command. All
normalXIDs older than this cutoff XID are guaranteed to have been replaced by FrozenXID within that database. A
convenientway to examine this information is to execute the query 
>
> SELECT datname, age(datfrozenxid) FROM pg_database;
>
> The age column measures the number of transactions from the cutoff XID to the current transaction's XID.
>
> So if, after a table is no longer added to (becomes static), I run a VACUUM against it, the table wouldn't need to be
vacuumedagain since the tuples and their related transaction ID's never change?  Is there a way to set up a vacuum
schemeon a table-by-table basis to accomplish the same goal as an all-in-one vacuumdb -a run that I can spread out over
timeinstead of relying on the completion of a single vacuumdb -a command? 

I'm not quite sure what the best answer is to your problem, but since
nobody else has suggested anything, here are my thoughts.

First off, can you allocate more maintenance_work_mem?  Perhaps that can
speed up vacuum enough.

Secondly, if you VACUUM FREEZE those static tables, it will guarantee
that you'll never lose data from them, even if you hit XID wraparound.
If you VACUUM FREEZE tables one at a time, perhaps you can avoid the
huge performance hit.

Third, while your argument about tables not needing vacuumed again seems
logical, it's simply not how PG functions.  Since the XID is database-wide,
it may affect any table.  Of course, the FREEZE process will protect tables
from this.  This seems to be improved in newer versions of Postgres, so
an upgrade should improve the issue.

Finally, are you really in danger of hitting the wraparound?  If you run
the query "SELECT datname, age(datfrozenxid) FROM pg_database;" (as suggested
in the docs) once a day for a few days, does it seems like you're using
up XIDs fast enough to be a danger?  If you've got new hardware coming
soon anyway, perhaps you have enough time to now worry about it on the
current hardware?

Hope this helps.

--
Bill Moran
http://www.potentialtech.com
http://people.collaborativefusion.com/~wmoran/

pgsql-general by date:

Previous
From: Jeff Davis
Date:
Subject: Re: Relational Algebra and Aggregate Functions
Next
From: Phoenix Kiula
Date:
Subject: For production: 8.4 or 8.3?