Thread: VACUUM

VACUUM

From
Jodi Kanter
Date:
I am trying to set up my cron job to run the appropriate vacuums on our database. I understand from reading the documentation that regular vacuums, vacuum full, and vacuum analyze are all important. Can someone assist in telling me what is the most efficient order? I would assume that analyze should be done once the inaccessible rows are released within the table and then to the OS?
 
We currently have a small amount of data in our database so my plan was to do all vacuums on the entire database.
 
Also, I read that reindexing was important. Does this need to be done as frequently as the vacuums? Is it only done on a per table basis? or can you reindex the entire database?
 
Thanks
Jodi

_______________________________
Jodi L Kanter
BioInformatics Database Administrator
University of Virginia
(434) 924-2846
jkanter@virginia.edu


 

 

 

Re: VACUUM

From
Robert Treat
Date:
Honestly this is very dependent on the type of schema you have and what
your application does with it. If you have tables that get a lot of
updates (or both inserts and deletes) you will want to run vacuum
analyze on those tables more often. How often is debatable, but as a
rule of thumb it good to run them vacuum analyze as least once per
turnover of your table (though running it more wont hurt). As for vacuum
full, with frequently enough "lazy" vacuuming you wont need to vacuum
full very often. If you can afford to lock your tables, then I'd say do
it once a night. If not, you need to decide when is a good time, but I
think the docs recommend doing it at least once every 1 billion
transactions, though more often is definitely better here too.  As far
as sequences go, search the archives for pgsql-general, tom lane just
posted a great explanation of why sequences need to be recreated within
the last day or two. Read it and if it applies work that into your crons
accordingly. One final note is to research the free space map setting in
the postgresql.conf.  This helps tracks dead tuples in the database, so
you'll want to make this large enough to store your tuple counts between
vacuums. Hope this helps,

Robert Treat

On Thu, 2002-11-14 at 15:39, Jodi Kanter wrote:
> I am trying to set up my cron job to run the appropriate vacuums on our
> database. I understand from reading the documentation that regular
> vacuums, vacuum full, and vacuum analyze are all important. Can someone
> assist in telling me what is the most efficient order? I would assume
> that analyze should be done once the inaccessible rows are released
> within the table and then to the OS?
>
> We currently have a small amount of data in our database so my plan was
> to do all vacuums on the entire database.
>
> Also, I read that reindexing was important. Does this need to be done as
> frequently as the vacuums? Is it only done on a per table basis? or can
> you reindex the entire database?
>
> Thanks
> Jodi
>
> _______________________________
> Jodi L Kanter
> BioInformatics Database Administrator
> University of Virginia
> (434) 924-2846
> jkanter@virginia.edu <mailto:jkanter@virginia.edu>
>
>
>
>
>
>
>
>
>




Re: VACUUM

From
Tom Lane
Date:
Robert Treat <xzilla@users.sourceforge.net> writes:
> ... As for vacuum
> full, with frequently enough "lazy" vacuuming you wont need to vacuum
> full very often. If you can afford to lock your tables, then I'd say do
> it once a night. If not, you need to decide when is a good time, but I
> think the docs recommend doing it at least once every 1 billion
> transactions, though more often is definitely better here too.

The intention is that if you do plain ("lazy") vacuum often enough to
keep your tables from bloating unreasonably, you shouldn't ever have to
do a "full" vacuum.  How much is "unreasonable" is up to you to decide.

For instance, if under peak load you might update 10% of the rows in
a particular table every hour, then hourly lazy vacuums would keep you
at a steady state of about 10% wasted space in that table.  That seems
pretty workable to me.  If you waited 100 hours then the table would
have bloated to about 10X its minimum size, and at that point you'd
most likely want to do a vacuum full to reclaim the wasted space.

(Right now the only way to set up an appropriate vacuuming schedule is
for the DBA to hand-craft a cron script.  I'd like to see the database
able to launch automatic background vacuum runs by itself.)

The business about "you must vacuum at least every billion transactions"
is a different issue.  Either plain or full vacuum will satisfy that
requirement.

There is more discussion in the Admin Guide ---
http://www.ca.postgresql.org/users-lounge/docs/7.2/postgres/maintenance.html

            regards, tom lane

Re: VACUUM

From
Robert Treat
Date:
On Thu, 2002-11-14 at 17:18, Tom Lane wrote:
> Robert Treat <xzilla@users.sourceforge.net> writes:
>
> (Right now the only way to set up an appropriate vacuuming schedule is
> for the DBA to hand-craft a cron script.  I'd like to see the database
> able to launch automatic background vacuum runs by itself.)
>

There is also an "automatic vacuum daemon" on gborg. I've looked at the
code and it looks nifty, though I haven't used it myself.

Robert Treat


Re: VACUUM

From
Jodi Kanter
Date:
Do you have a URL for that document you mention? I get nothing when I search
on pgsql-general.
Thanks for the help.
Jodi

----- Original Message -----
From: "Robert Treat" <xzilla@users.sourceforge.net>
To: "Jodi Kanter" <jkanter@virginia.edu>
Cc: "Postgres Admin List" <pgsql-admin@postgresql.org>
Sent: Thursday, November 14, 2002 4:44 PM
Subject: Re: [ADMIN] VACUUM


> Honestly this is very dependent on the type of schema you have and what
> your application does with it. If you have tables that get a lot of
> updates (or both inserts and deletes) you will want to run vacuum
> analyze on those tables more often. How often is debatable, but as a
> rule of thumb it good to run them vacuum analyze as least once per
> turnover of your table (though running it more wont hurt). As for vacuum
> full, with frequently enough "lazy" vacuuming you wont need to vacuum
> full very often. If you can afford to lock your tables, then I'd say do
> it once a night. If not, you need to decide when is a good time, but I
> think the docs recommend doing it at least once every 1 billion
> transactions, though more often is definitely better here too.  As far
> as sequences go, search the archives for pgsql-general, tom lane just
> posted a great explanation of why sequences need to be recreated within
> the last day or two. Read it and if it applies work that into your crons
> accordingly. One final note is to research the free space map setting in
> the postgresql.conf.  This helps tracks dead tuples in the database, so
> you'll want to make this large enough to store your tuple counts between
> vacuums. Hope this helps,
>
> Robert Treat
>
> On Thu, 2002-11-14 at 15:39, Jodi Kanter wrote:
> > I am trying to set up my cron job to run the appropriate vacuums on our
> > database. I understand from reading the documentation that regular
> > vacuums, vacuum full, and vacuum analyze are all important. Can someone
> > assist in telling me what is the most efficient order? I would assume
> > that analyze should be done once the inaccessible rows are released
> > within the table and then to the OS?
> >
> > We currently have a small amount of data in our database so my plan was
> > to do all vacuums on the entire database.
> >
> > Also, I read that reindexing was important. Does this need to be done as
> > frequently as the vacuums? Is it only done on a per table basis? or can
> > you reindex the entire database?
> >
> > Thanks
> > Jodi
> >
> > _______________________________
> > Jodi L Kanter
> > BioInformatics Database Administrator
> > University of Virginia
> > (434) 924-2846
> > jkanter@virginia.edu <mailto:jkanter@virginia.edu>
> >
> >
> >
> >
> >
> >
> >
> >
> >
>
>
>
>


Re: VACUUM

From
Robert Treat
Date:
http://groups.google.com/groups?q=tom+lane+index+group:comp.databases.postgresql.general&hl=en&lr=&ie=UTF-8&selm=23812.1037210966%40sss.pgh.pa.us&rnum=6

hows that for a URL?

Robert Treat

On Fri, 2002-11-15 at 08:27, Jodi Kanter wrote:
> Do you have a URL for that document you mention? I get nothing when I search
> on pgsql-general.
> Thanks for the help.
> Jodi
>
> ----- Original Message -----
> From: "Robert Treat" <xzilla@users.sourceforge.net>
> To: "Jodi Kanter" <jkanter@virginia.edu>
> Cc: "Postgres Admin List" <pgsql-admin@postgresql.org>
> Sent: Thursday, November 14, 2002 4:44 PM
> Subject: Re: [ADMIN] VACUUM
>
>
> > Honestly this is very dependent on the type of schema you have and what
> > your application does with it. If you have tables that get a lot of
> > updates (or both inserts and deletes) you will want to run vacuum
> > analyze on those tables more often. How often is debatable, but as a
> > rule of thumb it good to run them vacuum analyze as least once per
> > turnover of your table (though running it more wont hurt). As for vacuum
> > full, with frequently enough "lazy" vacuuming you wont need to vacuum
> > full very often. If you can afford to lock your tables, then I'd say do
> > it once a night. If not, you need to decide when is a good time, but I
> > think the docs recommend doing it at least once every 1 billion
> > transactions, though more often is definitely better here too.  As far
> > as sequences go, search the archives for pgsql-general, tom lane just
> > posted a great explanation of why sequences need to be recreated within
> > the last day or two. Read it and if it applies work that into your crons
> > accordingly. One final note is to research the free space map setting in
> > the postgresql.conf.  This helps tracks dead tuples in the database, so
> > you'll want to make this large enough to store your tuple counts between
> > vacuums. Hope this helps,
> >
> > Robert Treat
> >
> > On Thu, 2002-11-14 at 15:39, Jodi Kanter wrote:
> > > I am trying to set up my cron job to run the appropriate vacuums on our
> > > database. I understand from reading the documentation that regular
> > > vacuums, vacuum full, and vacuum analyze are all important. Can someone
> > > assist in telling me what is the most efficient order? I would assume
> > > that analyze should be done once the inaccessible rows are released
> > > within the table and then to the OS?
> > >
> > > We currently have a small amount of data in our database so my plan was
> > > to do all vacuums on the entire database.
> > >
> > > Also, I read that reindexing was important. Does this need to be done as
> > > frequently as the vacuums? Is it only done on a per table basis? or can
> > > you reindex the entire database?
> > >
> > > Thanks
> > > Jodi
> > >
> > > _______________________________
> > > Jodi L Kanter
> > > BioInformatics Database Administrator
> > > University of Virginia
> > > (434) 924-2846
> > > jkanter@virginia.edu <mailto:jkanter@virginia.edu>
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> >
> >
> >
> >
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 2: you can get off all lists at once with the unregister command
>     (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)



Re: VACUUM

From
Andrew Sullivan
Date:
On Thu, Nov 14, 2002 at 05:18:06PM -0500, Tom Lane wrote:
>
> The business about "you must vacuum at least every billion transactions"
> is a different issue.  Either plain or full vacuum will satisfy that
> requirement.

Note, also, that it is _every_ table needs to be vacuumed in that
billion transaction period, but it is nevertheless probable that some
of your tables will not need to be vacuumed as often as others.  For
efficiency's sake, therefore, you may want to vacuum different tables
at different times.

A

--
----
Andrew Sullivan                         204-4141 Yonge Street
Liberty RMS                           Toronto, Ontario Canada
<andrew@libertyrms.info>                              M2P 2A8
                                         +1 416 646 3304 x110