Thread: VACUUM
I am trying to set up my cron job to run the appropriate vacuums on our database. I understand from reading the documentation that regular vacuums, vacuum full, and vacuum analyze are all important. Can someone assist in telling me what is the most efficient order? I would assume that analyze should be done once the inaccessible rows are released within the table and then to the OS?
We currently have a small amount of data in our database so my plan was to do all vacuums on the entire database.
Also, I read that reindexing was important. Does this need to be done as frequently as the vacuums? Is it only done on a per table basis? or can you reindex the entire database?
Thanks
Jodi
_______________________________
Jodi L Kanter
BioInformatics Database Administrator
University of Virginia
(434) 924-2846
jkanter@virginia.edu
Honestly this is very dependent on the type of schema you have and what your application does with it. If you have tables that get a lot of updates (or both inserts and deletes) you will want to run vacuum analyze on those tables more often. How often is debatable, but as a rule of thumb it good to run them vacuum analyze as least once per turnover of your table (though running it more wont hurt). As for vacuum full, with frequently enough "lazy" vacuuming you wont need to vacuum full very often. If you can afford to lock your tables, then I'd say do it once a night. If not, you need to decide when is a good time, but I think the docs recommend doing it at least once every 1 billion transactions, though more often is definitely better here too. As far as sequences go, search the archives for pgsql-general, tom lane just posted a great explanation of why sequences need to be recreated within the last day or two. Read it and if it applies work that into your crons accordingly. One final note is to research the free space map setting in the postgresql.conf. This helps tracks dead tuples in the database, so you'll want to make this large enough to store your tuple counts between vacuums. Hope this helps, Robert Treat On Thu, 2002-11-14 at 15:39, Jodi Kanter wrote: > I am trying to set up my cron job to run the appropriate vacuums on our > database. I understand from reading the documentation that regular > vacuums, vacuum full, and vacuum analyze are all important. Can someone > assist in telling me what is the most efficient order? I would assume > that analyze should be done once the inaccessible rows are released > within the table and then to the OS? > > We currently have a small amount of data in our database so my plan was > to do all vacuums on the entire database. > > Also, I read that reindexing was important. Does this need to be done as > frequently as the vacuums? Is it only done on a per table basis? or can > you reindex the entire database? > > Thanks > Jodi > > _______________________________ > Jodi L Kanter > BioInformatics Database Administrator > University of Virginia > (434) 924-2846 > jkanter@virginia.edu <mailto:jkanter@virginia.edu> > > > > > > > > >
Robert Treat <xzilla@users.sourceforge.net> writes: > ... As for vacuum > full, with frequently enough "lazy" vacuuming you wont need to vacuum > full very often. If you can afford to lock your tables, then I'd say do > it once a night. If not, you need to decide when is a good time, but I > think the docs recommend doing it at least once every 1 billion > transactions, though more often is definitely better here too. The intention is that if you do plain ("lazy") vacuum often enough to keep your tables from bloating unreasonably, you shouldn't ever have to do a "full" vacuum. How much is "unreasonable" is up to you to decide. For instance, if under peak load you might update 10% of the rows in a particular table every hour, then hourly lazy vacuums would keep you at a steady state of about 10% wasted space in that table. That seems pretty workable to me. If you waited 100 hours then the table would have bloated to about 10X its minimum size, and at that point you'd most likely want to do a vacuum full to reclaim the wasted space. (Right now the only way to set up an appropriate vacuuming schedule is for the DBA to hand-craft a cron script. I'd like to see the database able to launch automatic background vacuum runs by itself.) The business about "you must vacuum at least every billion transactions" is a different issue. Either plain or full vacuum will satisfy that requirement. There is more discussion in the Admin Guide --- http://www.ca.postgresql.org/users-lounge/docs/7.2/postgres/maintenance.html regards, tom lane
On Thu, 2002-11-14 at 17:18, Tom Lane wrote: > Robert Treat <xzilla@users.sourceforge.net> writes: > > (Right now the only way to set up an appropriate vacuuming schedule is > for the DBA to hand-craft a cron script. I'd like to see the database > able to launch automatic background vacuum runs by itself.) > There is also an "automatic vacuum daemon" on gborg. I've looked at the code and it looks nifty, though I haven't used it myself. Robert Treat
Do you have a URL for that document you mention? I get nothing when I search on pgsql-general. Thanks for the help. Jodi ----- Original Message ----- From: "Robert Treat" <xzilla@users.sourceforge.net> To: "Jodi Kanter" <jkanter@virginia.edu> Cc: "Postgres Admin List" <pgsql-admin@postgresql.org> Sent: Thursday, November 14, 2002 4:44 PM Subject: Re: [ADMIN] VACUUM > Honestly this is very dependent on the type of schema you have and what > your application does with it. If you have tables that get a lot of > updates (or both inserts and deletes) you will want to run vacuum > analyze on those tables more often. How often is debatable, but as a > rule of thumb it good to run them vacuum analyze as least once per > turnover of your table (though running it more wont hurt). As for vacuum > full, with frequently enough "lazy" vacuuming you wont need to vacuum > full very often. If you can afford to lock your tables, then I'd say do > it once a night. If not, you need to decide when is a good time, but I > think the docs recommend doing it at least once every 1 billion > transactions, though more often is definitely better here too. As far > as sequences go, search the archives for pgsql-general, tom lane just > posted a great explanation of why sequences need to be recreated within > the last day or two. Read it and if it applies work that into your crons > accordingly. One final note is to research the free space map setting in > the postgresql.conf. This helps tracks dead tuples in the database, so > you'll want to make this large enough to store your tuple counts between > vacuums. Hope this helps, > > Robert Treat > > On Thu, 2002-11-14 at 15:39, Jodi Kanter wrote: > > I am trying to set up my cron job to run the appropriate vacuums on our > > database. I understand from reading the documentation that regular > > vacuums, vacuum full, and vacuum analyze are all important. Can someone > > assist in telling me what is the most efficient order? I would assume > > that analyze should be done once the inaccessible rows are released > > within the table and then to the OS? > > > > We currently have a small amount of data in our database so my plan was > > to do all vacuums on the entire database. > > > > Also, I read that reindexing was important. Does this need to be done as > > frequently as the vacuums? Is it only done on a per table basis? or can > > you reindex the entire database? > > > > Thanks > > Jodi > > > > _______________________________ > > Jodi L Kanter > > BioInformatics Database Administrator > > University of Virginia > > (434) 924-2846 > > jkanter@virginia.edu <mailto:jkanter@virginia.edu> > > > > > > > > > > > > > > > > > > > > > >
http://groups.google.com/groups?q=tom+lane+index+group:comp.databases.postgresql.general&hl=en&lr=&ie=UTF-8&selm=23812.1037210966%40sss.pgh.pa.us&rnum=6 hows that for a URL? Robert Treat On Fri, 2002-11-15 at 08:27, Jodi Kanter wrote: > Do you have a URL for that document you mention? I get nothing when I search > on pgsql-general. > Thanks for the help. > Jodi > > ----- Original Message ----- > From: "Robert Treat" <xzilla@users.sourceforge.net> > To: "Jodi Kanter" <jkanter@virginia.edu> > Cc: "Postgres Admin List" <pgsql-admin@postgresql.org> > Sent: Thursday, November 14, 2002 4:44 PM > Subject: Re: [ADMIN] VACUUM > > > > Honestly this is very dependent on the type of schema you have and what > > your application does with it. If you have tables that get a lot of > > updates (or both inserts and deletes) you will want to run vacuum > > analyze on those tables more often. How often is debatable, but as a > > rule of thumb it good to run them vacuum analyze as least once per > > turnover of your table (though running it more wont hurt). As for vacuum > > full, with frequently enough "lazy" vacuuming you wont need to vacuum > > full very often. If you can afford to lock your tables, then I'd say do > > it once a night. If not, you need to decide when is a good time, but I > > think the docs recommend doing it at least once every 1 billion > > transactions, though more often is definitely better here too. As far > > as sequences go, search the archives for pgsql-general, tom lane just > > posted a great explanation of why sequences need to be recreated within > > the last day or two. Read it and if it applies work that into your crons > > accordingly. One final note is to research the free space map setting in > > the postgresql.conf. This helps tracks dead tuples in the database, so > > you'll want to make this large enough to store your tuple counts between > > vacuums. Hope this helps, > > > > Robert Treat > > > > On Thu, 2002-11-14 at 15:39, Jodi Kanter wrote: > > > I am trying to set up my cron job to run the appropriate vacuums on our > > > database. I understand from reading the documentation that regular > > > vacuums, vacuum full, and vacuum analyze are all important. Can someone > > > assist in telling me what is the most efficient order? I would assume > > > that analyze should be done once the inaccessible rows are released > > > within the table and then to the OS? > > > > > > We currently have a small amount of data in our database so my plan was > > > to do all vacuums on the entire database. > > > > > > Also, I read that reindexing was important. Does this need to be done as > > > frequently as the vacuums? Is it only done on a per table basis? or can > > > you reindex the entire database? > > > > > > Thanks > > > Jodi > > > > > > _______________________________ > > > Jodi L Kanter > > > BioInformatics Database Administrator > > > University of Virginia > > > (434) 924-2846 > > > jkanter@virginia.edu <mailto:jkanter@virginia.edu> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > ---------------------------(end of broadcast)--------------------------- > TIP 2: you can get off all lists at once with the unregister command > (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
On Thu, Nov 14, 2002 at 05:18:06PM -0500, Tom Lane wrote: > > The business about "you must vacuum at least every billion transactions" > is a different issue. Either plain or full vacuum will satisfy that > requirement. Note, also, that it is _every_ table needs to be vacuumed in that billion transaction period, but it is nevertheless probable that some of your tables will not need to be vacuumed as often as others. For efficiency's sake, therefore, you may want to vacuum different tables at different times. A -- ---- Andrew Sullivan 204-4141 Yonge Street Liberty RMS Toronto, Ontario Canada <andrew@libertyrms.info> M2P 2A8 +1 416 646 3304 x110