Thread: autovacuum: recommended?
hi, we are moving one database from postgresql-7.4 to postgresql-8.2.4. we have some cronjobs set up that vacuum the database (some tables more often, some tables less often). now, in pg82, there is the possibility of using the autovacuum. my question is: is it recommended to use it? or in other words, should i only use autovacuum? or it's better to use manual-vacuuming? which one is the "way of the future" :) ? or should i use both auto-vacuum and manual-vacuum? in other words, i'd like to find out, if we should simply stay with the vacuuming-cronjobs, or should we move to using auto-vacuum? and if we should move, should we try to set it up the way that no manual-vacuuming is used anymore? thanks, gabor
On Fri, Nov 16, 2007 at 10:40:43AM +0100, Gábor Farkas wrote: > we are moving one database from postgresql-7.4 to postgresql-8.2.4. any particular reason why not 8.2.5? > > my question is: is it recommended to use it? or in other words, should i > only use autovacuum? or it's better to use manual-vacuuming? which one > is the "way of the future" :) ? or should i use both auto-vacuum and > manual-vacuum? autovacuum is definitely prefered (for most of the cases). you might want to set vacuum delays though. depesz -- quicksil1er: "postgres is excellent, but like any DB it requires a highly paid DBA. here's my CV!" :) http://www.depesz.com/ - blog dla ciebie (i moje CV)
[Gábor Farkas - Fri at 10:40:43AM +0100] > my question is: is it recommended to use it? or in other words, should i > only use autovacuum? or it's better to use manual-vacuuming? which one > is the "way of the future" :) ? or should i use both auto-vacuum and > manual-vacuum? Nightly vacuums are great if the activity on the database is very low night time. A combination is also good, the autovacuum will benefit from the nightly vacuum. My gut feeling says it's a good idea to leave autovacuum on, regardless of whether the nightly vacuums have been turned on or not. That being said, we have some huge tables in our database and pretty much traffic, and got quite some performance problems when the autovacuum kicked in and started vacuuming those huge tables, so we're currently running without. Autovacuum can be tuned to not touch those tables, but we've chosen to leave it off.
On Fri, 2007-11-16 at 12:13 +0100, Tobias Brox wrote: > [snip] should i use both auto-vacuum and > > manual-vacuum? I would say for 8.2 that's the best strategy (which might change with 8.3 and it's multiple vacuum workers thingy). > That being said, we have some huge tables in our database and pretty > much traffic, and got quite some performance problems when the > autovacuum kicked in and started vacuuming those huge tables, so we're > currently running without. Autovacuum can be tuned to not touch those > tables, but we've chosen to leave it off. We are doing that here, i.e. set up autovacuum not to touch big tables, and cover those with nightly vacuums if there is still some activity on them, and one weekly complete vacuum of the whole DB ("vacuum" without other params, preferably as the postgres user to cover system tables too). In fact we also have a few very frequently updated small tables, those are also covered by very frequent crontab vacuums because in 8.2 autovacuum can spend quite some time vacuuming some medium sized tables and in that interval the small but frequently updated ones get bloated. This should be better with 8.3 and multiple autovacuum workers. For the "disable for autovacuum" part search for pg_autovacuum in the docs. I would say the autovacuum + disable autovacuum on big tables + nightly vacuum + weekly vacuumdb + frequent crontab vacuum of very updated small tables works well in 8.2. One thing which could be needed is to also schedule continuous vacuum of big tables which are frequently updated, with big delay settings to throttle the resources used by the vacuum. We don't need that here because we don't update frequently our big tables... Cheers, Csaba.
> That being said, we have some huge tables in our database and pretty > much traffic, and got quite some performance problems when the > autovacuum kicked in and started vacuuming those huge tables, so we're > currently running without. Autovacuum can be tuned to not touch those > tables, but we've chosen to leave it off. We had some performance problems with the autovacuum on large and frequently modified tables too - but after a little bit of playing with the parameters the overall performance is much better than it was before the autovacuuming. The table was quite huge (say 20k of products along with detailed descriptions etc.) and was completely updated and about 12x each day, i.e. it qrew to about 12x the original size (and 11/12 of the rows were dead). This caused a serious slowdown of the application each day, as the database had to scan 12x more data. We set up autovacuuming with the default parameters, but it interfered with the usual traffic - we had to play a little with the parameters (increase the delays, decrease the duration or something like that) and now it runs much better than before. No nightly vacuuming, no serious performance degradation during the day, etc. So yes - autovacuuming is recommended, but in some cases the default parameters have to be tuned a little bit. tomas
[tv@fuzzy.cz] > The table was quite huge (say 20k of products along with detailed > descriptions etc.) and was completely updated and about 12x each day, i.e. > it qrew to about 12x the original size (and 11/12 of the rows were dead). > This caused a serious slowdown of the application each day, as the > database had to scan 12x more data. The tables we had problems with are transaction-type tables with millions of rows and mostly inserts to the table ... and, eventually some few attributes being updated only on the most recent entries. I tried tuning a lot, but gave it up eventually. Vacuuming those tables took a long time (even if only a very small fraction of the table was touched) and the performance of the inserts to the table was reduced to a level that could not be accepted. By now we've just upgraded the hardware, so it could be worth playing with it again, but our project manager is both paranoid and conservative and proud of it, so I would have to prove that autovacuum is good for us before I'm allowed to turn it on again ;-)
hubert depesz lubaczewski wrote: > On Fri, Nov 16, 2007 at 10:40:43AM +0100, Gábor Farkas wrote: >> we are moving one database from postgresql-7.4 to postgresql-8.2.4. > > any particular reason why not 8.2.5? the distribution i use only has 8.2.4 currently. gabor
On Nov 18, 2007, at 1:26 PM, gabor wrote: > hubert depesz lubaczewski wrote: >> On Fri, Nov 16, 2007 at 10:40:43AM +0100, Gábor Farkas wrote: >>> we are moving one database from postgresql-7.4 to postgresql-8.2.4. >> any particular reason why not 8.2.5? > > the distribution i use only has 8.2.4 currently. Then I think you need to consider abandoning your distribution's packages or find a better distribution. IIRC, 8.2.5 is over 2-3 months old now; there's no reason a distribution shouldn't have it at this point. (Unless of course you haven't kept your distribution up- to-date... ;) -- Decibel!, aka Jim C. Nasby, Database Architect decibel@decibel.org Give your computer some brain candy! www.distributed.net Team #1828
Attachment
On Nov 16, 2007, at 7:38 AM, tv@fuzzy.cz wrote: > The table was quite huge (say 20k of products along with detailed > descriptions etc.) and was completely updated and about 12x each > day, i.e. > it qrew to about 12x the original size (and 11/12 of the rows were > dead). > This caused a serious slowdown of the application each day, as the > database had to scan 12x more data. FWIW, 20k rows isn't all that big, so I'm assuming that the descriptions make the table very wide. Unless those descriptions are what's being updated frequently, I suggest you put those in a separate table (vertical partitioning). That will make the main table much easier to vacuum, as well as reducing the impact of the high churn rate. -- Decibel!, aka Jim C. Nasby, Database Architect decibel@decibel.org Give your computer some brain candy! www.distributed.net Team #1828
Attachment
On Nov 16, 2007, at 5:56 AM, Csaba Nagy wrote: > We are doing that here, i.e. set up autovacuum not to touch big > tables, > and cover those with nightly vacuums if there is still some > activity on > them, and one weekly complete vacuum of the whole DB ("vacuum" without > other params, preferably as the postgres user to cover system tables > too). IIRC, since 8.2 autovacuum will take note of manual vacuums so as not to needlessly vacuum something that's been recently vacuumed manually. In other words, you shouldn't need to disable autovac for large tables if you vacuum them every night and their churn rate is low enough to not trigger autovacuum during the day. > In fact we also have a few very frequently updated small tables, those > are also covered by very frequent crontab vacuums because in 8.2 > autovacuum can spend quite some time vacuuming some medium sized > tables > and in that interval the small but frequently updated ones get > bloated. > This should be better with 8.3 and multiple autovacuum workers. +1. For tables that should always remain relatively small (ie: a web session table), I usually recommend setting up a manual vacuum that runs every 1-5 minutes. -- Decibel!, aka Jim C. Nasby, Database Architect decibel@decibel.org Give your computer some brain candy! www.distributed.net Team #1828
Attachment
> FWIW, 20k rows isn't all that big, so I'm assuming that the > descriptions make the table very wide. Unless those descriptions are > what's being updated frequently, I suggest you put those in a > separate table (vertical partitioning). That will make the main table > much easier to vacuum, as well as reducing the impact of the high > churn rate. Yes, you're right - the table is quite wide, as it's a catalogue of a pharmacy along with all the detailed descriptions and additional info etc. So I guess it's 50 MB of data or something like that. That may not seem bad, but as I already said the table grew to about 12x the size during the day (so about 500MB of data, 450MB being dead rows). This is the 'central' table of the system, and there are other quite heavily used databases as well. Add some really stupid queries on this table (for example LIKE searches on the table) and you easily end up with 100MB of permanent I/O during the day. The vertical partitioning would be overengineering in this case - we considered even that, but proper optimization of the update process (updating only those rows that really changed), along with a little bit of autovacuum tuning solved all the performance issues. Tomas
Decibel! wrote: > On Nov 18, 2007, at 1:26 PM, gabor wrote: >> hubert depesz lubaczewski wrote: >>> On Fri, Nov 16, 2007 at 10:40:43AM +0100, Gábor Farkas wrote: >>>> we are moving one database from postgresql-7.4 to postgresql-8.2.4. >>> any particular reason why not 8.2.5? >> >> the distribution i use only has 8.2.4 currently. > > Then I think you need to consider abandoning your distribution's > packages or find a better distribution. IIRC, 8.2.5 is over 2-3 months > old now; there's no reason a distribution shouldn't have it at this > point. (Unless of course you haven't kept your distribution > up-to-date... ;) Some people run distributions such as Red Hat Enterprise Linux 5 (their latest); I do. postgresql that comes with that. Now once they pick a version of a program, they seldom change it. They do put security and bug fixes in it by back-porting the changes into the source code and rebuilding it. I guess for postgresql the changes were too much for backporting, so they upgraded from postgresql-8.1.4-1.1 that came with it originally and are now up to postgresql-8.1.9-1.el5. I am pretty sure they will never upgrade RHEL5 to the 8.2 series because they do not do it to get new features. Now you may think there are better distributions than Red Hat Enterprise Linux 5, but enough people seem to think it good enough to pay for it and keep Red Hat in business. I doubt they are all foolish. Luckily I do not seem to be troubled by the problems experienced by the O.P. I do know that if I try to use .rpms from other sources, I can get in a lot of trouble with incompatible libraries. And I cannot upgrade the libraries without damaging other programs. -- .~. Jean-David Beyer Registered Linux User 85642. /V\ PGP-Key: 9A2FC99A Registered Machine 241939. /( )\ Shrewsbury, New Jersey http://counter.li.org ^^-^^ 08:20:01 up 27 days, 1:38, 1 user, load average: 5.15, 5.20, 5.01
In response to Jean-David Beyer <jeandavid8@verizon.net>: > Decibel! wrote: > > On Nov 18, 2007, at 1:26 PM, gabor wrote: > >> hubert depesz lubaczewski wrote: > >>> On Fri, Nov 16, 2007 at 10:40:43AM +0100, Gábor Farkas wrote: > >>>> we are moving one database from postgresql-7.4 to postgresql-8.2.4. > >>> any particular reason why not 8.2.5? > >> > >> the distribution i use only has 8.2.4 currently. > > > > Then I think you need to consider abandoning your distribution's > > packages or find a better distribution. IIRC, 8.2.5 is over 2-3 months > > old now; there's no reason a distribution shouldn't have it at this > > point. (Unless of course you haven't kept your distribution > > up-to-date... ;) > > Some people run distributions such as Red Hat Enterprise Linux 5 (their > latest); I do. postgresql that comes with that. > > Now once they pick a version of a program, they seldom change it. They do > put security and bug fixes in it by back-porting the changes into the source > code and rebuilding it. I guess for postgresql the changes were too much for > backporting, so they upgraded from postgresql-8.1.4-1.1 that came with it > originally and are now up to postgresql-8.1.9-1.el5. I am pretty sure they > will never upgrade RHEL5 to the 8.2 series because they do not do it to get > new features. > > Now you may think there are better distributions than Red Hat Enterprise > Linux 5, but enough people seem to think it good enough to pay for it and > keep Red Hat in business. I doubt they are all foolish. > > Luckily I do not seem to be troubled by the problems experienced by the O.P. > > I do know that if I try to use .rpms from other sources, I can get in a lot > of trouble with incompatible libraries. And I cannot upgrade the libraries > without damaging other programs. I think you've missed the point. The discussion is not that the distro is bad because it hasn't moved from 8.1 -> 8.2. The comment is that it's bad because it hasn't updated a major branch with the latest bug fixes. i.e. it hasn't moved from 8.1.4 to 8.1.5. If this is indeed the case, I agree that such a distro isn't worth using. -- Bill Moran Collaborative Fusion Inc. http://people.collaborativefusion.com/~wmoran/ wmoran@collaborativefusion.com Phone: 412-422-3463x4023
Decibel! <decibel@decibel.org> writes: > FWIW, 20k rows isn't all that big, so I'm assuming that the > descriptions make the table very wide. Unless those descriptions are > what's being updated frequently, I suggest you put those in a > separate table (vertical partitioning). That will make the main table > much easier to vacuum, as well as reducing the impact of the high > churn rate. Uh, you do realize that the TOAST mechanism does that pretty much automatically? regards, tom lane
Bill Moran wrote: > In response to Jean-David Beyer <jeandavid8@verizon.net>: > >> Decibel! wrote: >>> On Nov 18, 2007, at 1:26 PM, gabor wrote: >>>> hubert depesz lubaczewski wrote: >>>>> On Fri, Nov 16, 2007 at 10:40:43AM +0100, Gábor Farkas wrote: >>>>>> we are moving one database from postgresql-7.4 to postgresql-8.2.4. >>>>> any particular reason why not 8.2.5? >>>> the distribution i use only has 8.2.4 currently. >>> Then I think you need to consider abandoning your distribution's >>> packages or find a better distribution. IIRC, 8.2.5 is over 2-3 months >>> old now; there's no reason a distribution shouldn't have it at this >>> point. (Unless of course you haven't kept your distribution >>> up-to-date... ;) >> Some people run distributions such as Red Hat Enterprise Linux 5 (their >> latest); I do. postgresql that comes with that. >> >> Now once they pick a version of a program, they seldom change it. They do >> put security and bug fixes in it by back-porting the changes into the source >> code and rebuilding it. I guess for postgresql the changes were too much for >> backporting, so they upgraded from postgresql-8.1.4-1.1 that came with it >> originally and are now up to postgresql-8.1.9-1.el5. I am pretty sure they >> will never upgrade RHEL5 to the 8.2 series because they do not do it to get >> new features. >> >> Now you may think there are better distributions than Red Hat Enterprise >> Linux 5, but enough people seem to think it good enough to pay for it and >> keep Red Hat in business. I doubt they are all foolish. >> [snip] > > I think you've missed the point. I think you are right. > > The discussion is not that the distro is bad because it hasn't moved from > 8.1 -> 8.2. The comment is that it's bad because it hasn't updated a > major branch with the latest bug fixes. i.e. it hasn't moved from 8.1.4 > to 8.1.5. > > If this is indeed the case, I agree that such a distro isn't worth using. > ... and I can keep RHEL5 because they went from 8.1.4 to 8.1.9. ;-) -- .~. Jean-David Beyer Registered Linux User 85642. /V\ PGP-Key: 9A2FC99A Registered Machine 241939. /( )\ Shrewsbury, New Jersey http://counter.li.org ^^-^^ 10:40:01 up 27 days, 3:58, 2 users, load average: 4.43, 4.85, 5.17
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Mon, 19 Nov 2007 08:51:42 -0500 Bill Moran <wmoran@collaborativefusion.com> wrote: > > Luckily I do not seem to be troubled by the problems experienced by > > the O.P. > > > > I do know that if I try to use .rpms from other sources, I can get > > in a lot of trouble with incompatible libraries. And I cannot > > upgrade the libraries without damaging other programs. > > I think you've missed the point. > > The discussion is not that the distro is bad because it hasn't moved > from 8.1 -> 8.2. The comment is that it's bad because it hasn't > updated a major branch with the latest bug fixes. i.e. it hasn't > moved from 8.1.4 to 8.1.5. > > If this is indeed the case, I agree that such a distro isn't worth > using. I would note, and Tom would actually be a better person to expound on this that Red Hat has a tendency (at least they used to) to leave the minor number unchanged. E.g; 8.1.4 is shipped with RHEL5 They release a service update You now have 8.1.4-1.9 Or some such drivel. They do this because application vendors wet themselves in fear if they see a version change midcyle no matter how much you tell them it is just security and data fixes... /me who has dealt with 3 "enterprise" vendors on this exact issues in the last week. Sincerely, Joshua D. Drake - -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 24x7/Emergency: +1.800.492.2240 PostgreSQL solutions since 1997 http://www.commandprompt.com/ UNIQUE NOT NULL Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate PostgreSQL Replication: http://www.commandprompt.com/products/ -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) iD8DBQFHQcWGATb/zqfZUUQRAtYmAJ9QKuH/mou87XCwiBoDPiw+03ST7QCfRMlb n7+IVftfOrPBd2+CKA6B1N4= =MMKO -----END PGP SIGNATURE-----
On Mon, 19 Nov 2007, Jean-David Beyer wrote: > I am pretty sure they will never upgrade RHEL5 to the 8.2 series because > they do not do it to get new features. That's correct. > I do know that if I try to use .rpms from other sources, I can get in a > lot of trouble with incompatible libraries. And I cannot upgrade the > libraries without damaging other programs. You're also right that this is tricky. I've written a guide that goes over the main issues involved at http://www.westnet.com/~gsmith/content/postgresql/pgrpm.htm if you ever wanted to explore this as an option. -- * Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
On Nov 19, 2007, at 9:23 AM, Tom Lane wrote: > Decibel! <decibel@decibel.org> writes: >> FWIW, 20k rows isn't all that big, so I'm assuming that the >> descriptions make the table very wide. Unless those descriptions are >> what's being updated frequently, I suggest you put those in a >> separate table (vertical partitioning). That will make the main table >> much easier to vacuum, as well as reducing the impact of the high >> churn rate. > > Uh, you do realize that the TOAST mechanism does that pretty much > automatically? Only if the row exceeds 2k, which for a lot of applications is huge. This is exactly why I wish toast limits were configurable on a per- table basis (I know there were changes here for 8.3, but IIRC it was only for toast chunk size). -- Decibel!, aka Jim C. Nasby, Database Architect decibel@decibel.org Give your computer some brain candy! www.distributed.net Team #1828