Thread: autovacuum: recommended?

autovacuum: recommended?

From
Gábor Farkas
Date:
hi,

we are moving one database from postgresql-7.4 to postgresql-8.2.4.

we have some cronjobs set up that vacuum the database (some tables more
often, some tables less often). now, in pg82, there is the possibility
of using the autovacuum.

my question is: is it recommended to use it? or in other words, should i
only use autovacuum? or it's better to use manual-vacuuming? which one
is the "way of the future" :) ? or should i use both auto-vacuum and
manual-vacuum?

in other words, i'd like to find out, if we should simply stay with the
vacuuming-cronjobs, or should we move to using auto-vacuum? and if we
should move, should we try to set it up the way that no manual-vacuuming
is used anymore?

thanks,
gabor

Re: autovacuum: recommended?

From
hubert depesz lubaczewski
Date:
On Fri, Nov 16, 2007 at 10:40:43AM +0100, Gábor Farkas wrote:
> we are moving one database from postgresql-7.4 to postgresql-8.2.4.

any particular reason why not 8.2.5?
>
> my question is: is it recommended to use it? or in other words, should i
> only use autovacuum? or it's better to use manual-vacuuming? which one
> is the "way of the future" :) ? or should i use both auto-vacuum and
> manual-vacuum?

autovacuum is definitely prefered (for most of the cases).

you might want to set vacuum delays though.

depesz

--
quicksil1er: "postgres is excellent, but like any DB it requires a
highly paid DBA.  here's my CV!" :)
http://www.depesz.com/ - blog dla ciebie (i moje CV)

Re: autovacuum: recommended?

From
Tobias Brox
Date:
[Gábor Farkas - Fri at 10:40:43AM +0100]
> my question is: is it recommended to use it? or in other words, should i
> only use autovacuum? or it's better to use manual-vacuuming? which one
> is the "way of the future" :) ? or should i use both auto-vacuum and
> manual-vacuum?

Nightly vacuums are great if the activity on the database is very low
night time.  A combination is also good, the autovacuum will benefit
from the nightly vacuum.  My gut feeling says it's a good idea to leave
autovacuum on, regardless of whether the nightly vacuums have been
turned on or not.

That being said, we have some huge tables in our database and pretty
much traffic, and got quite some performance problems when the
autovacuum kicked in and started vacuuming those huge tables, so we're
currently running without.  Autovacuum can be tuned to not touch those
tables, but we've chosen to leave it off.


Re: autovacuum: recommended?

From
Csaba Nagy
Date:
On Fri, 2007-11-16 at 12:13 +0100, Tobias Brox wrote:
> [snip] should i use both auto-vacuum and
> > manual-vacuum?

I would say for 8.2 that's the best strategy (which might change with
8.3 and it's multiple vacuum workers thingy).

> That being said, we have some huge tables in our database and pretty
> much traffic, and got quite some performance problems when the
> autovacuum kicked in and started vacuuming those huge tables, so we're
> currently running without.  Autovacuum can be tuned to not touch those
> tables, but we've chosen to leave it off.

We are doing that here, i.e. set up autovacuum not to touch big tables,
and cover those with nightly vacuums if there is still some activity on
them, and one weekly complete vacuum of the whole DB ("vacuum" without
other params, preferably as the postgres user to cover system tables
too).

In fact we also have a few very frequently updated small tables, those
are also covered by very frequent crontab vacuums because in 8.2
autovacuum can spend quite some time vacuuming some medium sized tables
and in that interval the small but frequently updated ones get bloated.
This should be better with 8.3 and multiple autovacuum workers.

For the "disable for autovacuum" part search for pg_autovacuum in the
docs.

I would say the autovacuum + disable autovacuum on big tables + nightly
vacuum + weekly vacuumdb + frequent crontab vacuum of very updated small
tables works well in 8.2. One thing which could be needed is to also
schedule continuous vacuum of big tables which are frequently updated,
with big delay settings to throttle the resources used by the vacuum. We
don't need that here because we don't update frequently our big
tables...

Cheers,
Csaba.



Re: autovacuum: recommended?

From
tv@fuzzy.cz
Date:
> That being said, we have some huge tables in our database and pretty
> much traffic, and got quite some performance problems when the
> autovacuum kicked in and started vacuuming those huge tables, so we're
> currently running without.  Autovacuum can be tuned to not touch those
> tables, but we've chosen to leave it off.

We had some performance problems with the autovacuum on large and
frequently modified tables too - but after a little bit of playing with
the parameters the overall performance is much better than it was before
the autovacuuming.

The table was quite huge (say 20k of products along with detailed
descriptions etc.) and was completely updated and about 12x each day, i.e.
it qrew to about 12x the original size (and 11/12 of the rows were dead).
This caused a serious slowdown of the application each day, as the
database had to scan 12x more data.

We set up autovacuuming with the default parameters, but it interfered
with the usual traffic - we had to play a little with the parameters
(increase the delays, decrease the duration or something like that) and
now it runs much better than before. No nightly vacuuming, no serious
performance degradation during the day, etc.

So yes - autovacuuming is recommended, but in some cases the default
parameters have to be tuned a little bit.

tomas


Re: autovacuum: recommended?

From
Tobias Brox
Date:
[tv@fuzzy.cz]
> The table was quite huge (say 20k of products along with detailed
> descriptions etc.) and was completely updated and about 12x each day, i.e.
> it qrew to about 12x the original size (and 11/12 of the rows were dead).
> This caused a serious slowdown of the application each day, as the
> database had to scan 12x more data.

The tables we had problems with are transaction-type tables with
millions of rows and mostly inserts to the table ... and, eventually
some few attributes being updated only on the most recent entries.  I
tried tuning a lot, but gave it up eventually.  Vacuuming those tables
took a long time (even if only a very small fraction of the table was
touched) and the performance of the inserts to the table was reduced to
a level that could not be accepted.

By now we've just upgraded the hardware, so it could be worth playing
with it again, but our project manager is both paranoid and conservative
and proud of it, so I would have to prove that autovacuum is good for us
before I'm allowed to turn it on again ;-)


Re: autovacuum: recommended?

From
gabor
Date:
hubert depesz lubaczewski wrote:
> On Fri, Nov 16, 2007 at 10:40:43AM +0100, Gábor Farkas wrote:
>> we are moving one database from postgresql-7.4 to postgresql-8.2.4.
>
> any particular reason why not 8.2.5?

the distribution i use only has 8.2.4 currently.

gabor

Re: autovacuum: recommended?

From
Decibel!
Date:
On Nov 18, 2007, at 1:26 PM, gabor wrote:
> hubert depesz lubaczewski wrote:
>> On Fri, Nov 16, 2007 at 10:40:43AM +0100, Gábor Farkas wrote:
>>> we are moving one database from postgresql-7.4 to postgresql-8.2.4.
>> any particular reason why not 8.2.5?
>
> the distribution i use only has 8.2.4 currently.

Then I think you need to consider abandoning your distribution's
packages or find a better distribution. IIRC, 8.2.5 is over 2-3
months old now; there's no reason a distribution shouldn't have it at
this point. (Unless of course you haven't kept your distribution up-
to-date... ;)
--
Decibel!, aka Jim C. Nasby, Database Architect  decibel@decibel.org
Give your computer some brain candy! www.distributed.net Team #1828



Attachment

Re: autovacuum: recommended?

From
Decibel!
Date:
On Nov 16, 2007, at 7:38 AM, tv@fuzzy.cz wrote:
> The table was quite huge (say 20k of products along with detailed
> descriptions etc.) and was completely updated and about 12x each
> day, i.e.
> it qrew to about 12x the original size (and 11/12 of the rows were
> dead).
> This caused a serious slowdown of the application each day, as the
> database had to scan 12x more data.

FWIW, 20k rows isn't all that big, so I'm assuming that the
descriptions make the table very wide. Unless those descriptions are
what's being updated frequently, I suggest you put those in a
separate table (vertical partitioning). That will make the main table
much easier to vacuum, as well as reducing the impact of the high
churn rate.
--
Decibel!, aka Jim C. Nasby, Database Architect  decibel@decibel.org
Give your computer some brain candy! www.distributed.net Team #1828



Attachment

Re: autovacuum: recommended?

From
Decibel!
Date:
On Nov 16, 2007, at 5:56 AM, Csaba Nagy wrote:
> We are doing that here, i.e. set up autovacuum not to touch big
> tables,
> and cover those with nightly vacuums if there is still some
> activity on
> them, and one weekly complete vacuum of the whole DB ("vacuum" without
> other params, preferably as the postgres user to cover system tables
> too).

IIRC, since 8.2 autovacuum will take note of manual vacuums so as not
to needlessly vacuum something that's been recently vacuumed
manually. In other words, you shouldn't need to disable autovac for
large tables if you vacuum them every night and their churn rate is
low enough to not trigger autovacuum during the day.

> In fact we also have a few very frequently updated small tables, those
> are also covered by very frequent crontab vacuums because in 8.2
> autovacuum can spend quite some time vacuuming some medium sized
> tables
> and in that interval the small but frequently updated ones get
> bloated.
> This should be better with 8.3 and multiple autovacuum workers.

+1. For tables that should always remain relatively small (ie: a web
session table), I usually recommend setting up a manual vacuum that
runs every 1-5 minutes.
--
Decibel!, aka Jim C. Nasby, Database Architect  decibel@decibel.org
Give your computer some brain candy! www.distributed.net Team #1828



Attachment

Re: autovacuum: recommended?

From
tv@fuzzy.cz
Date:
> FWIW, 20k rows isn't all that big, so I'm assuming that the
> descriptions make the table very wide. Unless those descriptions are
> what's being updated frequently, I suggest you put those in a
> separate table (vertical partitioning). That will make the main table
> much easier to vacuum, as well as reducing the impact of the high
> churn rate.

Yes, you're right - the table is quite wide, as it's a catalogue of a
pharmacy along with all the detailed descriptions and additional info etc.
So I guess it's 50 MB of data or something like that. That may not seem
bad, but as I already said the table grew to about 12x the size during the
day (so about 500MB of data, 450MB being dead rows). This is the 'central'
table of the system, and there are other quite heavily used databases as
well. Add some really stupid queries on this table (for example LIKE
searches on the table) and you easily end up with 100MB of permanent I/O
during the day.

The vertical partitioning would be overengineering in this case - we
considered even that, but proper optimization of the update process
(updating only those rows that really changed), along with a little bit of
autovacuum tuning solved all the performance issues.

Tomas


Re: autovacuum: recommended?

From
Jean-David Beyer
Date:
Decibel! wrote:
> On Nov 18, 2007, at 1:26 PM, gabor wrote:
>> hubert depesz lubaczewski wrote:
>>> On Fri, Nov 16, 2007 at 10:40:43AM +0100, Gábor Farkas wrote:
>>>> we are moving one database from postgresql-7.4 to postgresql-8.2.4.
>>> any particular reason why not 8.2.5?
>>
>> the distribution i use only has 8.2.4 currently.
>
> Then I think you need to consider abandoning your distribution's
> packages or find a better distribution. IIRC, 8.2.5 is over 2-3 months
> old now; there's no reason a distribution shouldn't have it at this
> point. (Unless of course you haven't kept your distribution
> up-to-date... ;)

Some people run distributions such as Red Hat Enterprise Linux 5 (their
latest); I do. postgresql that comes with that.

Now once they pick a version of a program, they seldom change it. They do
put security and bug fixes in it by back-porting the changes into the source
code and rebuilding it. I guess for postgresql the changes were too much for
backporting, so they upgraded from postgresql-8.1.4-1.1 that came with it
originally and are now up to postgresql-8.1.9-1.el5. I am pretty sure they
will never upgrade RHEL5 to the 8.2 series because they do not do it to get
new features.

Now you may think there are better distributions than Red Hat Enterprise
Linux 5, but enough people seem to think it good enough to pay for it and
keep Red Hat in business. I doubt they are all foolish.

Luckily I do not seem to be troubled by the problems experienced by the O.P.

I do know that if I try to use .rpms from other sources, I can get in a lot
of trouble with incompatible libraries. And I cannot upgrade the libraries
without damaging other programs.

--
  .~.  Jean-David Beyer          Registered Linux User 85642.
  /V\  PGP-Key: 9A2FC99A         Registered Machine   241939.
 /( )\ Shrewsbury, New Jersey    http://counter.li.org
 ^^-^^ 08:20:01 up 27 days, 1:38, 1 user, load average: 5.15, 5.20, 5.01

Re: autovacuum: recommended?

From
Bill Moran
Date:
In response to Jean-David Beyer <jeandavid8@verizon.net>:

> Decibel! wrote:
> > On Nov 18, 2007, at 1:26 PM, gabor wrote:
> >> hubert depesz lubaczewski wrote:
> >>> On Fri, Nov 16, 2007 at 10:40:43AM +0100, Gábor Farkas wrote:
> >>>> we are moving one database from postgresql-7.4 to postgresql-8.2.4.
> >>> any particular reason why not 8.2.5?
> >>
> >> the distribution i use only has 8.2.4 currently.
> >
> > Then I think you need to consider abandoning your distribution's
> > packages or find a better distribution. IIRC, 8.2.5 is over 2-3 months
> > old now; there's no reason a distribution shouldn't have it at this
> > point. (Unless of course you haven't kept your distribution
> > up-to-date... ;)
>
> Some people run distributions such as Red Hat Enterprise Linux 5 (their
> latest); I do. postgresql that comes with that.
>
> Now once they pick a version of a program, they seldom change it. They do
> put security and bug fixes in it by back-porting the changes into the source
> code and rebuilding it. I guess for postgresql the changes were too much for
> backporting, so they upgraded from postgresql-8.1.4-1.1 that came with it
> originally and are now up to postgresql-8.1.9-1.el5. I am pretty sure they
> will never upgrade RHEL5 to the 8.2 series because they do not do it to get
> new features.
>
> Now you may think there are better distributions than Red Hat Enterprise
> Linux 5, but enough people seem to think it good enough to pay for it and
> keep Red Hat in business. I doubt they are all foolish.
>
> Luckily I do not seem to be troubled by the problems experienced by the O.P.
>
> I do know that if I try to use .rpms from other sources, I can get in a lot
> of trouble with incompatible libraries. And I cannot upgrade the libraries
> without damaging other programs.

I think you've missed the point.

The discussion is not that the distro is bad because it hasn't moved from
8.1 -> 8.2.  The comment is that it's bad because it hasn't updated a
major branch with the latest bug fixes.  i.e. it hasn't moved from 8.1.4
to 8.1.5.

If this is indeed the case, I agree that such a distro isn't worth using.

--
Bill Moran
Collaborative Fusion Inc.
http://people.collaborativefusion.com/~wmoran/

wmoran@collaborativefusion.com
Phone: 412-422-3463x4023

Re: autovacuum: recommended?

From
Tom Lane
Date:
Decibel! <decibel@decibel.org> writes:
> FWIW, 20k rows isn't all that big, so I'm assuming that the
> descriptions make the table very wide. Unless those descriptions are
> what's being updated frequently, I suggest you put those in a
> separate table (vertical partitioning). That will make the main table
> much easier to vacuum, as well as reducing the impact of the high
> churn rate.

Uh, you do realize that the TOAST mechanism does that pretty much
automatically?

            regards, tom lane

Re: autovacuum: recommended?

From
Jean-David Beyer
Date:
Bill Moran wrote:
> In response to Jean-David Beyer <jeandavid8@verizon.net>:
>
>> Decibel! wrote:
>>> On Nov 18, 2007, at 1:26 PM, gabor wrote:
>>>> hubert depesz lubaczewski wrote:
>>>>> On Fri, Nov 16, 2007 at 10:40:43AM +0100, Gábor Farkas wrote:
>>>>>> we are moving one database from postgresql-7.4 to postgresql-8.2.4.
>>>>> any particular reason why not 8.2.5?
>>>> the distribution i use only has 8.2.4 currently.
>>> Then I think you need to consider abandoning your distribution's
>>> packages or find a better distribution. IIRC, 8.2.5 is over 2-3 months
>>> old now; there's no reason a distribution shouldn't have it at this
>>> point. (Unless of course you haven't kept your distribution
>>> up-to-date... ;)
>> Some people run distributions such as Red Hat Enterprise Linux 5 (their
>> latest); I do. postgresql that comes with that.
>>
>> Now once they pick a version of a program, they seldom change it. They do
>> put security and bug fixes in it by back-porting the changes into the source
>> code and rebuilding it. I guess for postgresql the changes were too much for
>> backporting, so they upgraded from postgresql-8.1.4-1.1 that came with it
>> originally and are now up to postgresql-8.1.9-1.el5. I am pretty sure they
>> will never upgrade RHEL5 to the 8.2 series because they do not do it to get
>> new features.
>>
>> Now you may think there are better distributions than Red Hat Enterprise
>> Linux 5, but enough people seem to think it good enough to pay for it and
>> keep Red Hat in business. I doubt they are all foolish.
>>
[snip]
>
> I think you've missed the point.

I think you are right.
>
> The discussion is not that the distro is bad because it hasn't moved from
> 8.1 -> 8.2.  The comment is that it's bad because it hasn't updated a
> major branch with the latest bug fixes.  i.e. it hasn't moved from 8.1.4
> to 8.1.5.
>
> If this is indeed the case, I agree that such a distro isn't worth using.
>
... and I can keep RHEL5 because they went from 8.1.4 to 8.1.9. ;-)

--
  .~.  Jean-David Beyer          Registered Linux User 85642.
  /V\  PGP-Key: 9A2FC99A         Registered Machine   241939.
 /( )\ Shrewsbury, New Jersey    http://counter.li.org
 ^^-^^ 10:40:01 up 27 days, 3:58, 2 users, load average: 4.43, 4.85, 5.17

Re: autovacuum: recommended?

From
"Joshua D. Drake"
Date:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Mon, 19 Nov 2007 08:51:42 -0500
Bill Moran <wmoran@collaborativefusion.com> wrote:

> > Luckily I do not seem to be troubled by the problems experienced by
> > the O.P.
> > 
> > I do know that if I try to use .rpms from other sources, I can get
> > in a lot of trouble with incompatible libraries. And I cannot
> > upgrade the libraries without damaging other programs.
> 
> I think you've missed the point.
> 
> The discussion is not that the distro is bad because it hasn't moved
> from 8.1 -> 8.2.  The comment is that it's bad because it hasn't
> updated a major branch with the latest bug fixes.  i.e. it hasn't
> moved from 8.1.4 to 8.1.5.
> 
> If this is indeed the case, I agree that such a distro isn't worth
> using.

I would note, and Tom would actually be a better person to expound on
this that Red Hat has a tendency (at least they used to) to leave the
minor number unchanged. E.g;

8.1.4 is shipped with RHEL5 
They release a service update
You now have 8.1.4-1.9

Or some such drivel. They do this because application vendors wet
themselves in fear if they see a version change midcyle no matter how
much you tell them it is just security and data fixes...

/me who has dealt with 3 "enterprise" vendors on this exact issues in
the last week.

Sincerely,

Joshua D. Drake 


- -- 

      === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564   24x7/Emergency: +1.800.492.2240
PostgreSQL solutions since 1997  http://www.commandprompt.com/
            UNIQUE NOT NULL
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFHQcWGATb/zqfZUUQRAtYmAJ9QKuH/mou87XCwiBoDPiw+03ST7QCfRMlb
n7+IVftfOrPBd2+CKA6B1N4=
=MMKO
-----END PGP SIGNATURE-----

Re: autovacuum: recommended?

From
Greg Smith
Date:
On Mon, 19 Nov 2007, Jean-David Beyer wrote:

> I am pretty sure they will never upgrade RHEL5 to the 8.2 series because
> they do not do it to get new features.

That's correct.

> I do know that if I try to use .rpms from other sources, I can get in a
> lot of trouble with incompatible libraries. And I cannot upgrade the
> libraries without damaging other programs.

You're also right that this is tricky.  I've written a guide that goes
over the main issues involved at
http://www.westnet.com/~gsmith/content/postgresql/pgrpm.htm if you ever
wanted to explore this as an option.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

Re: autovacuum: recommended?

From
Decibel!
Date:
On Nov 19, 2007, at 9:23 AM, Tom Lane wrote:
> Decibel! <decibel@decibel.org> writes:
>> FWIW, 20k rows isn't all that big, so I'm assuming that the
>> descriptions make the table very wide. Unless those descriptions are
>> what's being updated frequently, I suggest you put those in a
>> separate table (vertical partitioning). That will make the main table
>> much easier to vacuum, as well as reducing the impact of the high
>> churn rate.
>
> Uh, you do realize that the TOAST mechanism does that pretty much
> automatically?


Only if the row exceeds 2k, which for a lot of applications is huge.
This is exactly why I wish toast limits were configurable on a per-
table basis (I know there were changes here for 8.3, but IIRC it was
only for toast chunk size).
--
Decibel!, aka Jim C. Nasby, Database Architect  decibel@decibel.org
Give your computer some brain candy! www.distributed.net Team #1828



Attachment