Thread: autovacuum strategy / parameters

autovacuum strategy / parameters

From
Rick
Date:
I have a DB with small and large tables that can go up to 15G.
For performance benefits, it appears that analyze has much less cost
than vacuum, but the same benefits?
I can’t find any clear recommendations for frequencies and am
considering these parameters:

Autovacuum_vacuum_threshold = 50000
Autovacuum_analyze_threshold = 10000
Autovacuum_vacuum_scale_factor = 0.01
Autovacuum_analyze_scale_factor = 0.005

It appears this will result in table analyzes occurring at around 10,000
to 85,000 dead tuples and vacuums occurring at around 50,000 to 200,000,
depending on the table sizes.
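
(Working backwards, those ranges seem to assume a largest table of roughly
15 million rows: 10,000 + 0.005 * 15,000,000 = 85,000 for analyze, and
50,000 + 0.01 * 15,000,000 = 200,000 for vacuum.)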

Can anyone comment on whether this is the right strategy and targets
to use?

Re: autovacuum strategy / parameters

From
Robert Haas
Date:
On Wed, Apr 21, 2010 at 11:06 AM, Rick <richard.branton@ca.com> wrote:
> I have a DB with small and large tables that can go up to 15G.
> For performance benefits, it appears that analyze has much less cost
> than vacuum, but the same benefits?

Err, no.  ANALYZE gathers statistics for the query planner; VACUUM
clears out old, dead tuples so that space can be reused by the
database system.

> I can’t find any clear recommendations for frequencies and am
> considering these parameters:
>
> Autovacuum_vacuum_threshold = 50000
> Autovacuum_analyze_threshold = 10000
> Autovacuum_vacuum_scale_factor = 0.01
> Autovacuum_analyze_scale_factor = 0.005
>
> It appears this will result in table analyzes occurring at around 10,000
> to 85,000 dead tuples and vacuums occurring at around 50,000 to 200,000,
> depending on the table sizes.
>
> Can anyone comment on whether this is the right strategy and targets
> to use?

I'm not that familiar with tuning these parameters but increasing the
default thresholds by a thousand-fold doesn't seem like a good idea.
Small tables will never get vacuumed or analyzed at all.

...Robert

Re: autovacuum strategy / parameters

From
Rick
Date:
On Apr 22, 2:55 pm, robertmh...@gmail.com (Robert Haas) wrote:
> On Wed, Apr 21, 2010 at 11:06 AM, Rick <richard.bran...@ca.com> wrote:
> > I have a DB with small and large tables that can go up to 15G.
> > For performance benefits, it appears that analyze has much less cost
> > than vacuum, but the same benefits?
>
> Err, no.  ANALYZE gathers statistics for the query planner; VACUUM
> clears out old, dead tuples so that space can be reused by the
> database system.
>
> > I can’t find any clear recommendations for frequencies and am
> > considering these parameters:
>
> > Autovacuum_vacuum_threshold = 50000
> > Autovacuum_analyze_threshold = 10000
> > Autovacuum_vacuum_scale_factor = 0.01
> > Autovacuum_analyze_scale_factor = 0.005
>
> > It appears this will result in table analyzes occurring at around 10,000
> > to 85,000 dead tuples and vacuums occurring at around 50,000 to 200,000,
> > depending on the table sizes.
>
> > Can anyone comment on whether this is the right strategy and targets
> > to use?
>
> I'm not that familiar with tuning these parameters but increasing the
> default thresholds by a thousand-fold doesn't seem like a good idea.
> Small tables will never get vacuumed or analyzed at all.
>
> ...Robert
>

The problem is with the autovacuum formula:

In a loop, autovacuum checks to see if number of dead tuples >
((number of live tuples * autovacuum_vacuum_scale_factor) +
autovacuum_vacuum_threshold), and if
so, it runs VACUUM. If not, it sleeps. It works the same way for
ANALYZE.

So, in a large table, the scale_factor is the dominant term. In a
small
table, the threshold is the dominant term. But both are taken into
account.
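
For example, a query along these lines (just a sketch against the global
settings; it ignores any per-table overrides) shows how close each table is
to its vacuum trigger point:

  SELECT relname,
         n_live_tup,
         n_dead_tup,
         round(current_setting('autovacuum_vacuum_threshold')::numeric
               + current_setting('autovacuum_vacuum_scale_factor')::numeric
                 * n_live_tup) AS vacuum_trigger_point
    FROM pg_stat_user_tables
   ORDER BY n_dead_tup DESC;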

The default values are set for small tables; it is not being run for
large tables.
The question boils down to exactly what is the max number of dead
tuples that should be allowed to accumulate before running analyze?
Since vacuum just recovers space, that doesn't seem to be nearly as
critical for performance?

-Rick


Re: autovacuum strategy / parameters

From
Alvaro Herrera
Date:
Rick wrote:

> So, in a large table, the scale_factor is the dominant term. In a
> small
> table, the threshold is the dominant term. But both are taken into
> account.

Correct.

> The default values are set for small tables; it is not being run for
> large tables.

So decrease the scale factor and leave threshold alone.
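
For example (illustrative numbers only, not a recommendation from this
thread), something like this keeps small tables covered by the threshold
while making the big tables eligible for vacuum much sooner:

  autovacuum_vacuum_threshold = 50        # keep the default
  autovacuum_vacuum_scale_factor = 0.05   # instead of the default 0.2
  autovacuum_analyze_threshold = 50       # keep the default
  autovacuum_analyze_scale_factor = 0.02  # instead of the default 0.1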

--
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

Re: autovacuum strategy / parameters

From
Robert Haas
Date:
On Thu, Apr 22, 2010 at 4:42 PM, Rick <richard.branton@ca.com> wrote:
> On Apr 22, 2:55 pm, robertmh...@gmail.com (Robert Haas) wrote:
>> On Wed, Apr 21, 2010 at 11:06 AM, Rick <richard.bran...@ca.com> wrote:
>> > I have a DB with small and large tables that can go up to 15G.
>> > For performance benefits, it appears that analyze has much less cost
>> > than vacuum, but the same benefits?
>>
>> Err, no.  ANALYZE gathers statistics for the query planner; VACUUM
>> clears out old, dead tuples so that space can be reused by the
>> database system.
>>
>> > I can’t find any clear recommendations for frequencies and am
>> > considering these parameters:
>>
>> > Autovacuum_vacuum_threshold = 50000
>> > Autovacuum_analyze_threshold = 10000
>> > Autovacuum_vacuum_scale_factor = 0.01
>> > Autovacuum_analyze_scale_factor = 0.005
>>
>> > It appears this will result in table analyzes occurring at around 10,000
>> > to 85,000 dead tuples and vacuums occurring at around 50,000 to 200,000,
>> > depending on the table sizes.
>>
>> > Can anyone comment on whether this is the right strategy and targets
>> > to use?
>>
>> I'm not that familiar with tuning these parameters but increasing the
>> default thresholds by a thousand-fold doesn't seem like a good idea.
>> Small tables will never get vacuumed or analyzed at all.
>>
>> ...Robert
>>
>
> The problem is with the autovacuum formula:
>
> In a loop, autovacuum checks to see if number of dead tuples >
> ((number of live tuples * autovacuum_vacuum_scale_factor) +
> autovacuum_vacuum_threshold), and if
> so, it runs VACUUM. If not, it sleeps. It works the same way for
> ANALYZE.
>
> So, in a large table, the scale_factor is the dominant term. In a
> small
> table, the threshold is the dominant term. But both are taken into
> account.
>
> The default values are set for small tables; it is not being run for
> large tables.
> The question boils down to exactly what is the max number of dead
> tuples that should be allowed to accumulate before running analyze?
> Since vacuum just recovers space, that doesn't seem to be nearly as
> critical for performance?

That doesn't really match my experience.  Without regular vacuuming,
tables and indices end up being larger than they ought to be and
contain large amounts of dead space that slows things down.  How much
of an impact that ends up having depends on how badly bloated they are
and what you're trying to do, but it can get very ugly.

My guess is that the reason we run ANALYZE more frequently than vacuum
(with the default settings) is that ANALYZE is pretty cheap.  In many
cases, if the statistical distribution of the data hasn't changed
much, then it's not really necessary, but it doesn't cost much either.
 And for certain types of usage patterns, like time series (where the
maximum value keeps increasing) it's REALLY important to analyze
frequently.

But having said that, on the systems I've worked with, I've only
rarely seen a problem caused by not analyzing frequently enough.  On
the other hand, I've seen MANY problems caused by not vacuuming
enough.  Someone runs a couple of big queries that rewrite a large
portion of a table several times over and, boom, problems.  8.3 and
higher are better about this because of an optimization called HOT,
but there can still be problems.

Other people's experiences may not match mine, but the bottom line is
that you need to do both of these things, and you need to make sure
they happen regularly.  In most cases, the CPU and I/O time they
consume will be amply repaid in improved query performance.

...Robert

Re: autovacuum strategy / parameters

From
"Kevin Grittner"
Date:
Robert Haas <robertmhaas@gmail.com> wrote:
> Rick <richard.branton@ca.com> wrote:

>> Since vacuum just recovers space, that doesn't seem to be nearly
>> as critical for performance?
>
> That doesn't really match my experience.  Without regular
> vacuuming, tables and indices end up being larger than they ought
> to be and contain large amounts of dead space that slows things
> down.  How much of an impact that ends up having depends on how
> badly bloated they are and what you're trying to do, but it can
> get very ugly.

That has been my experience, too.  When we first started using
PostgreSQL, we noticed a performance hit when some small tables
which were updated very frequently were vacuumed.  Our knee-jerk
reaction was to tune autovacuum to be less aggressive, so that we
didn't get hit with the pain as often.  Of course, things just got
worse, because every access to that table, when vacuum hadn't been
run recently, had to examine all versions of the desired row, and
test visibility for each version, to find the current one.  So
performance fell off even worse.  So we went to much more aggressive
settings for autovacuum (although only slightly more aggressive than
what has since become the default) and the problems disappeared.

Basically, as long as small tables are not allowed to bloat,
vacuuming them is so fast that you never notice it.

> 8.3 and higher are better about this because of an optimization
> called HOT, but there can still be problems.

Agreed on both counts.

-Kevin

Re: autovacuum strategy / parameters

From
Thomas Kellerer
Date:
Rick, 22.04.2010 22:42:
>
> So, in a large table, the scale_factor is the dominant term. In a
> small table, the threshold is the dominant term. But both are taken into
> account.
>
> The default values are set for small tables; it is not being run for
> large tables.

With 8.4 you can adjust the autovacuum settings per table...
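
For example (the table name and numbers here are just placeholders):

  ALTER TABLE some_big_table SET (
      autovacuum_vacuum_scale_factor = 0.05,
      autovacuum_vacuum_threshold = 50,
      autovacuum_analyze_scale_factor = 0.02,
      autovacuum_analyze_threshold = 50
  );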



Re: autovacuum strategy / parameters

From
akp geek
Date:
Hi -
           I don't want to sidetrack the discussion. We have 8.4; which of the AUTOVACUUM PARAMETERS can be set for an individual table?  I ran into bloat with a small table only. Now the issue is being resolved.
 
Regards
On Wed, Apr 28, 2010 at 10:20 AM, Thomas Kellerer <spam_eater@gmx.net> wrote:
> Rick, 22.04.2010 22:42:
>>
>> So, in a large table, the scale_factor is the dominant term. In a
>> small table, the threshold is the dominant term. But both are taken into
>> account.
>>
>> The default values are set for small tables; it is not being run for
>> large tables.
>
> With 8.4 you can adjust the autovacuum settings per table...






Re: autovacuum strategy / parameters

From
Kenneth Marshall
Date:
Check out the manual:

http://www.postgresql.org/docs/8.4/static/routine-vacuuming.html#AUTOVACUUM

Cheers,
Ken

On Wed, Apr 28, 2010 at 10:37:35AM -0400, akp geek wrote:
> Hi -
>            I don't want to sidetrack the discussion. We have 8.4; which of
> the AUTOVACUUM PARAMETERS can be set for an individual table?  I ran into
> bloat with a small table only. Now the issue is being resolved.
>
> Regards
> On Wed, Apr 28, 2010 at 10:20 AM, Thomas Kellerer <spam_eater@gmx.net>wrote:
>
> > Rick, 22.04.2010 22:42:
> >
> >
> >> So, in a large table, the scale_factor is the dominant term. In a
> >> small table, the threshold is the dominant term. But both are taken into
> >> account.
> >>
> >> The default values are set for small tables; it is not being run for
> >> large tables.
> >>
> >
> > With 8.4 you can adjust the autovacuum settings per table...
> >
> >
> >
> >
> >
> >

Re: autovacuum strategy / parameters

From
Thomas Kellerer
Date:
akp geek, 28.04.2010 16:37:
> We have 8.4; which of the AUTOVACUUM PARAMETERS can be set for an individual table?

All documented here:
http://www.postgresql.org/docs/current/static/sql-createtable.html


Re: autovacuum strategy / parameters

From
Josh Berkus
Date:
> My guess is that the reason we run ANALYZE more frequently than vacuum
> (with the default settings) is that ANALYZE is pretty cheap.  In many
> cases, if the statistical distribution of the data hasn't changed
> much, then it's not really necessary, but it doesn't cost much either.
>  And for certain types of usage patterns, like time series (where the
> maximum value keeps increasing) it's REALLY important to analyze
> frequently.
>
> But having said that, on the systems I've worked with, I've only
> rarely seen a problem caused by not analyzing frequently enough.  On
> the other hand, I've seen MANY problems caused by not vacuuming
> enough.

Which is the opposite of my experience; currently we have several
clients who have issues which required more-frequent analyzes on
specific tables.   Before 8.4, vacuuming more frequently, especially on
large tables, was very costly; vacuum takes a lot of I/O and CPU.  Even
with 8.4 it's not something you want to increase without thinking about
the tradeoffs.

Since I'm responsible for the current defaults, I thought I'd explain the
reasoning behind them.  I developed and tested them while at Greenplum,
so they are *not* designed for small databases.

#autovacuum_vacuum_threshold = 50
#autovacuum_analyze_threshold = 50

These two are set to the minimum threshold to avoid having small tables
get vacuum/analyzed continuously, but to make sure that small tables do
get vacuumed & analyzed sometimes.
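
(For a 1,000-row table, for instance, that means an analyze after roughly
50 + 0.1 * 1,000 = 150 changed tuples and a vacuum after roughly
50 + 0.2 * 1,000 = 250 dead tuples.)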

#autovacuum_vacuum_scale_factor = 0.2

This is set because in my experience, 20% bloat is about the level at
which bloat starts affecting performance; thus, we want to vacuum at
that level but not sooner.  This does mean that very large tables which
never have more than 10% updates/deletes don't get vacuumed at all until
freeze_age; this is a *good thing*. VACUUM on large tables is expensive;
you don't *want* to vacuum a billion-row table which has only 100,000
updates.
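
(With those defaults, the billion-row table isn't vacuumed until roughly
0.2 * 1,000,000,000 + 50 = 200,000,050 dead tuples have accumulated, so
100,000 updates leave it nowhere near the trigger point.)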

#autovacuum_analyze_scale_factor = 0.1

The 10% threshold for analyze is there because (a) analyze is cheap, and
(b) 10% changes to a table can result in very bad plans if the changes
are highly skewed towards a specific range, such as additions onto the
end of a time-based table.

The current postgres defaults were tested on DBT2 as well as pgbench,
and in my last 2 years of consulting I've seldom found reason to touch
them except on *specific* tables.  So I still feel that they are good
defaults.

It would be worth doing a DBT2/DBT5 test run with different autovac
settings post-8.4 to see if we should specifically change the vacuum
threshold.  Pending that, though, I think the current defaults are good
enough.

--
                                  -- Josh Berkus
                                     PostgreSQL Experts Inc.
                                     http://www.pgexperts.com

Re: autovacuum strategy / parameters

From
Alvaro Herrera
Date:
Josh Berkus wrote:

> #autovacuum_vacuum_scale_factor = 0.2
>
> This is set because in my experience, 20% bloat is about the level at
> which bloat starts affecting performance; thus, we want to vacuum at
> that level but not sooner.  This does mean that very large tables which
> never have more than 10% updates/deletes don't get vacuumed at all until
> freeze_age; this is a *good thing*. VACUUM on large tables is expensive;
> you don't *want* to vacuum a billion-row table which has only 100,000
> updates.

Hmm, now that we have partial vacuum, perhaps we should revisit this.


> It would be worth doing a DBT2/DBT5 test run with different autovac
> settings post-8.4 to see if we should specifically change the vacuum
> threshold.

Right.

--
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

Re: autovacuum strategy / parameters

From
Robert Haas
Date:
On Fri, Apr 30, 2010 at 6:50 PM, Josh Berkus <josh@agliodbs.com> wrote:
> Which is the opposite of my experience; currently we have several
> clients who have issues which required more-frequent analyzes on
> specific tables.

That's all fine, but probably not too relevant to the original
complaint - the OP backed off the default settings by several orders
of magnitude, which might very well cause a problem with both VACUUM
and ANALYZE.

I don't have a stake in the ground on what the right settings are, but
I think it's fair to say that if you vacuum OR analyze much less
frequently than what we recommend by default, it might break.

...Robert

Re: autovacuum strategy / parameters

From
Scott Marlowe
Date:
On Wed, Apr 28, 2010 at 8:20 AM, Thomas Kellerer <spam_eater@gmx.net> wrote:
> Rick, 22.04.2010 22:42:
>>
>> So, in a large table, the scale_factor is the dominant term. In a
>> small table, the threshold is the dominant term. But both are taken into
>> account.
>>
>> The default values are set for small tables; it is not being run for
>> large tables.
>
> With 8.4 you can adjust the autovacuum settings per table...

You can as well with 8.3, but there it's not done with ALTER TABLE; you
add entries to the pg_autovacuum system catalog instead.
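
From memory, an entry looks roughly like this (table name and numbers are
placeholders; -1 in the cost and freeze columns means fall back to the
global settings):

  INSERT INTO pg_autovacuum
         (vacrelid, enabled,
          vac_base_thresh, vac_scale_factor,
          anl_base_thresh, anl_scale_factor,
          vac_cost_delay, vac_cost_limit,
          freeze_min_age, freeze_max_age)
  VALUES ('some_big_table'::regclass, true,
          50, 0.05,
          50, 0.02,
          -1, -1,
          -1, -1);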

Re: autovacuum strategy / parameters

From
Scott Marlowe
Date:
On Fri, Apr 30, 2010 at 4:50 PM, Josh Berkus <josh@agliodbs.com> wrote:
> Which is the opposite of my experience; currently we have several
> clients who have issues which required more-frequent analyzes on
> specific tables.   Before 8.4, vacuuming more frequently, especially on
> large tables, was very costly; vacuum takes a lot of I/O and CPU.  Even
> with 8.4 it's not something you want to increase without thinking about
> the tradeoff

Actually I would think that statement would be that before 8.3
vacuum was much more expensive.  The changes to vacuum for 8.4 mostly
had to do with moving FSM to disk, making seldom vacuumed tables
easier to keep track of, and making autovac work better in the
presence of long running transactions.  The ability to tune IO load
etc was basically unchanged in 8.4.

Re: autovacuum strategy / parameters

From
Greg Smith
Date:
Robert Haas wrote:
> I don't have a stake in the ground on what the right settings are, but
> I think it's fair to say that if you vacuum OR analyze much less
> frequently than what we recommend by default, it might break.
>

I think the default settings are essentially minimum recommended
frequencies.  They aren't too terrible for the giant data warehouse case
Josh was suggesting they came from--waiting until there's 20% worth of
dead stuff before kicking off an intensive vacuum is OK when vacuum is
expensive and you're mostly running big queries anyway.  And for smaller
tables, the threshold helps it kick in a little earlier.  It's unlikely
anyone wants to *increase* those, so that autovacuum runs even less; out
of the box it's not tuned to run very often at all.

If anything, I'd expect people to want to increase how often it runs,
for tables where much less than 20% dead is a problem.  The most common
situation I've seen where that's the case is when you have a hotspot of
heavily updated rows in a large table, and this may match some of the
situations that Robert was alluding to seeing.  Let's say you have a big
table where 0.5% of the users each update their respective records
heavily, averaging 30 times each.  That's only going to result in 15%
dead rows, so no autovacuum.  But latency for those users will suffer
greatly, because they might have to do lots of seeking around to get
their little slice of the data.
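
(That's roughly 0.005 * 30 = 0.15, i.e. about 15% of the table in dead rows,
assuming each of those updates leaves a non-HOT dead tuple behind.)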

--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com   www.2ndQuadrant.us


Re: autovacuum strategy / parameters

From
Tom Lane
Date:
Greg Smith <greg@2ndquadrant.com> writes:
> If anything, I'd expect people to want to increase how often it runs,
> for tables where much less than 20% dead is a problem.  The most common
> situation I've seen where that's the case is when you have a hotspot of
> heavily updated rows in a large table, and this may match some of the
> situations that Robert was alluding to seeing.  Let's say you have a big
> table where 0.5% of the users each update their respective records
> heavily, averaging 30 times each.  That's only going to result in 15%
> dead rows, so no autovacuum.  But latency for those users will suffer
> greatly, because they might have to do lots of seeking around to get
> their little slice of the data.

With a little luck, HOT will alleviate that case, since HOT updates can
be reclaimed without running vacuum per se.  I agree there's a risk
there though.

Now that partial vacuum is available, it'd be a real good thing to
revisit these numbers.

            regards, tom lane

Re: autovacuum strategy / parameters

From
Robert Haas
Date:
On Sat, May 1, 2010 at 12:13 PM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
> On Fri, Apr 30, 2010 at 4:50 PM, Josh Berkus <josh@agliodbs.com> wrote:
>> Which is the opposite of my experience; currently we have several
>> clients who have issues which required more-frequent analyzes on
>> specific tables.   Before 8.4, vacuuming more frequently, especially on
>> large tables, was very costly; vacuum takes a lot of I/O and CPU.  Even
>> with 8.4 it's not something you want to increase without thinking about
>> the tradeoff
>
> Actually I would think that statement would be that before 8.3
> vacuum was much more expensive.  The changes to vacuum for 8.4 mostly
> had to do with moving FSM to disk, making seldom vacuumed tables
> easier to keep track of, and making autovac work better in the
> presence of long running transactions.  The ability to tune IO load
> etc was basically unchanged in 8.4.

What about http://www.postgresql.org/docs/8.4/static/storage-vm.html ?

...Robert

Re: autovacuum strategy / parameters

From
Scott Marlowe
Date:
On Sat, May 1, 2010 at 1:08 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Sat, May 1, 2010 at 12:13 PM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
>> On Fri, Apr 30, 2010 at 4:50 PM, Josh Berkus <josh@agliodbs.com> wrote:
>>> Which is the opposite of my experience; currently we have several
>>> clients who have issues which required more-frequent analyzes on
>>> specific tables.   Before 8.4, vacuuming more frequently, especially on
>>> large tables, was very costly; vacuum takes a lot of I/O and CPU.  Even
>>> with 8.4 it's not something you want to increase without thinking about
>>> the tradeoff
>>
>> Actually I would think that statement would be that before 8.3
>> vacuum was much more expensive.  The changes to vacuum for 8.4 mostly
>> had to do with moving FSM to disk, making seldom vacuumed tables
>> easier to keep track of, and making autovac work better in the
>> presence of long running transactions.  The ability to tune IO load
>> etc was basically unchanged in 8.4.
>
> What about http://www.postgresql.org/docs/8.4/static/storage-vm.html ?

That really only has an effect no tables that aren't updated very
often.  Unless you've got a whole bunch of those, it's not that big of
a deal.

Re: autovacuum strategy / parameters

From
Scott Marlowe
Date:
On Sat, May 1, 2010 at 1:17 PM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
> On Sat, May 1, 2010 at 1:08 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Sat, May 1, 2010 at 12:13 PM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
>>> On Fri, Apr 30, 2010 at 4:50 PM, Josh Berkus <josh@agliodbs.com> wrote:
>>>> Which is the opposite of my experience; currently we have several
>>>> clients who have issues which required more-frequent analyzes on
>>>> specific tables.   Before 8.4, vacuuming more frequently, especially on
>>>> large tables, was very costly; vacuum takes a lot of I/O and CPU.  Even
>>>> with 8.4 it's not something you want to increase without thinking about
>>>> the tradeoff
>>>
>>> Actually I would think that statement would be that before 8.3
>>> vacuum was much more expensive.  The changes to vacuum for 8.4 mostly
>>> had to do with moving FSM to disk, making seldom vacuumed tables
>>> easier to keep track of, and making autovac work better in the
>>> presence of long running transactions.  The ability to tune IO load
>>> etc was basically unchanged in 8.4.
>>
>> What about http://www.postgresql.org/docs/8.4/static/storage-vm.html ?
>
> That really only has an effect no tables that aren't updated very
> often.  Unless you've got a whole bunch of those, it's not that big of
> a deal.

sigh, s/ no / on /

Anyway, my real point was that the big improvements that made vacuum
so much better came in 8.3, with HOT updates and multi-threaded vacuum
(that might have shown up in 8.2 even).  8.3 was a huge improvement and
a compelling upgrade from 8.1 for me.

Re: autovacuum strategy / parameters

From
Robert Haas
Date:
On Sat, May 1, 2010 at 1:11 PM, Greg Smith <greg@2ndquadrant.com> wrote:
> Robert Haas wrote:
>>
>> I don't have a stake in the ground on what the right settings are, but
>> I think it's fair to say that if you vacuum OR analyze much less
>> frequently than what we recommend by default, it might break.
>>
>
> I think the default settings are essentially minimum recommended
> frequencies.  They aren't too terrible for the giant data warehouse case
> Josh was suggesting they came from--waiting until there's 20% worth of dead
> stuff before kicking off an intensive vacuum is OK when vacuum is expensive
> and you're mostly running big queries anyway.  And for smaller tables, the
> threshold helps it kick in a little earlier.  It's unlikely anyone wants to
> *increase* those, so that autovacuum runs even less; out of the box it's not
> tuned to run very often at all.
>
> If anything, I'd expect people to want to increase how often it runs, for
> tables where much less than 20% dead is a problem.  The most common
> situation I've seen where that's the case is when you have a hotspot of
> heavily updated rows in a large table, and this may match some of the
> situations that Robert was alluding to seeing.  Let's say you have a big
> table where 0.5% of the users each update their respective records heavily,
> averaging 30 times each.  That's only going to result in 15% dead rows, so
> no autovacuum.  But latency for those users will suffer greatly, because
> they might have to do lots of seeking around to get their little slice of
> the data.

For me it's more that my applications are typically really fast, and
when they run at half-speed people think "oh, it's slow today" but
they can still work and attribute the problem to their computer, or
the network, or something.  When they slow down by like 10x then they
file a bug.  I'm typically dealing with a situation where the whole
database can be easily cached in RAM and the CPU is typically 90%
idle, which cushions the blow quite a bit.

A few months ago someone reported that "the portal was slow" and the
problem turned out to be that the database was bloated by in excess of
a factor of 10 due to having blown out the free space map.  I wasn't a
regular user of that system at that time, so I hadn't had the
opportunity to notice myself.

...Robert