Thread: Re: [PERFORM] More detail on settings for pgavd?

Re: [PERFORM] More detail on settings for pgavd?

From
Shridhar Daithankar
Date:
Josh Berkus wrote:

> Shridhar,
>
> I was looking at the -V/-v and -A/-a settings in pgavd, and really don't
> understand how the calculation works.   According to the readme, if I set -v
> to 1000 and -V to 2 (the defaults) for a table with 10,000 rows, pgavd would
> only vacuum after 21,000 rows had been updated.   This seems wrong.

No, that is correct.

It is calculated as

threshold = base + scale*number of current rows

Which translates to

21,000 = 1000 + 2*10,000
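
As a quick sanity check in psql (this is just the formula as arithmetic, not
pg_autovacuum's actual code):

  -- -v 1000 (base), -V 2 (scale), 10,000-row table
  SELECT 1000 + 2 * 10000 AS vacuum_threshold;  -- 21000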

However I do not agree with this logic entirely. It pegs the next vacuum w.r.t
current table size which is not always a good thing.

I would rather vacuum the table at 2000 updates, which is what you probably want.

Furthermore, the analyze threshold depends upon inserts+updates. I think it should
also depend upon deletes for obvious reasons.

> Can you clear this up a little?   I'd like to tweak these settings but can't
> without being better acquainted with the calculation.

What did you expect in the above example? It is not difficult to tweak the
pg_autovacuum calculations. For testing we can play around.

> Also, you may want to reverse your default ratio for Vacuum/analyze frequency.
> True, analyze is a less expensive operation than Vacuum, but it's also needed
> less often -- only when the *distribution* of data changes.    I've seen
> databases where the optimal vacuum/analyze frequency was every 10 min/once
> per day.

OK, vacuum and analyze thresholds are calculated with the same formula as shown
above, but with different parameters, as follows:

vacthresh = vacbase + vacscale*ntuples
anathresh = anabase + anascale*ntuples

What you are asking for is

vacthresh = vacbase*vacscale
anathresh = anabase + anascale*ntuples

Would that tilt the balance the way you want? I.e., an analyze is triggered when
a fixed *percentage* of the table changes, but a vacuum is triggered when a fixed
*number of rows* has changed.
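
To make the difference concrete, a sketch with made-up numbers (illustrative
values only, not the shipped defaults):

  -- vacuum at a fixed row count, analyze at a fixed fraction of the table
  SELECT 1000 * 2          AS vacthresh,   -- vacbase * vacscale: always 2000 rows
         500 + 0.1 * 10000 AS anathresh;   -- anabase + anascale * ntuples: grows with the table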

I am all for experimentation. If you have real life data to play with, I can
give you some patches to play around with.

And BTW, this is all the brainchild of Matthew O'Connor (correct? I am not good
at either names or spellings). The way I wrote pgavd originally, each table got
a separate threshold :-). That was rather a brute-force approach.

  Shridhar






Re: [PERFORM] More detail on settings for pgavd?

From
Josh Berkus
Date:
Shridhar,

> However I do not agree with this logic entirely. It pegs the next vacuum
> w.r.t current table size which is not always a good thing.

No, I think the logic's fine, it's the numbers which are wrong.   We want to
vacuum when updates reach between 5% and 15% of total rows.   NOT when
updates reach 110% of total rows ... that's much too late.

Hmmm ... I also think the threshold level needs to be lowered; I guess the
purpose was to prevent continuous re-vacuuming of small tables?
Unfortunately, in the current implementation, the result is that small tables
never get vacuumed at all.

So for defaults, I would peg -V at 0.1 and -v at 100, so our default
calculation for a table with 10,000 rows is:

100 +  ( 0.1 * 10,000 ) = 1100 rows.
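
The same arithmetic as a quick psql check (just the formula, not pg_avd itself):

  SELECT 100 + 0.1 * 10000 AS vacuum_threshold;  -- 1100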

> I would rather vacuum the table at 2000 updates, which is what you probably
> want.

Not necessarily.  This would be painful if the table has 10,000,000 rows.   It
*should* be based on a % of rows.

> Furthermore, the analyze threshold depends upon inserts+updates. I think it
> should also depend upon deletes for obvious reasons.

Yes.  Vacuum threshold is counting deletes, I hope?

> What did you expect in the above example? It is not difficult to tweak the
> pg_autovacuum calculations. For testing we can play around.

Can I set the settings to decimals, or are they integers?

> vacthresh = vacbase*vacscale
> anathresh = anabase + anascale*ntuples

Nope, see above.

My comment about the frequency of vacuums vs. analyze is that currently the
*default* is to analyze twice as often as you vacuum.    Based on my
experience as a PG admin on a variety of databases, I believe that the default
should be to analyze half as often as you vacuum.

> I am all for experimentation. If you have real life data to play with, I
> can give you some patches to play around with.

I will have real data very soon .....

--
Josh Berkus
Aglio Database Solutions
San Francisco

Re: [PERFORM] More detail on settings for pgavd?

From
Shridhar Daithankar
Date:
Josh Berkus wrote:

> Shridhar,
>>However I do not agree with this logic entirely. It pegs the next vacuum
>>w.r.t current table size which is not always a good thing.
>
>
> No, I think the logic's fine, it's the numbers which are wrong.   We want to
> vacuum when updates reach between 5% and 15% of total rows.   NOT when
> updates reach 110% of total rows ... that's much too late.

Well, it looks like thresholds below 1 should be the norm rather than the exception.

> Hmmm ... I also think the threshold level needs to be lowered; I guess the
> purpose was to prevent continuous re-vacuuming of small tables?
> Unfortunately, in the current implementation, the result is that small tables
> never get vacuumed at all.
>
> So for defaults, I would peg -V at 0.1 and -v at 100, so our default
> calculation for a table with 10,000 rows is:
>
> 100 +  ( 0.1 * 10,000 ) = 1100 rows.

I would say -V 0.2-0.4 could be great as well. The fact to emphasize is that
thresholds of less than 1 should be used.

>>Furthermore, the analyze threshold depends upon inserts+updates. I think it
>>should also depend upon deletes for obvious reasons.
> Yes.  Vacuum threshold is counting deletes, I hope?

It does.

> My comment about the frequency of vacuums vs. analyze is that currently the
> *default* is to analyze twice as often as you vacuum.    Based on my
> experience as a PG admin on a variety of databases, I believe that the default
> should be to analyze half as often as you vacuum.

OK.

>>I am all for experimentation. If you have real life data to play with, I
>>can give you some patches to play around with.
> I will have real data very soon .....

I will submit a patch that would account for deletes in the analyze threshold.
Since you want to delay the analyze, I would calculate the analyze count as

n = updates + inserts - deletes

rather than the current "n = updates + inserts". I will also update the README
with examples and the analyze frequency.
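
For illustration, a query along these lines shows the difference between the two
accountings (a sketch against pg_stat_all_tables, which does carry the
n_tup_ins / n_tup_upd / n_tup_del counters; pg_autovacuum itself works from
deltas since its last pass, not from these raw totals):

  SELECT relname,
         n_tup_upd + n_tup_ins             AS current_count,
         n_tup_upd + n_tup_ins - n_tup_del AS proposed_count
    FROM pg_stat_all_tables
   ORDER BY relname;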

What do the statistics gather, BTW? Just the number of rows or something else
as well? I think I will put that on Hackers separately.

I am still wary of inverting vacuum analyze frequency. Do you think it is
better to set the inverted default rather than documenting it?

  Shridhar


Re: [PERFORM] More detail on settings for pgavd?

From
"Matthew T. O'Connor"
Date:
Shridhar Daithankar wrote:

> Josh Berkus wrote:
>
>> Shridhar,
>
>>> However I do not agree with this logic entirely. It pegs the next
>>> vacuum w.r.t current table size which is not always a good thing.
Ok, what do you recommend?  The point of two separate variables is to allow 
you to specify whether you want vacuum based on a fixed number, based on 
table size, or something in between.

>>
>> No, I think the logic's fine, it's the numbers which are wrong.   We 
>> want to vacuum when updates reach between 5% and 15% of total rows.   
>> NOT when updates reach 110% of total rows ... that's much too late.
>
For small tables, you don't need to vacuum too often.  In the testing I 
did, a small table (~100 rows) didn't really show significant performance 
degradation until it had close to 1000 updates.  For large tables, 
vacuum is so expensive that you don't want to do it very often, and  
scanning the whole table when there is only 5% wasted space is not very 
helpful.

>> Hmmm ... I also think the threshold level needs to be lowered; I 
>> guess the purpose was to prevent continuous re-vacuuming of small 
>> tables?  Unfortunately, in the current implementation, the result is 
>> that small tables never get vacuumed at all.
>>
>> So for defaults, I would peg -V at 0.1 and -v at 100, so our default 
>> calculation for a table with 10,000 rows is:
>>
>> 100 +  ( 0.1 * 10,000 ) = 1100 rows.
>
Yes, I set the defaults a little high, perhaps, so as to err on the 
side of caution.  I didn't want people to say pg_autovacuum kills the 
performance of my server.  A small table will get vacuumed, just not 
until it has reached the threshold.  So a table with 100 rows will get 
vacuumed after 1200 updates / deletes.  My testing showed that 
there were no major performance problems until you reached several 
thousand updates / deletes.

>>> Furthermore, the analyze threshold depends upon inserts+updates. I think it
>>> should also depend upon deletes for obvious reasons.
>>
>> Yes.  Vacuum threshold is counting deletes, I hope?
>
> It does.
>
>> My comment about the frequency of vacuums vs. analyze is that 
>> currently the *default* is to analyze twice as often as you 
>> vacuum.    Based on my experience as a PG admin on a variety of 
>> databases, I believe that the default should be to analyze half as 
>> often as you vacuum.
>
HUH?  analyze is very very cheap compared to vacuum.  Why not do it more 
often?

>>> I am all for experimentation. If you have real life data to play 
>>> with, I
>>> can give you some patches to play around with.
>>
>> I will have real data very soon .....
>
> I will submit a patch that would account for deletes in the analyze threshold. 
> Since you want to delay the analyze, I would calculate the analyze count as

deletes are already accounted for in the analyze threshold.

> I am still wary of inverting vacuum analyze frequency. Do you think it is 
> better to set the inverted default rather than documenting it?

I think inverting the vacuum and analyze frequency is wrong.  

What I think I am hearing is that people would like very much to be able 
to tweak the settings of pg_autovacuum for individual tables / databases 
etc.  So that you could set certain tables to be vacuumed more 
aggressively than others.  I agree this would be a good and welcome 
addition.  I hope to have time to work on this at some point, but in the 
near future I won't.

Matthew




Re: [PERFORM] More detail on settings for pgavd?

From
Shridhar Daithankar
Date:
On Thursday 20 November 2003 20:00, Matthew T. O'Connor wrote:
> Shridhar Daithankar wrote:
> > I will submit a patch that would account for deletes in the analyze threshold.
> > Since you want to delay the analyze, I would calculate the analyze count as
>
> deletes are already accounted for in the analyze threshold.

Yes, my bad. Deletes are not accounted for when initializing the analyze count,
but they are used later.

> > I am still wary of inverting vacuum analyze frequency. Do you think it is
> > better to set the inverted default rather than documenting it?
>
> I think inverting the vacuum and analyze frequency is wrong.

Me too. ATM all I can think of is the attached patch. Josh, is it sufficient
for you? :-)

Matthew, I am confused about one thing. Why would autovacuum count updates
while checking the analyze threshold? Updates do not change the statistics,
right? (w.r.t. line 1072, pg_autovacuum.c). For updating statistics, only
inserts+deletes should suffice, shouldn't they?

Other than that, I think autovacuum does everything it can.

Comments?

 Shridhar

Re: [PERFORM] More detail on settings for pgavd?

From
Shridhar Daithankar
Date:
On Thursday 20 November 2003 20:29, Shridhar Daithankar wrote:
> On Thursday 20 November 2003 20:00, Matthew T. O'Connor wrote:
> > Shridhar Daithankar wrote:
> > > I will submit a patch that would account for deletes in the analyze threshold.
> > > Since you want to delay the analyze, I would calculate the analyze count as
> >
> > deletes are already accounted for in the analyze threshold.
>
> Yes, my bad. Deletes are not accounted for when initializing the analyze
> count, but they are used later.
>
> > > I am still wary of inverting vacuum analyze frequency. Do you think it is
> > > better to set the inverted default rather than documenting it?
> >
> > I think inverting the vacuum and analyze frequency is wrong.
>
> Me too. ATM all I can think of is the attached patch. Josh, is it sufficient
> for you? :-)

Use this one. A warning has been added for too-aggressive vacuuming. If it is OK
by everybody, we can send it to the patches list.

 Shridhar

Re: [PERFORM] More detail on settings for pgavd?

From
"Matthew T. O'Connor"
Date:
Shridhar Daithankar wrote:

>On Thursday 20 November 2003 20:00, Matthew T. O'Connor wrote:
>  
>
>>Shridhar Daithankar wrote:
>>    
>>
>>>I am still wary of inverting vacuum analyze frequency. Do you think it is
>>>better to set the inverted default rather than documenting it?
>>>      
>>>
>>I think inverting the vacuum and analyze frequency is wrong.
>>    
>>
>Me too. ATM all I can think of is the attached patch. Josh, is it sufficient 
>for you? :-)
>  
>
The patch just adds an example to the README, this looks ok to me.

>Matthew, I am confused about one thing. Why would autovacuum count updates 
>while checking the analyze threshold? Updates do not change the statistics, 
>right? (w.r.t. line 1072, pg_autovacuum.c). For updating statistics, only 
>inserts+deletes should suffice, shouldn't they?
>  
>
An update is the equivalent of an insert and a delete, so it counts 
towards the analyze count as much as an insert.

>Other than that, I think autovacuum does everything it can.
>  
>
It could be more customizable.




Re: [PERFORM] More detail on settings for pgavd?

From
Josh Berkus
Date:
Matthew,

> For small tables, you don't need to vacuum too often.  In the testing I
> did, a small table (~100 rows) didn't really show significant performance
> degradation until it had close to 1000 updates.

This is accounted for by using the "threshold" value.   That way small tables
get vacuumed less often. However, the way large tables work is very different
and I think your strategy shows a lack of testing on large active tables.

> For large tables,
> vacuum is so expensive that you don't want to do it very often, and
> scanning the whole table when there is only 5% wasted space is not very
> helpful.

5% is probably too low, you're right ... in my experience, performance
degradation starts to set in at 10-15% updates to, for example, a 1.1 million
row table, particularly since users tend to request the most recently updated
rows.   As long as we have the I/O issues that Background Writer and ARC are
intended to solve, though, I can see being less aggressive on the defaults;
perhaps 20% or 25%.   If you wait until 110% of a 1.1 million row table is
updated, though, that vacuum will take an hour or more.

Additionally, you are not thinking of this in terms of an overall database
maintenance strategy.   Lazy Vacuum needs to stay below the threshold of the
Free Space Map (max_fsm_pages) to prevent creeping bloat from setting in to
your databases.   With proper configuration of pg_avd, vacuum_mem and FSM
values, it should be possible to never run a VACUUM FULL again, and as of 7.4
never run a REINDEX again either.

But this means running vacuum frequently enough that your max_fsm_pages
threshold is never reached.   Which for a large database is going to have to
be more frequent than every 110% updates, because setting 20,000,000
max_fsm_pages will eat your RAM.

> Yes, I set the defaults a little high, perhaps, so as to err on the
> side of caution.  I didn't want people to say pg_autovacuum kills the
> performance of my server.  A small table will get vacuumed, just not
> until it has reached the threshold.  So a table with 100 rows will get
> vacuumed after 1200 updates / deletes.

Ok, I can see that for small tables.

> My testing showed that
> there were no major performance problems until you reached several
> thousand updates / deletes.

Sure.  But several thousand updates can be only 2% of a very large table.

> HUH?  analyze is very very cheap compared to vacuum.  Why not do it more
> often?

Because nothing is cheap if it's not needed.

Analyze is needed only as often as the *aggregate distribution* of data in the
tables changes.   Depending on the application, this could be frequently, but
far more often (in my experience running multiple databases for several
clients) the data distribution of very large tables changes very slowly over
time.

One client's database, for example, that I have running VACUUM from cron
scripts runs on this schedule for the main tables:
VACUUM only: twice per hour
VACUUM ANALYZE: twice per day

On the other hand, I've another client's database where most activity involves
updates to entire classes of records.   They run ANALYZE at the end of every
transaction.

So if you're going to have a separate ANALYZE schedule at all, it should be
slightly less frequent than VACUUM for large tables.   Either that, or drop
the idea, and simplify pg_avd by running VACUUM ANALYZE all the time instead
of having 2 separate schedules.

BUT .... now I see how you arrived at the logic you did.  If you're testing
only on small tables, and not vacuuming them until they reach 110% updates,
then you *would* need to analyze more frequently.     This is because of your
threshold value ... you'd want to analyze the small table as soon as even 30%
of its rows changed.

So the answer is to dramatically lower the threshold for the small tables.

> What I think I am hearing is that people would like very much to be able
> to tweak the settings of pg_autovacuum for individual tables / databases
> etc.

Not from me you're not.   Though that would be nice, too.

So, my suggested defaults based on our conversation above:

Vacuum threshold: 1000 records
Vacuum scale factor:  0.2
Analyze threshold:  50 records
Analyze scale factor: 0.3

--
Josh Berkus
Aglio Database Solutions
San Francisco

Re: [PERFORM] More detail on settings for pgavd?

From
Josh Berkus
Date:
Shridhar,

> I would say -V 0.2-0.4 could be great as well. The fact to emphasize is that
> thresholds of less than 1 should be used.

Yes, but not thresholds, scale factors of less than 1.0.  Thresholds should
still be in the range of 100 to 1000.

> I will submit a patch that would account for deletes in the analyze threshold.
> Since you want to delay the analyze, I would calculate the analyze count as
>
> n = updates + inserts - deletes

I'm not clear on how this is a benefit.  Deletes affect the statistics, too.

> What do the statistics gather, BTW? Just the number of rows or something else
> as well? I think I will put that on Hackers separately.

Number of tuples, degree of uniqueness, some sample values, and high/low
values.   Just query your pg_stats view for an example.

> I am still wary of inverting vacuum analyze frequency. Do you think it is
> better to set the inverted default rather than documenting it?

See my post to Matthew.

--
Josh Berkus
Aglio Database Solutions
San Francisco

Re: [PERFORM] More detail on settings for pgavd?

From
"Matthew T. O'Connor"
Date:
Josh Berkus wrote:

>Matthew,
>
>
>>For small tables, you don't need to vacuum too often.  In the testing I
>>did, a small table (~100 rows) didn't really show significant performance
>>degradation until it had close to 1000 updates.
>>
>>
>This is accounted for by using the "threshold" value.   That way small tables
>get vacuumed less often. However, the way large tables work is very different
>and I think your strategy shows a lack of testing on large active tables.
>
>
Probably more true than I would like to think...

>>For large tables,
>>vacuum is so expensive that you don't want to do it very often, and
>>scanning the whole table when there is only 5% wasted space is not very
>>helpful.
>>
>>
>5% is probably too low, you're right ... in my experience, performance
>degradation starts to set in at 10-15% updates to, for example, a 1.1 million
>row table, particularly since users tend to request the most recently updated
>rows.   As long as we have the I/O issues that Background Writer and ARC are
>intended to solve, though, I can see being less aggressive on the defaults;
>perhaps 20% or 25%.   If you wait until 110% of a 1.1 million row table is
>updated, though, that vacuum will take an hour or more.
>
>
True, but I think it would be one hour once, rather than 30 minutes 4 times.

>Additionally, you are not thinking of this in terms of an overall database
>maintenance strategy.   Lazy Vacuum needs to stay below the threshold of the
>Free Space Map (max_fsm_pages) to prevent creeping bloat from setting in to
>your databases.   With proper configuration of pg_avd, vacuum_mem and FSM
>values, it should be possible to never run a VACUUM FULL again, and as of 7.4
>never run a REINDEX again either.
>
>
This is one of the things I had hoped to add to pg_autovacuum, but never
got to.  In addition to just the information from the stats collector on
inserts updates and deletes, pg_autovacuum should also look at the FSM,
and make decisions based on it.  Anyone looking for a project?

>But this means running vacuum frequently enough that your max_fsm_pages
>threshold is never reached.   Which for a large database is going to have to
>be more frequent than every 110% updates, because setting 20,000,000
>max_fsm_pages will eat your RAM.
>
>
Again, I think the only way to do this efficiently is to look at the
FSM.  Otherwise the only way to make sure you keep the FSM populated is
to run vacuum more than needed.

>>Yes, I set the defaults a little high, perhaps, so as to err on the
>>side of caution.  I didn't want people to say pg_autovacuum kills the
>>performance of my server.  A small table will get vacuumed, just not
>>until it has reached the threshold.  So a table with 100 rows will get
>>vacuumed after 1200 updates / deletes.
>>
>>
>Ok, I can see that for small tables.
>
>
>>My testing showed that
>>there were no major performance problems until you reached several
>>thousand updates / deletes.
>>
>>
>Sure.  But several thousand updates can be only 2% of a very large table.
>
>
But I can't imagine that 2% makes any difference on a large table.  In
fact I would think that 10-15% would hardly be noticeable; beyond that
I'm not sure.

>>HUH?  analyze is very very cheap compared to vacuum.  Why not do it more
>>often?
>>
>>
>Because nothing is cheap if it's not needed.
>
>Analyze is needed only as often as the *aggregate distribution* of data in the
>tables changes.   Depending on the application, this could be frequently, but
>far more often (in my experience running multiple databases for several
>clients) the data distribution of very large tables changes very slowly over
>time.
>
>
Valid points, and again I think this points to the fact that
pg_autovacuum needs to be more configurable.  Being able to set
different thresholds for different tables will help considerably.  In
fact, you may find that some tables should have a vac threshold much
larger than the analyze threshold, while other tables might want the
opposite.

>One client's database, for example, that I have running VACUUM from cron
>scripts runs on this schedule for the main tables:
>VACUUM only: twice per hour
>VACUUM ANALYZE: twice per day
>
>
I would be surprised if you can notice the difference between a vacuum
analyze and a vacuum, especially on large tables.

>On the other hand, I've another client's database where most activity involves
>updates to entire classes of records.   They run ANALYZE at the end of every
>transaction.
>
>So if you're going to have a separate ANALYZE schedule at all, it should be
>slightly less frequent than VACUUM for large tables.   Either that, or drop
>the idea, and simplify pg_avd by running VACUUM ANALYZE all the time instead
>of having 2 separate schedules.
>
>
I think you need two separate schedules.  There are lots of times where
a vacuum doesn't help, and an analyze is all that is needed, and an
analyze is MUCH cheaper than a vacuum.

>BUT .... now I see how you arrived at the logic you did.  If you're testing
>only on small tables, and not vacuuming them until they reach 110% updates,
>then you *would* need to analyze more frequently.     This is because of your
>threshold value ... you'd want to analyze the small table as soon as even 30%
>of its rows changed.
>
>So the answer is to dramatically lower the threshold for the small tables.
>
>
Perhaps.

>>What I think I am hearing is that people would like very much to be able
>>to tweak the settings of pg_autovacuum for individual tables / databases
>>etc.
>>
>>
>Not from me you're not.   Though that would be nice, too.
>
>So, my suggested defaults based on our conversation above:
>
>Vacuum threshold: 1000 records
>Vacuum scale factor:  0.2
>Analyze threshold:  50 records
>Analyze scale factor: 0.3
>
>
I'm open to discussion on changing the defaults.  Perhaps it would be
better to use some non-linear (perhaps logarithmic) scaling factor, so
that you wound up with something roughly like this:

#tuples    activity% for vacuum
1k         100%
10k         70%
100k        45%
1M          20%
10M         10%
100M         8%
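
A first cut at that need not even be a true logarithmic formula; a banded lookup
would do.  As a sketch in SQL (the fractions are just the rough figures from the
table above):

  SELECT relname,
         CASE WHEN reltuples <= 1000     THEN 1.00
              WHEN reltuples <= 10000    THEN 0.70
              WHEN reltuples <= 100000   THEN 0.45
              WHEN reltuples <= 1000000  THEN 0.20
              WHEN reltuples <= 10000000 THEN 0.10
              ELSE 0.08
         END AS vacuum_activity_fraction
    FROM pg_class
   WHERE relkind = 'r';  -- ordinary tables only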

Thanks for the lucid feedback / discussion.  autovacuum is a feature
that, despite its simple implementation, has generated a lot of
feedback from users, and I would really like to see it become something
closer to what it should be.



Re: [PERFORM] More detail on settings for pgavd?

From
Josh Berkus
Date:
Matthew,

> > 110% of a 1.1 million row table is updated, though, that vacuum will
> > take an hour or more.
>
> True, but I think it would be one hour once, rather than 30 minutes 4
> times.

Well, generally it would be about 6-8 times at 2-4 minutes each.

> This is one of the things I had hoped to add to pg_autovacuum, but never
> got to.  In addition to just the information from the stats collector on
> inserts updates and deletes, pg_autovacuum should also look at the FSM,
> and make decisions based on it.  Anyone looking for a project?

Hmmm ... I think that's the wrong approach.  Once your database is populated,
it's very easy to determine how to set the FSM for a given pg_avd level.   If
you're vacuuming after 20% updates, for example, just set fsm_pages to 20% of
the total database pages plus growth & safety margins.
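
That figure is easy to pull from a live system.  A sketch (the 20% is the example
above, the 1.25 is an arbitrary growth/safety margin, the sum includes indexes,
and pg_class only covers the current database, so repeat per database):

  SELECT ceil(sum(relpages) * 0.20 * 1.25) AS suggested_max_fsm_pages
    FROM pg_class;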

I'd be really reluctant to base pg_avd frequency on the fsm settings instead.
What if the user loads 8GB of data but leaves fsm_pages at the default of
10,000?  You can't do much with that; you'd have to vacuum if even 1% of the
data changed.

The other problem is that calculating data pages from a count of
updates+deletes would require pg_avd to keep more statistics and do more math
for every table.  Do we want to do this?

> But I can't imagine that 2% makes any difference on a large table.  In
> fact I would think that 10-15% would hardly be noticable, beyond that
> I'm not sure.

I've seen performance lag at 10% of records, especially in tables where both
update and select activity focus on one subset of the table (calendar tables,
for example).

> Valid points, and again I think this points to the fact that
> pg_autovacuum needs to be more configurable.  Being able to set
> different thresholds for different tables will help considerably.  In
> fact, you may find that some tables should have a vac threshold much
> larger than the analyze threshold, while other tables might want the
> opposite.

Sure.  Though I think we can make the present configuration work with a little
adjustment of the numbers.   I'll have a chance to test on production
databases soon.

> I would be surprised if you can notice the difference between a vacuum
> analyze and a vacuum, especially on large tables.

It's substantial for tables with high statistics settings.   A 1,000,000 row
table with 5 columns set to statistics=250 can take 3 minutes to analyze on a
medium-grade server.

> I think you need two separate schedules.  There are lots of times where
> a vacuum doesn't help, and an analyze is all that is needed

Agreed.  And I've just talked to a client who may want to use pg_avd's ANALYZE
scheduling but not use vacuum at all.   BTW, I think we should have a setting
for this; for example, if -V is -1, don't vacuum.

> I'm open to discussion on changing the defaults.  Perhaps it would be
> better to use some non-linear (perhaps logarithmic) scaling factor, so
> that you wound up with something roughly like this:
>
> #tuples    activity% for vacuum
> 1k         100%
> 10k         70%
> 100k        45%
> 1M          20%
> 10M         10%
> 100M         8%

That would be cool, too.    Though a count of data pages would be a better
scale than a count of rows, and equally obtainable from pg_class.

> Thanks for the lucid feedback / discussion.  autovacuum is a feature
> that, despite its simple implementation, has generated a lot of
> feedback from users, and I would really like to see it become something
> closer to what it should be.

Well, I hope to help now.  Until very recently, I've not had a chance to
seriously look at pg_avd and test it in production.   Now that I do, I'm
interested in improving it.

--
Josh Berkus
Aglio Database Solutions
San Francisco

Re: [PERFORM] More detail on settings for pgavd?

From
Robert Treat
Date:
On Thu, 2003-11-20 at 19:40, Matthew T. O'Connor wrote:
> I'm open to discussion on changing the defaults.  Perhaps it would be
> better to use some non-linear (perhaps logarithmic) scaling factor, so
> that you wound up with something roughly like this:
>
> #tuples    activity% for vacuum
> 1k         100%
> 10k         70%
> 100k        45%
> 1M          20%
> 10M         10%
> 100M         8%
>


Just thinking out loud here, so disregard this if you think it's chaff, but...
if we had a system table pg_avd_defaults that held what we generally
consider the best default percentages based on reltuples/pages, and
added a column to pg_class (could be some place better but..) which
could hold an overriding percentage, you could then have a column added
to pg_stat_all_tables called vacuum_percentage, which would be a
coalesce of the override percentage or the default percentages based on
rel_tuples (or rel_pages).  This would give autovacuum a place to look
for each table as to when it should vacuum, and gives administrators the
option to tweak it on a per table basis if they find they need a
specific table to vacuum at a different rate than the "standard".

Robert Treat
--
Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL


Re: [PERFORM] More detail on settings for pgavd?

From
"Matthew T. O'Connor"
Date:
Robert Treat wrote:

>Just thinking out loud here, so disregard this if you think it's chaff, but...
>if we had a system table pg_avd_defaults
>
[snip]

As long as pg_autovacuum remains a contrib module, I don't think any
changes to the system catalogs will be made.  If pg_autovacuum is
deemed ready to move out of contrib, then we can talk about the above.


Re: [PERFORM] More detail on settings for pgavd?

From
"Matthew T. O'Connor"
Date:
Josh Berkus wrote:

>Matthew,
>
>
>>True, but I think it would be one hour once, rather than 30 minutes 4
>>times.
>>
>>
>Well, generally it would be about 6-8 times at 2-4 minutes each.
>
>
Are you saying that you can vacuum a 1 million row table in 2-4
minutes?  While a vacuum of the same table with an additional 1 million
dead tuples would take an hour?

>>This is one of the things I had hoped to add to pg_autovacuum, but never
>>got to.  In addition to just the information from the stats collector on
>>inserts updates and deletes, pg_autovacuum should also look at the FSM,
>>and make decisions based on it.  Anyone looking for a project?
>>
>>
>Hmmm ... I think that's the wrong approach.  Once your database is populated,
>it's very easy to determine how to set the FSM for a given pg_avd level.   If
>you're vacuuming after 20% updates, for example, just set fsm_pages to 20% of
>the total database pages plus growth & safety margins.
>
>
Ok.

>I'd be really reluctant to base pg_avd frequency on the fsm settings instead.
>What if the user loads 8GB of data but leaves fsm_pages at the default of
>10,000?  You can't do much with that; you'd have to vacuum if even 1% of the
>data changed.
>
Ok, but as you said above it's very easy to set the FSM once you know
your db size.

>The other problem is that calculating data pages from a count of
>updates+deletes would require pg_avd to keep more statistics and do more math
>for every table.  Do we want to do this?
>
>
I would think the math is simple enough to not be a big problem.  Also,
I did not recommend looking blindly at the FSM as our guide, but rather 
consulting it as another source of information as to when it would be 
useful to vacuum.  I don't have a good plan as to how to incorporate
this data, but to a large extent the FSM already tracks table activity
and gives us the most accurate answer about storage growth (short of
using something like contrib/pgstattuple which takes nearly the same
amount of time as an actual vacuum).

>>But I can't imagine that 2% makes any difference on a large table.  In
>>fact I would think that 10-15% would hardly be noticeable; beyond that
>>I'm not sure.
>>
>>
>I've seen performance lag at 10% of records, especially in tables where both
>update and select activity focus on one subset of the table (calendar tables,
>for example).
>
>
Ok.

>>Valid points, and again I think this points to the fact that
>>pg_autovacuum needs to be more configurable.  Being able to set
>>different thresholds for different tables will help considerably.  In
>>fact, you may find that some tables should have a vac threshold much
>>larger than the analyze threshold, while other tables might want the
>>opposite.
>>
>>
>Sure.  Though I think we can make the present configuration work with a little
>adjustment of the numbers.   I'll have a chance to test on production
>databases soon.
>
>
I look forward to hearing results from your testing.

>>I would be surprised if you can notice the difference between a vacuum
>>analyze and a vacuum, especially on large tables.
>>
>>
>It's substantial for tables with high statistics settings.   A 1,000,000 row
>table with 5 columns set to statistics=250 can take 3 minutes to analyze on a
>medium-grade server.
>
>
In my testing, I never changed the default statistics settings.

>>I think you need two separate schedules.  There are lots of times where
>>a vacuum doesn't help, and an analyze is all that is needed
>>
>>
>Agreed.  And I've just talked to a client who may want to use pg_avd's ANALYZE
>scheduling but not use vacuum at all.   BTW, I think we should have a setting
>for this; for example, if -V is -1, don't vacuum.
>
>
That would be nice.  Easy to add, and something I never thought of....

>>I'm open to discussion on changing the defaults.  Perhaps it would be
>>better to use some non-linear (perhaps logarithmic) scaling factor.
>>
>>
>That would be cool, too.    Though a count of data pages would be a better
>scale than a count of rows, and equally obtainable from pg_class.
>
>
But we track tuples because we can compare against the count given by
the stats system.  I don't know of a way (other than looking at the FSM,
or contrib/pgstattuple ) to see how many dead pages exist.



Re: [PERFORM] More detail on settings for pgavd?

From
Shridhar Daithankar
Date:
Matthew T. O'Connor wrote:

> But we track tuples because we can compare against the count given by
> the stats system.  I don't know of a way (other than looking at the FSM,
> or contrib/pgstattuple ) to see how many dead pages exist.

I think making pg_autovacuum dependent on pgstattuple is a very good idea.

It might be a good idea to extend pgstattuple to return pages that are
excessively contaminated and clean them ASAP. Step by step, we would get closer
to a daemonized vacuum.

  Shridhar


Re: [PERFORM] More detail on settings for pgavd?

From
"Matthew T. O'Connor"
Date:
Shridhar Daithankar wrote:

> Matthew T. O'Connor wrote:
>
>> But we track tuples because we can compare against the count given by
>> the stats system.  I don't know of a way (other than looking at the
>> FSM, or contrib/pgstattuple ) to see how many dead pages exist.
>
> I think making pg_autovacuum dependent on pgstattuple is a very good idea.

Only if pgstattuple can become much cheaper than it is now.  Based on
the testing I did when I wrote pg_autovacuum, pgstattuple cost nearly
the same amount as a regular vacuum.  Given that, what have we gained
from that work?  Wouldn't it just be better to run a vacuum and actually
reclaim space, rather than running pgstattuple and just looking to see if
there is free space to be reclaimed?

Perhaps we could use pgstattuple occasionally to see if we are doing a
good job of keeping the amount of dead space to a reasonable level, but
I'm still not really sure about this.

> Probably it might be a good idea to extend pgstattuple to return pages
> that are excessively contaminated and clean them ASAP. Step by step
> getting closer to daemonized vacuum.

I don't know of any way to clean a particular set of pages.  This is
something that has been talked about (partial vacuums and such), but I
think Tom has raised several issues with it, I don't remember the
details.  Right now the only tool we have to reclaim space is vacuum, a
whole table at a time.



Re: [PERFORM] More detail on settings for pgavd?

From
Josh Berkus
Date:
Matthew,

> As long as pg_autovacuum remains a contrib module, I don't think any
> changes to the system catalogs will be made.  If pg_autovacuum is
> deemed ready to move out of contrib, then we can talk about the above.

But we could create a config file that would store stuff in a flatfile table,
OR we could add our own "system table" that would be created when one
"initializes" pg_avd.

Just an idea.  Mind you, I'm not so sure that we want to focus immediately on
per-table settings.   I think that we want to get the "automatic" settings
working fairly well first; a lot of new DBAs would use the per-table settings
to shoot themselves in the foot.  So we need to be able to make a strong
recommendation to "try the automatic settings first."

> Are you saying that you can vacuum a 1 million row table in 2-4
> minutes?  While a vacuum of the same table with an additional 1 million
> dead tuples would take an hour?

I'm probably exaggerating.  I do know that I can vacuum a fairly clean 1-5
million row table in less than 4 minutes.   I've never let such a table get
to 50% dead tuples, so I don't really know how long that takes.  Call me a
coward if you  like ...

> >I'd be really reluctant to base pg_avd frequency on the fsm settings
> > instead. What if the user loads 8GB of data but leaves fsm_pages at the
> > default of 10,000?  You can't do much with that; you'd have to vacuum if
> > even 1% of the data changed.
>
> Ok, but as you said above it's very easy to set the FSM once you know
> your db size.

Actually, thinking about this I realize that PG_AVD and the Perl-based
postgresql.conf configuration script I was working on (darn, who was doing
that with me?) need to go together.   With pg_avd, setting max_fsm_pages is
very easy; without it it's a bit of guesswork.

So I think we can do this:  for 'auto' settings:

If max_fsm_pages is between 13% and 100% of the total database pages, then set
the vacuum scale factor to match 3/4 of the fsm_pages setting, e.g.
database = 18,000,000 data pages;
max_fsm_pages = 3,600,000;
set vacuum scale factor = 3.6mil/18mil * 3/4 = 0.15

If max_fsm_pages is less than 13% of database pages, issue a warning to the
user (log it, if possible) and set scale factor to 0.1.   If it's greater
than 100% set it to 1 and leave it alone.
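
A sketch of that rule in SQL (0.13, 3/4, and the 0.1 floor are the proposed
constants above; pg_class only counts the current database, so a real
implementation would sum over the whole cluster):

  SELECT CASE WHEN fsm_frac < 0.13 THEN 0.1  -- too little FSM: warn, use the floor
              WHEN fsm_frac > 1.0  THEN 1.0  -- FSM covers everything
              ELSE fsm_frac * 0.75           -- 3/4 of FSM coverage
         END AS vacuum_scale_factor
    FROM (SELECT current_setting('max_fsm_pages')::numeric / sum(relpages)
                 AS fsm_frac
            FROM pg_class) AS f;

With the example numbers, 3,600,000 / 18,000,000 = 0.2, which yields 0.15.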

> I don't have a good plan as to how to incorporate
> this data, but to a large extent the FSM already tracks table activity
> and gives us the most accurate answer about storage growth (short of
> using something like contrib/pgstattuple which takes nearly the same
> amount of time as an actual vacuum)

I don't really think we need to do dynamic monitoring at this point.   It
would be a lot of engineering to check data page pollution without having
significant performance impact.   It's doable, but something I think we
should hold off until version 3.  It would mean hacking the FSM, which is a
little beyond me right now.

> In my testing, I never changed the default statistics settings.

Ah.  Well, a lot of users do to resolve query problems.

> But we track tuples because we can compare against the count given by
> the stats system.  I don't know of a way (other than looking at the FSM,
> or contrib/pgstattuple ) to see how many dead pages exist.

No, but for scaling you don't need the dynamic count of tuples or of dead
tuples; pg_class holds a reasonably accurate count of pages per table as of
last vacuum.

--
Josh Berkus
Aglio Database Solutions
San Francisco

Re: [PERFORM] More detail on settings for pgavd?

From
"Matthew T. O'Connor"
Date:
Josh Berkus wrote:

>Matthew,
>
>
>But we could create a config file that would store stuff in a flatfile table,
>OR we could add our own "system table" that would be created when one
>"initializes" pg_avd.
>
>
I don't want to add tables to existing databases, as I consider that
clutter and I never like using tools that clutter my production
databases.  I had considered using a pg_autovacuum database that, if
found, would store customized settings for individual tables /
databases.  Dunno if this is a good idea, but it might make a good
stopgap until people are comfortable modifying the system catalogs for
autovacuum.

Actually, this might be a necessary addition as pg_autovacuum currently
suffers from the startup transients that the FSM used to suffer from,
that is, it doesn't remember anything that happened the last time it
ran.  A pg_autovacuum database could also be used to store thresholds
and counts from the last time it ran.

>Just an idea.  Mind you, I'm not so sure that we want to focus immediately on
>per-table settings.   I think that we want to get the "automatic" settings
>working fairly well first; a lot of new DBAs would use the per-table settings
>to shoot themselves in the foot.  So we need to be able to make a strong
>recommendation to "try the automatic settings first."
>
>
I agree in principle; the question is what the best settings are.  I still
think it will be hard to find a one-size-fits-all, but I'm sure we can
do better than what we have.

>Actually, thinking about this I realize that PG_AVD and the Perl-based
>postgresql.conf configuration script I was working on (darn, who was doing
>that with me?) need to go together.   With pg_avd, setting max_fsm_pages is
>very easy; without it it's a bit of guesswork.
>
>So I think we can do this:  for 'auto' settings:
>
>If max_fsm_pages is between 13% and 100% of the total database pages, then set
>the vacuum scale factor to match 3/4 of the fsm_pages setting, e.g.
>database = 18,000,000 data pages;
>max_fsm_pages = 3,600,000;
>set vacuum scale factor = 3.6mil/18mil * 3/4 = 0.15
>
>
Where are you getting 13% from?  Do you know of an easy way to get a
count of the total pages used by a whole cluster?  I guess we can just
iterate over all the tables in all the databases and sum up the total
num of pages.  We already iterate over them all, we just don't sum it up.

>If max_fsm_pages is less than 13% of database pages, issue a warning to the
>user (log it, if possible) and set scale factor to 0.1.   If it's greater
>than 100% set it to 1 and leave it alone.
>
>
Again I ask: where is 13% coming from, and also where is 0.1 coming from?
I assume these are your best guesses right now, but not more than that.
I do like the concept though as long as we find good values for
min_fsm_percentage and min_autovac_scaling_factor.

>>But we track tuples because we can compare against the count given by
>>the stats system.  I don't know of a way (other than looking at the FSM,
>>or contrib/pgstattuple ) to see how many dead pages exist.
>>
>>
>No, but for scaling you don't need the dynamic count of tuples or of dead
>tuples; pg_class holds a reasonably accurate count of pages per table as of
>last vacuum.
>
>
Which we already keep a copy of inside of pg_autovacuum, and update
after we issue a vacuum.



Re: [PERFORM] More detail on settings for pgavd?

From
Josh Berkus
Date:
Matthew,

> Actually, this might be a necessary addition as pg_autovacuum currently
> suffers from the startup transients that the FSM used to suffer from,
> that is, it doesn't remember anything that happened the last time it
> ran.  A pg_autovacuum database could also be used to store thresholds
> and counts from the last time it ran.

I don't see how a separate database is better than a table in the databases,
except that it means scanning only one table and not one per database.   For
one thing, making it a separate database could make it hard to back up and
move your database+pg_avd config.

But I don't feel strongly about it.

> Where are you getting 13% from?

13% * 3/4 ~~ 10%

And I think both of us agree that vacuuming tables with less than 10% changes
is excessive and could lead to problems on its own, like overlapping vacuums.

>  Do you know of an easy way to get a
> count of the total pages used by a whole cluster?

Select sum(relpages) from pg_class.

> I do like the concept though as long as we find good values for
> min_fsm_percentage and min_autovac_scaling_factor.

See above.  I propose 0.13 and 0.1

> Which we already keep a copy of inside of pg_autovacuum, and update
> after we issue a vacuum.

Even easier then.

BTW, do we have any provisions to avoid overlapping vacuums?  That is, to
prevent a second vacuum on a table if an earlier one is still running?

--
-Josh Berkus
 Aglio Database Solutions
 San Francisco


Re: [PERFORM] More detail on settings for pgavd?

From
"Matthew T. O'Connor"
Date:
Josh Berkus wrote:

>Matthew,
>
>
>
>I don't see how a separate database is better than a table in the databases,
>except that it means scanning only one table and not one per database.   For
>one thing, making it a separate database could make it hard to back up and
>move your database+pg_avd config.
>
>
Basically, I don't like the idea of modifying users' databases; besides,
in the long run most of what needs to be tracked will be moved to the
system catalogs.  I kind of consider the pg_autovacuum database to be
equivalent to the changes that will need to be made to the system catalogs.

I guess it could make it harder to backup if you are moving your
database between clusters.  Perhaps, if you create a pg_autovacuum
schema inside of your database, then we could use that.  I just
don't like tools that drop things into your database.

>>Where are you getting 13% from?
>>
>>
>
>13% * 3/4 ~~ 10%
>
>And I think both of us agree that vacuuming tables with less than 10% changes
>is excessive and could lead to problems on its own, like overlapping vacuums.
>
>
>
I certainly agree that less than 10% would be excessive, I still feel
that 10% may not be high enough though.   That's why I kinda liked the
sliding scale I mentioned earlier, because I agree that for very large
tables, something as low as 10% might be useful, but most tables in a
database would not be that large.

>> Do you know of an easy way to get a
>>count of the total pages used by a whole cluster?
>>
>>
>
>Select sum(relpages) from pg_class.
>
>
>
duh....

>BTW, do we have any provisions to avoid overlapping vacuums?  That is, to
>prevent a second vacuum on a table if an earlier one is still running?
>
>
>
Only that pg_autovacuum isn't smart enough to kick off more than one
vacuum at a time.  Basically, pg_autovacuum issues a vacuum on a table
and waits for it to finish, then checks the next table in its list to
see if it needs to be vacuumed; if so, it does it and waits for that
vacuum to finish.   There was some discussion of issuing concurrent
vacuums against different tables, but it was decided that since vacuum is
I/O bound, it would only make sense to issue concurrent vacuums that
were on different spindles, which is not something I wanted to get
into.  Also, given the recent talk about how vacuum is still such a
performance hog, I can't imagine what multiple concurrent vacuums would
do to performance.  Maybe as 7.5 develops and many of the vacuum
performance issues are addressed, we can revisit this question.



Re: [PERFORM] More detail on settings for pgavd?

From
Josh Berkus
Date:
Matthew,

> Basically, I don't like the idea of modifying users' databases; besides,
> in the long run most of what needs to be tracked will be moved to the
> system catalogs.  I kind of consider the pg_autovacuum database to be
> equivalent to the changes that will need to be made to the system catalogs.

OK.  As I said, I don't feel strongly about it.

> I certainly agree that less than 10% would be excessive, I still feel
> that 10% may not be high enough though.   That's why I kinda liked the
> sliding scale I mentioned earlier, because I agree that for very large
> tables, something as low as 10% might be useful, but most tables in a
> database would not be that large.

Yes, but I thought that we were taking care of that through the "threshold"
value?

A sliding scale would also be OK.   However, that would definitely require a
leap to storing per-table pg_avd statistics and settings.

> Only that pg_autovacuum isn't smart enough to kick off more than one
> vacuum at a time.  Basically, pg_autovacuum issues a vacuum on a table
>>and waits for it to finish, then checks the next table in its list to
>>see if it needs to be vacuumed; if so, it does it and waits for that
> vacuum to finish.

OK, then, we just need to detect the condition of the vacuums "piling up"
because they are happening too often.

--
-Josh Berkus
 Aglio Database Solutions
 San Francisco


Re: [PERFORM] More detail on settings for pgavd?

From
"Matthew T. O'Connor"
Date:
Josh Berkus wrote:

>Matthew,
>
>
>>I certainly agree that less than 10% would be excessive, I still feel
>>that 10% may not be high enough though.   That's why I kinda liked the
>>sliding scale I mentioned earlier, because I agree that for very large
>>tables, something as low as 10% might be useful, but most tables in a
>>database would not be that large.
>>
>>
>
>Yes, but I thought that we were taking care of that through the "threshold"
>value?
>
>
Well, the threshold is a combination of the base value and the scaling 
factor, which you are proposing is 0.1, so the threshold is base + 
(scaling factor)*(number of tuples).  So with the default base of 1000 and 
your 0.1 you would have this:

 Num Rows     Threshold     Percent
    1,000         1,100        110%
   10,000         2,000         20%
  100,000        11,000         11%
1,000,000       101,000         10%

I don't like how that looks; hence the thought of some non-linear 
scaling factor that would still allow the percent to reach 10%, but at a 
slower rate.  Perhaps just a larger base value would suffice, but I think 
small table performance is going to suffer much above 1000.  Anyone else
have an opinion on the table above? Good / Bad / Indifferent?
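
For what it's worth, the table is easy to regenerate for other constants in
plain SQL, with the 1000 base and the proposed 0.1 scale factor (just
arithmetic, not pg_autovacuum output):

  SELECT n AS num_rows,
         1000 + 0.1 * n AS threshold,
         round((1000 + 0.1 * n) / n * 100) AS percent
    FROM (SELECT 1000 AS n UNION SELECT 10000
          UNION SELECT 100000 UNION SELECT 1000000) AS t
   ORDER BY n;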

>A sliding scale would also be OK.   However, that would definitely require a
>leap to storing per-table pg_avd statistics and settings.
>
>
>
I don't think it would; it would correlate the scaling factor with the
number of tuples, so no per-table settings would be required.

>>Only that pg_autovacuum isn't smart enough to kick off more than one
>>vacuum at a time.  Basically, pg_autovacuum issues a vacuum on a table
>>and waits for it to finish, then checks the next table in its list to
>>see if it needs to be vacuumed; if so, it does it and waits for that
>>vacuum to finish.
>>
>>
>
>OK, then, we just need to detect the condition of the vacuums "piling up"
>because they are happening too often.
>
>
>
That would be good to look into at some point, especially if vacuum is
going to get slower as a result of the page loop delay patch that has
been floating around.



Re: [PERFORM] More detail on settings for pgavd?

From
Tom Lane
Date:
Josh Berkus <josh@agliodbs.com> writes:
> BTW, do we have any provisions to avoid overlapping vacuums?  That is, to
> prevent a second vacuum on a table if an earlier one is still running?

Yes, VACUUM takes a lock that prevents another VACUUM on the same table.

            regards, tom lane

Re: [PERFORM] More detail on settings for pgavd?

From
Greg Stark
Date:
Tom Lane <tgl@sss.pgh.pa.us> writes:

> Josh Berkus <josh@agliodbs.com> writes:
> > BTW, do we have any provisions to avoid overlapping vacuums?  That is, to 
> > prevent a second vacuum on a table if an earlier one is still running?
> 
> Yes, VACUUM takes a lock that prevents another VACUUM on the same table.

The second vacuum waits for the lock to become available. If the situation got
really bad there could end up being a growing queue of vacuums waiting.

I'm not sure how likely this is as the subsequent vacuums appear to finish
quite quickly though. But then the largest table I have to play with fits
entirely in memory.

-- 
greg



Re: [PERFORM] More detail on settings for pgavd?

From
Alvaro Herrera Munoz
Date:
On Fri, Nov 21, 2003 at 04:24:25PM -0500, Matthew T. O'Connor wrote:

> I don't want to add tables to existing databases, as I consider that
> clutter and I never like using tools that clutter my production
> databases.  [...]
>
> Actually, this might be a necessary addition as pg_autovacuum currently
> suffers from the startup transients that the FSM used to suffer from,
> that is, it doesn't remember anything that happened the last time it
> ran.  A pg_autovacuum database could also be used to store thresholds
> and counts from the last time it ran.

You could use the same approach the FSM uses: keep a file with the data,
PGDATA/base/global/pg_fsm.cache.  You don't need the data to be in a table
after all ...

--
Alvaro Herrera (<alvherre[@]dcc.uchile.cl>)
Essentially, you're proposing Kevlar shoes as a solution for the problem
that you want to walk around carrying a loaded gun aimed at your foot.
(Tom Lane)

Re: [PERFORM] More detail on settings for pgavd?

From
Andrew Sullivan
Date:
On Fri, Nov 21, 2003 at 07:51:17PM -0500, Greg Stark wrote:
> The second vacuum waits for the lock to become available. If the
> situation got really bad there could end up being a growing queue
> of vacuums waiting.

Those of us who have run into this know that "the situation got
really bad" is earlier than one might think.  And it can indeed cause
some pretty pathological behaviour.

A


-- 
----
Andrew Sullivan
Afilias Canada
204-4141 Yonge Street
Toronto, Ontario, Canada  M2P 2A8
<andrew@libertyrms.info>
+1 416 646 3304 x110



Re: [PERFORM] More detail on settings for pgavd?

From
Chester Kustarz
Date:
On Fri, 21 Nov 2003, Matthew T. O'Connor wrote:
> >> Do you know of an easy way to get a
> >>count of the total pages used by a whole cluster?
> >
> >Select sum(relpages) from pg_class.

You might want to exclude indexes from this calculation. Some large
read only tables might have indexes larger than the tables themselves.
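
For example (relkind 'r' is ordinary tables; indexes are 'i', and you could add
't' if you want TOAST tables counted):

  SELECT sum(relpages) FROM pg_class WHERE relkind = 'r';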