Thread: Adjust autovacuum naptime automatically

Adjust autovacuum naptime automatically

From: ITAGAKI Takahiro

Hi hackers,

There is a comment in autovacuum.c:
| XXX todo: implement sleep scale factor that existed in contrib code.
The attached patch implements it.

In the contrib code, the sleep scale factor was used only to lengthen the
naptime. I changed the behavior so that it can also shorten it.

Under an update-heavy workload, the default naptime (60 seconds) is too
long to keep the number of dead tuples low. With my patch, the naptime
settles at around 3 seconds for pgbench (scale=10, 80 tps) with the other
autovacuum parameters left at their defaults.
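
As a rough illustration of a naptime that can move in both directions, here
is a minimal sketch. It is not the logic of the attached patch: the halving
and doubling rule, the bounds, and the name next_naptime are all assumptions
made for the example.

```c
#include <assert.h>

/* Hypothetical bounds; neither value is taken from the patch. */
#define NAPTIME_MIN_SEC 1
#define NAPTIME_MAX_SEC 60

/*
 * Return the next naptime in seconds.
 * found_work != 0: the last pass found tables to vacuum, so halve the sleep.
 * found_work == 0: nothing needed vacuuming, so double the sleep.
 * The result is clamped to [NAPTIME_MIN_SEC, NAPTIME_MAX_SEC].
 */
int
next_naptime(int current_sec, int found_work)
{
    int next;

    assert(current_sec >= NAPTIME_MIN_SEC && current_sec <= NAPTIME_MAX_SEC);

    next = found_work ? current_sec / 2 : current_sec * 2;
    if (next < NAPTIME_MIN_SEC)
        next = NAPTIME_MIN_SEC;
    if (next > NAPTIME_MAX_SEC)
        next = NAPTIME_MAX_SEC;
    return next;
}
```

Under this assumed rule a busy system drifts toward short naps and an idle
one drifts back to the 60-second ceiling.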


There are two points I would like to discuss:
 - Can we use the process exit code to let the autovacuum daemon communicate
   with the postmaster? I used it to report whether there were any vacuum
   jobs.
 - I removed the autovacuum_naptime GUC variable, because the naptime is now
   adjusted automatically. Is that appropriate?
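
The exit-code idea in the first point can be sketched with plain POSIX
calls. This is only an illustration, not postmaster code: the exit code
values and the function run_worker are hypothetical, and the child here
merely pretends to have done (or not done) some vacuuming.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical exit codes; the patch's actual values are not shown here. */
#define WORKER_EXIT_IDLE     0   /* no table needed vacuuming */
#define WORKER_EXIT_DID_WORK 10  /* at least one table was vacuumed */

/*
 * Fork a stand-in worker that reports through its exit code, then reap it.
 * Returns 1 if the worker reported work, 0 if it was idle, and -1 on error
 * or abnormal termination.
 */
int
run_worker(int pretend_work)
{
    pid_t pid;
    int   status;

    assert(pretend_work == 0 || pretend_work == 1);

    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(pretend_work ? WORKER_EXIT_DID_WORK : WORKER_EXIT_IDLE);

    if (waitpid(pid, &status, 0) != pid)
        return -1;
    if (!WIFEXITED(status))
        return -1;              /* killed by a signal: no usable exit code */

    switch (WEXITSTATUS(status))
    {
        case WORKER_EXIT_IDLE:
            return 0;
        case WORKER_EXIT_DID_WORK:
            return 1;
        default:
            return -1;
    }
}
```

The parent only gets one small integer this way, which is enough for a
"was there work?" flag but not for richer status.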

Comments welcome.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center

Re: Adjust autovacuum naptime automatically

From: Alvaro Herrera

ITAGAKI Takahiro wrote:

> Under an update-heavy workload, the default naptime (60 seconds) is too
> long to keep the number of dead tuples low. With my patch, the naptime
> settles at around 3 seconds for pgbench (scale=10, 80 tps) with the other
> autovacuum parameters left at their defaults.

Interesting.  To be frank I don't know what the sleep scale factor was
supposed to do.

> There are two points I would like to discuss:
>  - Can we use the process exit code to let the autovacuum daemon communicate
>    with the postmaster? I used it to report whether there were any vacuum
>    jobs.

I can only tell you we do this in Mammoth Replicator and it works for
us.  Whether this is a very good idea, I don't know.  I didn't find any
other means to communicate stuff from dying processes to the postmaster.

>  - I removed the autovacuum_naptime GUC variable, because the naptime is now
>    adjusted automatically. Is that appropriate?

I think we should provide the user with a way to stop the naptime from
changing at all.  Eventually we will have the promised "maintenance
windows" feature which will mean the user will not have to worry at all
about the naptime, but in the meantime I think we should keep it.

--
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Re: [HACKERS] Adjust autovacuum naptime automatically

From: "Matthew T. O'Connor"

Alvaro Herrera wrote:
> ITAGAKI Takahiro wrote:
>
>> Under an update-heavy workload, the default naptime (60 seconds) is too
>> long to keep the number of dead tuples low. With my patch, the naptime
>> settles at around 3 seconds for pgbench (scale=10, 80 tps) with the other
>> autovacuum parameters left at their defaults.

What is this based on?  That is, based on what information is it
deciding to reduce the naptime?


> Interesting.  To be frank I don't know what the sleep scale factor was
> supposed to do.

I'm not sure whether the sleep scale factor is a good idea at this
point, but what I was thinking back in the day when I originally wrote
the contrib autovacuum was that I didn't want the system to get bogged
down constantly vacuuming.  So, if it had just spent a long time working
on one database, it would sleep for a long time.
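
The idea of sleeping longer after a long pass can be sketched as below. The
linear formula and the parameter names base_sec and scale are assumptions
for illustration, not a quotation of the contrib source.

```c
#include <assert.h>

/*
 * Sleep time after one pass: the longer the last pass ran, the longer
 * the next sleep.  base_sec and scale are hypothetical parameter names.
 */
long
scaled_sleep_sec(long base_sec, long scale, long elapsed_sec)
{
    assert(base_sec >= 0 && scale >= 0 && elapsed_sec >= 0);
    return base_sec + scale * elapsed_sec;
}
```

With this rule the sleep never drops below base_sec, which is why a factor
of this kind can only lengthen the naptime.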

Given that we can now specify the vacuum cost delay settings for
autovacuum and disable tables and everything else, I'm not sure we need
this anymore, at least not as it was originally designed.  It sounds like
Itagaki is doing things a little differently with his patch, but I'm not
sure I understand it.

>>  - I removed the autovacuum_naptime GUC variable, because the naptime is now
>>    adjusted automatically. Is that appropriate?
>
> I think we should provide the user with a way to stop the naptime from
> changing at all.  Eventually we will have the promised "maintenance
> windows" feature which will mean the user will not have to worry at all
> about the naptime, but in the meantime I think we should keep it.

I'm not sure that's true.  I believe we will want the naptime GUC option
even after we have the maintenance window.  I think we might ignore the
naptime during the maintenance window, but even after we have the
maintenance window, we will still vacuum during the day as required.

My vision of the maintenance window has always been very simple: during
the maintenance window the thresholds get reduced by some factor (probably
a GUC variable). So during the day it might take 10000 updates on a table
to cause a vacuum, but during the maintenance window it might be 10% of
that, 1000.  Is this in line with what others were thinking?
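
The arithmetic of this proposal, as a minimal sketch. The function name
effective_threshold and the window_factor parameter are hypothetical; no
such GUC exists in this thread.

```c
#include <assert.h>

/*
 * Effective vacuum threshold for a table.
 * window_factor is the hypothetical reduction factor (e.g. 0.1), applied
 * only while in_window is nonzero.
 */
long
effective_threshold(long base_threshold, double window_factor, int in_window)
{
    assert(base_threshold >= 0);
    assert(window_factor > 0.0 && window_factor <= 1.0);

    if (!in_window)
        return base_threshold;
    /* Round to nearest so that 10000 * 0.1 yields exactly 1000. */
    return (long) (base_threshold * window_factor + 0.5);
}
```

A call such as effective_threshold(10000, 0.1, 1) gives the 1000-update
figure from the example, while the same call with in_window set to 0 keeps
the daytime value of 10000.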


Autovacuum maintenance window (was Re: Adjust autovacuum naptime automatically)

From: Alvaro Herrera

Matthew T. O'Connor wrote:

> My vision of the maintenance window has always been very simple: during
> the maintenance window the thresholds get reduced by some factor (probably
> a GUC variable). So during the day it might take 10000 updates on a table
> to cause a vacuum, but during the maintenance window it might be 10% of
> that, 1000.  Is this in line with what others were thinking?

My vision is a little more complex than that.  You define groups of
tables, and separately you define time intervals.  For each combination
of group and interval you can configure certain parameters, like a
multiplier for the autovacuum thresholds and factors; and also the
"enable" bit.  So you can disable vacuum for some intervals, and refine
the equation factors for some others.  This is all configured in tables,
not in GUC, so you have more flexibility in choosing stuff for different
groups of tables (say, you really want the small-but-high-update tables
to be still vacuumed even during peak periods, but you don't want that
big fat table to be vacuumed at all during the same period).
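
To make the group-and-interval idea concrete, here is a sketch of what one
configuration row and its lookup might look like. Nothing here is an
existing catalog: the struct VacuumRule, its fields, and find_rule are all
hypothetical, and intervals are simplified to whole hours within one day.

```c
#include <assert.h>
#include <stddef.h>

/* One rule per (table group, time interval) pair.  All names are hypothetical. */
typedef struct VacuumRule
{
    int    group_id;    /* which group of tables */
    int    start_hour;  /* interval start, inclusive, 0-23 */
    int    end_hour;    /* interval end, exclusive, 1-24 */
    int    enabled;     /* 0 = do not vacuum this group in this interval */
    double multiplier;  /* applied to the thresholds and scale factors */
} VacuumRule;

/*
 * Find the rule for a group at a given hour.  NULL means no rule matches
 * and the caller should fall back to the global defaults.
 */
const VacuumRule *
find_rule(const VacuumRule *rules, size_t nrules, int group_id, int hour)
{
    size_t i;

    assert(hour >= 0 && hour < 24);

    for (i = 0; i < nrules; i++)
    {
        if (rules[i].group_id == group_id &&
            hour >= rules[i].start_hour && hour < rules[i].end_hour)
            return &rules[i];
    }
    return NULL;
}
```

A small-but-hot group could carry a rule with enabled = 1 during peak hours,
while the group holding the big table carries enabled = 0 for the same
interval.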

I had intended to work on this during the code sprint, but got
distracted.  I intend to do it for 8.3 instead.

--
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

Re: [HACKERS] Autovacuum maintenance window (was Re: Adjust autovacuum

From: "Matthew T. O'Connor"

Alvaro Herrera wrote:
> My vision is a little more complex than that.  You define groups of
> tables, and separately you define time intervals.  For each combination
> of group and interval you can configure certain parameters, like a
> multiplier for the autovacuum thresholds and factors; and also the
> "enable" bit.  So you can disable vacuum for some intervals, and refine
> the equation factors for some others.  This is all configured in tables,
> not in GUC, so you have more flexibility in choosing stuff for different
> groups of tables (say, you really want the small-but-high-update tables
> to be still vacuumed even during peak periods, but you don't want that
> big fat table to be vacuumed at all during the same period).

That sounds good.  I worry a bit that it's going to get overly complex.
I suppose if we create the concept of a default window that all new
tables will automatically be added to when created, then out of the
box we can create one default 24-hour maintenance window that would
effectively give us the same functionality we have now.

Could we also use these groups for concurrent vacuums?  That is,
autovacuum would loop through each group of tables independently, thus
allowing multiple simultaneous vacuums on different tables and giving us
a solution to the constantly updated table problem.