Re: New GUC autovacuum_max_threshold ? - Mailing list pgsql-hackers

From Nathan Bossart
Subject Re: New GUC autovacuum_max_threshold ?
Date
Msg-id 20240424195747.GA861150@nathanxps13
Whole thread Raw
In response to Re: New GUC autovacuum_max_threshold ?  (Melanie Plageman <melanieplageman@gmail.com>)
Responses Re: New GUC autovacuum_max_threshold ?
Re: New GUC autovacuum_max_threshold ?
List pgsql-hackers
On Wed, Apr 24, 2024 at 03:10:27PM -0400, Melanie Plageman wrote:
> On Wed, Apr 24, 2024 at 8:08 AM Frédéric Yhuel
> <frederic.yhuel@dalibo.com> wrote:
>> I would like to suggest a new parameter, autovacuum_max_threshold, which
>> would set an upper limit on the number of tuples to delete/update/insert
>> prior to vacuum/analyze.
> 
> Hi Frédéric, thanks for the proposal! You are tackling a very tough
> problem. I would also find it useful to know more about what led you
> to suggest this particular solution. I am very interested in user
> stories around difficulties with what tables are autovacuumed and
> when.
> 
> Am I correct in thinking that one of the major goals here is for a
> very large table to be more likely to be vacuumed?

If this is indeed the goal, +1 from me for doing something along these
lines.

>> The idea would be to replace the following calculation :
>>
>> vacthresh = (float4) vac_base_thresh + vac_scale_factor * reltuples;
>>
>> with this one :
>>
>> vacthresh = (float4) vac_base_thresh + vac_scale_factor * reltuples / (1
>> + vac_scale_factor * reltuples / autovacuum_max_threshold)
>>
>> (and the same for the others, vacinsthresh and anlthresh).
> 
> My first thought when reviewing the GUC and how it is used is
> wondering if its description is a bit misleading.

Yeah, I'm having trouble following the proposed mechanics for this new GUC,
and it's difficult to understand how users would choose a value.  If we
just want to cap the number of tuples required before autovacuum takes
action, perhaps we could simplify it to something like

    vacthresh = (float4) vac_base_thresh + vac_scale_factor * reltuples;
    vacthresh = Min(vacthres, vac_max_thresh);

This would effectively cause autovacuum_vacuum_scale_factor to be
overridden for large tables where the scale factor would otherwise cause
the calculated threshold to be extremely high.

>> My guess is that a similar proposal has already been submitted... and
>> rejected 🙂 If so, I'm very sorry for the useless noise.
> 
> I rooted around in the hackers archive and couldn't find any threads
> on this specific proposal. I copied some other hackers I knew of who
> have worked on this problem and thought about it in the past, in case
> they know of some existing threads or prior work on this specific
> topic.

FWIW I have heard about this problem in the past, too.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com



pgsql-hackers by date:

Previous
From: Matthias van de Meent
Date:
Subject: Re: Statistics Import and Export
Next
From: Nathan Bossart
Date:
Subject: Re: Partitioned tables and [un]loggedness