On Thu, Mar 19, 2026, at 11:49 AM, Nathan Bossart wrote:
> On Thu, Mar 19, 2026 at 09:49:34AM -0400, Greg Burd wrote:
>> My concern isn't that wraparound vacuums are inherently alarming, I agree
>> with you that reaching freeze_max_age isn't a crisis. The issue is a
>> scoring-scale problem in the gap between freeze_max_age (200M) and
>> failsafe age (1.6B).
>>
>> In that 1.4B XID window, force_vacuum tables have XID scores of 1.0–8.0
>> (age/freeze_max_age), while typical active tables accumulate dead-tuple
>> scores of 18–70+ within hours of their last vacuum. The exponential boost
>> doesn't activate until failsafe age, so force_vacuum tables are
>> systematically outranked by routine bloat cleanup for what could be days
>> or weeks in production.
>
> I think "systematically outranked" makes the problem sound worse than it
> is. Once the freeze age is reached, the table is going to get added to the
> list no matter what, it just might be sorted lower.
Yeah, that was a bit of hyperbole on my part. :)
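
For reference, here is the arithmetic behind the 1.0 to 8.0 range I quoted, as
a rough sketch. The function name is made up for illustration and is not from
the patch; it assumes the unweighted score is simply age/freeze_max_age, as
described above.

```c
#include <stdint.h>

/*
 * Hypothetical helper, NOT from the patch: the unweighted XID score as
 * described in this thread, i.e. the table's XID age divided by
 * freeze_max_age.
 */
static double
sketch_xid_score(uint32_t xid_age, uint32_t freeze_max_age)
{
	return (double) xid_age / (double) freeze_max_age;
}
```

With freeze_max_age at 200M this gives 1.0 at 200M and 8.0 at the 1.6B
failsafe age, which is why dead-tuple scores of 18 and up sort ahead of it.
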
>>> Having said that, I'd not realised that Nathan capped the new GUCs at
>>> 1.0. I think we should allow those to be set higher, likely at least
>>> to 10.0.
>>
>> That would definitely help. If autovacuum_freeze_score_weight could be
>> set to 8.0–10.0, DBAs could manually restore the priority we want.
>
> Done in the attached.
+1
>>> Maybe we could consider adjusting the code that's setting the
>>> xid_score/mxid_score so that we start scaling the score aggressively
>>> when if (xid_age >= effective_xid_failsafe_age /
>>> Max(autovacuum_freeze_score_weight,1.0)) becomes true
>>
>> This is clever, it would make the aggressive scaling kick in earlier when
>> the weight is higher. At weight=8.0, you'd get exponential boost starting
>> at 200M (failsafe/8) instead of 1.6B.
>
> Seems reasonable. I've added this, too.
+1
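
To double-check my numbers, this is how I read the threshold from the condition
quoted above. It is only a sketch of my understanding, not the code in v13; the
function and parameter names are hypothetical.

```c
/*
 * Hypothetical helper, NOT from the patch: the XID age at which aggressive
 * scaling would begin under the quoted proposal, i.e.
 * failsafe_age / Max(weight, 1.0).
 */
static double
sketch_boost_threshold(double failsafe_age, double weight)
{
	/* Weights below 1.0 never push the threshold past the failsafe age. */
	double		divisor = (weight > 1.0) ? weight : 1.0;

	return failsafe_age / divisor;
}
```

With a failsafe age of 1.6B, a weight of 8.0 gives 200M, and any weight at or
below 1.0 leaves the threshold at 1.6B.
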
> Something else we might want to
> consider is scaling the score once the freeze age is reached, just much
> less aggressively than we do at the failsafe age. It probably doesn't make
> sense to start scaling too much at 200M, but at 1.5B, yeah, we should
> probably process the table sooner than later.
So a scaling factor relative to some point like 200M? Maybe... but for now I
think what you have in v13 is about right and a solid improvement over what's
there now.
> --
> nathan
>
> Attachments:
> * v13-0001-autovacuum-scheduling-improvements.patch
LGTM!
best.
-greg