Re: New GUC autovacuum_max_threshold ? - Mailing list pgsql-hackers

From Robert Treat
Subject Re: New GUC autovacuum_max_threshold ?
Date
Msg-id CABV9wwNxNTEgq0n5DJVDD+mgMbqBJv1-W6R3Dfry4a9C9=z=Uw@mail.gmail.com
In response to Re: New GUC autovacuum_max_threshold ?  (Nathan Bossart <nathandbossart@gmail.com>)
List pgsql-hackers
On Wed, Jan 8, 2025 at 3:01 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
>
> On Wed, Jan 08, 2025 at 02:48:10PM +0100, Frédéric Yhuel wrote:
> > For what it's worth, although I would have preferred the sub-linear growth
> > thing, I'd much rather have this than nothing.
>
> +1, this is how I feel, too.  But I also don't want to add something that
> folks won't find useful.
>
> > And I have to admit that the proposed formulas were either too convoluted or
> > wrong.
> >
> > This very patch is more straightforward. Please let me know if I can help
> > and how.
>
> I read through the thread from the top, and it does seem like there is
> reasonably strong support for the hard cap.  Upon a closer review of the
> patch, I noticed that the relopt was defined such that you couldn't disable
> autovacuum_max_threshold on a per-table basis, so I fixed that in v4.
>

To be frank, this patch feels like a solution in search of a problem,
and as I read back through the thread, it isn't clear what problem
this is intended to fix.

There is some talk of "simplifying" autovacuum configuration, but as
some have noted, we already have a rather complex set of GUCs to deal
with, and adding another one, along with more math, into the equation
doesn't seem simpler to me. I'd like to think the bar is that the
problem being solved should be clear. So what is the problem?

Is the patch supposed to help with wraparound prevention?
autovacuum_freeze_max_age already covers that, and when it doesn't,
vacuum_failsafe_age helps out.

A couple of people mentioned issues around hitting the index wall when
vacuuming large tables, but we believe that problem is mostly resolved
thanks to radix-based TID storage, so this doesn't solve that. (If you
don't think v17 has baked in enough production workloads to be sure,
I'd agree, but that's also an argument against doing more work that
might not be needed.)

Maybe the hope is that this setting will cause vacuum to run more
often to help ameliorate the I/O work from freeze vacuums kicking in,
but I suspect that Melanie's nearby work on eager vacuuming is a
smarter solution to this problem (warning, it also may want to add
more GUCs), so I think we're not solving that, and in fact might be
undercutting it.

I guess that means this is supposed to help with bloat management? But
only on large tables? I guess because you run vacuums more often?
Except that the adage about running vacuums more often doesn't apply as
cleanly to large tables, because those tables typically come with
large indexes, and while we have a lot of machinery in place to help
with repeated scans of the heap, that same machinery doesn't exist for
scanning the indexes, which gives you sort of an exponential curve
in vacuum times as table size (but actually index size) grows
larger. On the upside, this does mean we're less likely to see the 50x
boost in vacuums on large tables that some seemed concerned about, but
on the downside, that's because we're probably going to increase the
probability of vacuum worker starvation.

But getting back to goals, if your goal is to help with bloat
management, trying to tie that to a number that doesn't cleanly map to
the meta information of the table in question is a poor way to do it.
Meaning, to the degree that you are skeptical that 20% of the rows of a
table really corresponds to 20% of the size of the table, it's certainly
going to be a closer map than 100 million rows in some unknown number of
tables, each with an unknown (but presumably greater than 500 million?)
number of rows of unknown sizes. And again, we already have a means to
tackle these bloat cases: lowering autovacuum_vacuum_scale_factor.
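
For reference, the arithmetic behind those numbers can be sketched as
follows. This is a minimal sketch, not the patch itself: it assumes the
stock defaults (autovacuum_vacuum_threshold = 50,
autovacuum_vacuum_scale_factor = 0.2), takes the 100 million figure
discussed in this thread as the cap, and assumes the cap is applied as a
simple min() over the usual threshold. The function names are made up
for illustration.

```python
# Sketch only: models the dead-tuple trigger point for autovacuum.
# Assumption: the proposed cap is a plain min() over the usual formula.

def vacuum_threshold(reltuples, base_threshold=50, scale_factor=0.2,
                     max_threshold=None):
    """Dead tuples needed before autovacuum would vacuum the table."""
    threshold = base_threshold + scale_factor * reltuples
    if max_threshold is not None:
        threshold = min(threshold, max_threshold)
    return threshold


def crossover_rows(base_threshold=50, scale_factor=0.2,
                   max_threshold=100_000_000):
    """Table size (in rows) above which the cap starts to win."""
    return (max_threshold - base_threshold) / scale_factor


if __name__ == "__main__":
    cap = 100_000_000  # figure taken from the thread, not a confirmed default
    for rows in (1_000_000, 500_000_000, 2_000_000_000):
        uncapped = vacuum_threshold(rows)
        capped = vacuum_threshold(rows, max_threshold=cap)
        # Effective fraction of the table that must be dead before a vacuum.
        print(rows, uncapped, capped, capped / rows)
    print("cap takes over at about", crossover_rows(), "rows")
```

Under those assumptions the cap only starts to matter at roughly 500
million rows, and past that point the trigger is a fixed row count that
says nothing about the size of the rows or of the table.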

This isn't to say the system is perfect; I do think there are some
fundamental issues that need addressing, but adding this GUC just
feels a little less baked than usual.

Robert Treat
https://xzilla.net


