Re: New GUC autovacuum_max_threshold ? - Mailing list pgsql-hackers
From: Robert Treat
Subject: Re: New GUC autovacuum_max_threshold ?
Msg-id: CABV9wwNxNTEgq0n5DJVDD+mgMbqBJv1-W6R3Dfry4a9C9=z=Uw@mail.gmail.com
In response to: Re: New GUC autovacuum_max_threshold ? (Nathan Bossart <nathandbossart@gmail.com>)
List: pgsql-hackers
On Wed, Jan 8, 2025 at 3:01 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
> On Wed, Jan 08, 2025 at 02:48:10PM +0100, Frédéric Yhuel wrote:
> > For what it's worth, although I would have preferred the sub-linear growth
> > thing, I'd much rather have this than nothing.
>
> +1, this is how I feel, too. But I also don't want to add something that
> folks won't find useful.
>
> > And I have to admit that the proposed formulas were either too convoluted or
> > wrong.
> >
> > This very patch is more straightforward. Please let me know if I can help
> > and how.
>
> I read through the thread from the top, and it does seem like there is
> reasonably strong support for the hard cap. Upon a closer review of the
> patch, I noticed that the relopt was defined such that you couldn't disable
> autovacuum_max_threshold on a per-table basis, so I fixed that in v4.

To be frank, this patch feels like a solution in search of a problem, and as I read back through the thread, it isn't clear what problem it is intended to fix. There is some talk of "simplifying" autovacuum configuration, but as some noted, we already have a rather complex set of GUCs to deal with, and adding another one, along with more math, into the equation doesn't seem simpler to me. I'd like to think the bar should be that the problem is clearly stated.

So what is the problem? Is the patch supposed to help with wraparound prevention? autovacuum_freeze_max_age already covers that, and when it doesn't, vacuum_failsafe_age helps out. A couple of people mentioned issues around hitting the index wall when vacuuming large tables, but we believe that problem is mostly resolved thanks to radix-based TID storage, so this doesn't solve that.
(To the degree you don't think v17 has baked into enough production workloads to be sure, I'd agree, but that's also an argument against doing more work that might not be needed.)

Maybe the hope is that this setting will cause vacuum to run more often to help ameliorate the I/O work from freeze vacuums kicking in, but I suspect that Melanie's nearby work on eager vacuuming is a smarter solution to this problem (warning: it also may want to add more GUCs), so I think we're not solving that, and in fact might be undercutting it.

I guess that means this is supposed to help with bloat management? But only on large tables? I guess because you run vacuums more often? Except that the adage of running vacuums more often doesn't apply as cleanly to large tables, because those tables typically come with large indexes, and while we have a lot of machinery in place to help with repeated scans of the heap, that same machinery doesn't exist for scanning the indexes, which gives you a sort of exponential curve in vacuum times as table size (but actually index size) grows larger. On the upside, this does mean we're less likely to see the 50x boost in vacuums on large tables that some seemed concerned about, but on the downside that's because we're probably going to increase the probability of vacuum worker starvation.

But getting back to goals: if your goal is to help with bloat management, trying to tie that to a number that doesn't cleanly map to the meta-information of the table in question is a poor way to do it. Meaning, to the degree that you are skeptical that vacuuming based on 20% of the rows of a table might not really be 20% of the size of the table, it's certainly going to be a closer map than 100 million rows across n tables of unknown (but presumably greater than 500 million?) row counts and unknown sizes. And again, we already have a means to tackle these bloat cases: lowering autovacuum_vacuum_scale_factor.
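To make the threshold arithmetic concrete, here is a minimal sketch of autovacuum's standard trigger formula (base threshold plus scale factor times reltuples) with a hard cap layered on top. The cap name and its min() semantics are assumptions based on the patch under discussion, not committed behavior, and the 100 million figure is just the number floated in this thread:

```python
def vacuum_threshold(reltuples,
                     base_threshold=50,           # autovacuum_vacuum_threshold default
                     scale_factor=0.2,            # autovacuum_vacuum_scale_factor default
                     max_threshold=100_000_000):  # hypothetical autovacuum_max_threshold
    """Dead tuples needed before autovacuum triggers on a table.

    Standard formula: base_threshold + scale_factor * reltuples.
    The proposed GUC (assumed semantics) simply caps that result;
    a value <= 0 stands in for "cap disabled".
    """
    uncapped = base_threshold + scale_factor * reltuples
    return min(uncapped, max_threshold) if max_threshold > 0 else uncapped

# 1M-row table: cap is irrelevant, threshold stays at ~200K dead tuples.
# 5B-row table: 20% would be 1B dead tuples; the cap cuts that to 100M,
# i.e. roughly 10x more frequent vacuums on that one table.
for n in (1_000_000, 500_000_000, 5_000_000_000):
    print(n, vacuum_threshold(n))
```

The point of the sketch is that the cap only changes behavior on tables whose uncapped threshold already exceeds it, which is why the debate centers on very large tables.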
This isn't to say the system is perfect; I do think there are some fundamental issues that need addressing, but adding this GUC just feels a little less baked than usual.

Robert Treat
https://xzilla.net