Re: New GUC autovacuum_max_threshold ? - Mailing list pgsql-hackers
From: Michael Banck
Subject: Re: New GUC autovacuum_max_threshold ?
Msg-id: 662b6941.5d0a0220.ea153.4178@mx.google.com
In response to: Re: New GUC autovacuum_max_threshold ? (Laurenz Albe <laurenz.albe@cybertec.at>)
List: pgsql-hackers
Hi,

On Fri, Apr 26, 2024 at 10:18:00AM +0200, Laurenz Albe wrote:
> On Fri, 2024-04-26 at 09:35 +0200, Frédéric Yhuel wrote:
> > Le 26/04/2024 à 04:24, Laurenz Albe a écrit :
> > > On Thu, 2024-04-25 at 14:33 -0400, Robert Haas wrote:
> > > > I believe that the underlying problem here can be summarized in this
> > > > way: just because I'm OK with 2MB of bloat in my 10MB table doesn't
> > > > mean that I'm OK with 2TB of bloat in my 10TB table. One reason for
> > > > this is simply that I can afford to waste 2MB much more easily than I
> > > > can afford to waste 2TB -- and that applies both on disk and in
> > > > memory.
> > >
> > > I don't find that convincing. Why are 2TB of wasted space in a 10TB
> > > table worse than 2TB of wasted space in 100 tables of 100GB each?
> >
> > Good point, but another way of summarizing the problem would be that the
> > autovacuum_*_scale_factor parameters work well as long as we have a more
> > or less evenly distributed access pattern in the table.
> >
> > Suppose my very large table gets updated only for its 1% most recent
> > rows. We probably want to decrease autovacuum_analyze_scale_factor and
> > autovacuum_vacuum_scale_factor for this one.
> >
> > Partitioning would be a good solution, but IMHO postgres should be able
> > to handle this case anyway, ideally without per-table configuration.
>
> I agree that you may well want autovacuum and autoanalyze to treat your
> large table differently from your small tables.
>
> But I am reluctant to accept even more autovacuum GUCs. It's not like
> we don't have enough of them, rather the opposite. You can slap on more
> GUCs to treat more special cases, but we will never reach the goal of
> having a default that will make everybody happy.
>
> I believe that the defaults should work well in moderately sized databases
> with moderate usage characteristics. If you have large tables or a high
> number of transactions per second, you can be expected to make the effort
> and adjust the settings for your case. Adding more GUCs makes life *harder*
> for the users who are trying to understand and configure how autovacuum works.

Well, I disagree to some degree. I agree that the defaults should work well
in moderately sized databases with moderate usage characteristics. But I
also think we can do better than telling DBAs that they have to manually
fine-tune autovacuum for large tables (frequently implementing by hand what
this patch proposes, namely setting autovacuum_vacuum_scale_factor to 0 and
autovacuum_vacuum_threshold to a high number), as this is cumbersome and
needs adult supervision that is not always available.

Of course, it would be great if we could just slap some AI into the
autovacuum launcher that figures things out automagically, but I don't
think we are there yet.

So this proposal (probably along with a higher default threshold than
500000, but IMO less than what Robert and Nathan suggested) sounds like a
step forward to me. DBAs can set the threshold lower if they want, or maybe
we can just turn it off by default if we cannot agree on a sane default,
but I think this (using the simplified formula from Nathan) is a good
approach that takes some pain away from autovacuum tuning and reserves that
for the really difficult cases.

Michael
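For reference, the manual per-table workaround mentioned above typically
looks something like the following (the table name is hypothetical, and
500000 is simply the threshold value discussed in this thread, not a
recommendation):

    -- big_history_table is a hypothetical table name
    ALTER TABLE big_history_table SET (
        autovacuum_vacuum_scale_factor = 0,
        autovacuum_vacuum_threshold = 500000
    );

And my reading of the "simplified formula from Nathan" is that the proposed
GUC would just cap the existing trigger condition, roughly:

    vacuum threshold = least(autovacuum_vacuum_threshold
                             + autovacuum_vacuum_scale_factor * reltuples,
                             autovacuum_max_threshold)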