Re: Should vacuum process config file reload more often - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: Should vacuum process config file reload more often |
Date | |
Msg-id | CA+TgmoYDYHht9zsHye3tywx7oWGMP4s_6AGefHkDsMuGHLzT7w@mail.gmail.com |
In response to | Re: Should vacuum process config file reload more often (Melanie Plageman <melanieplageman@gmail.com>) |
Responses | Re: Should vacuum process config file reload more often |
List | pgsql-hackers |
On Wed, Apr 5, 2023 at 11:29 AM Melanie Plageman <melanieplageman@gmail.com> wrote:
> Thanks all for the reviews.
>
> v16 attached. I put it together rather quickly, so there might be a few
> spurious whitespaces or similar. There is one rather annoying pgindent
> outlier that I have to figure out what to do about as well.
>
> The remaining functional TODOs that I know of are:
>
> - Resolve what to do about names of GUC and vacuum variables for cost
>   limit and cost delay (since it may affect extensions)
>
> - Figure out what to do about the logging message which accesses dboid
>   and tableoid (lock/no lock, where to put it, etc)
>
> - I see several places in docs which reference the balancing algorithm
>   for autovac workers. I did not read them in great detail, but we may
>   want to review them to see if any require updates.
>
> - Consider whether or not the initial two commits should just be
>   squashed with the third commit
>
> - Anything else reviewers are still unhappy with

I really like having the first couple of patches split out -- it makes
them super-easy to understand. A committer can always choose to squash
at commit time if they want. I kind of wish the patch set were split up
more, for even easier understanding. I don't think that's a thing to
get hung up on, but it's an opinion that I have.

I strongly agree with the goals of the patch set, as I understand them.
Being able to change the config file and SIGHUP the server and have the
new values affect running autovacuum workers seems pretty huge. It
would make it possible to solve problems that currently can only be
solved by using gdb on a production instance, which is not a fun thing
to be doing.

+	/*
+	 * Balance and update limit values for autovacuum workers. We must
+	 * always do this in case the autovacuum launcher or another
+	 * autovacuum worker has recalculated the number of workers across
+	 * which we must balance the limit.
+	 * This is done by the launcher when launching a new worker and by
+	 * workers before vacuuming each table.
+	 */

I don't quite understand what's going on here. A big reason that I'm
worried about this whole issue in the first place is that sometimes
there's a vacuum going on a giant table and you can't get it to go
fast. You want it to absorb new settings, and to do so quickly. I
realize that this is about the number of workers, not the actual cost
limit, so that makes what I'm about to say less important. But ... is
this often enough? Like, the time before we move on to the next table
could be super long. The time before a new worker is launched should be
~autovacuum_naptime/autovacuum_max_workers, or ~20s with default
settings, so that's not horrible, but I'm kind of struggling to
understand the rationale for this particular choice. Maybe it's fine.

To be honest, I think that the whole system where we divide the cost
limit across the workers is the wrong idea. Does anyone actually like
that behavior? This patch probably shouldn't touch that, just in the
interest of getting something done that is an improvement over where we
are now, but I think this behavior is really counterintuitive. People
expect that they can increase autovacuum_max_workers to get more
vacuuming done, and actually in most cases that does not work. And if
that behavior didn't exist, this patch would also be a whole lot
simpler. Again, I don't think this is something we should try to
address right now under time pressure, but in the future, I think we
should consider ripping this behavior out.
+	if (autovacuum_vac_cost_limit > 0)
+		VacuumCostLimit = autovacuum_vac_cost_limit;
+	else
+		VacuumCostLimit = vacuum_cost_limit;
+
+	/* Only balance limit if no cost-related storage parameters specified */
+	if (pg_atomic_unlocked_test_flag(&MyWorkerInfo->wi_dobalance))
+		return;
+	Assert(VacuumCostLimit > 0);
+
+	nworkers_for_balance = pg_atomic_read_u32(
+						&AutoVacuumShmem->av_nworkersForBalance);
+
+	/* There is at least 1 autovac worker (this worker). */
+	if (nworkers_for_balance <= 0)
+		elog(ERROR, "nworkers_for_balance must be > 0");
+
+	VacuumCostLimit = Max(VacuumCostLimit / nworkers_for_balance, 1);

I think it would be better stylistically to use a temporary variable
here and only assign the final value to VacuumCostLimit.

Daniel: Are you intending to commit this?

--
Robert Haas
EDB: http://www.enterprisedb.com