Re: Should vacuum process config file reload more often - Mailing list pgsql-hackers
From | Melanie Plageman |
---|---|
Subject | Re: Should vacuum process config file reload more often |
Date | |
Msg-id | CAAKRu_ZaVV32vQD2zp00-4Zr0ncusckJspDPzAUGGYPFVuv4Og@mail.gmail.com Whole thread Raw |
In response to | Re: Should vacuum process config file reload more often (Melanie Plageman <melanieplageman@gmail.com>) |
List | pgsql-hackers |
On Fri, Apr 7, 2023 at 9:07 AM Melanie Plageman <melanieplageman@gmail.com> wrote: > > On Fri, Apr 7, 2023 at 2:53 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > On Fri, Apr 7, 2023 at 8:08 AM Daniel Gustafsson <daniel@yesql.se> wrote: > > > > > > > On 7 Apr 2023, at 00:12, Melanie Plageman <melanieplageman@gmail.com> wrote: > > > > > > > > On Thu, Apr 6, 2023 at 5:45 PM Daniel Gustafsson <daniel@yesql.se> wrote: > > > >> > > > >>> On 6 Apr 2023, at 23:06, Melanie Plageman <melanieplageman@gmail.com> wrote: > > > >> > > > >>> Autovacuum workers, at the end of VacuumUpdateCosts(), check if cost > > > >>> limit or cost delay have been changed. If they have, they assert that > > > >>> they don't already hold the AutovacuumLock, take it in shared mode, and > > > >>> do the logging. > > > >> > > > >> Another idea would be to copy the values to local temp variables while holding > > > >> the lock, and release the lock before calling elog() to avoid holding the lock > > > >> over potential IO. > > > > > > > > Good idea. I've done this in attached v19. > > > > Also I looked through the docs and everything still looks correct for > > > > balancing algo. > > > > > > I had another read-through and test-through of this version, and have applied > > > it with some minor changes to comments and whitespace. Thanks for the quick > > > turnaround times on reviews in this thread! > > > > Cool! > > > > Regarding the commit 7d71d3dd08, I have one comment: > > > > + /* Only log updates to cost-related variables */ > > + if (vacuum_cost_delay == original_cost_delay && > > + vacuum_cost_limit == original_cost_limit) > > + return; > > > > IIUC by default, we log not only before starting the vacuum but also > > when changing cost-related variables. Which is good, I think, because > > logging the initial values would also be helpful for investigation. > > However, I think that we don't log the initial vacuum cost values > > depending on the values. For example, if the > > autovacuum_vacuum_cost_delay storage option is set to 0, we don't log > > the initial values. I think that instead of comparing old and new > > values, we can write the log only if > > message_level_is_interesting(DEBUG2) is true. That way, we don't need > > to acquire the lwlock unnecessarily. And the code looks cleaner to me. > > I've attached the patch (use_message_level_is_interesting.patch) > > Thanks for coming up with the case you thought of with storage param for > cost delay = 0. In that case we wouldn't print the message initially and > we should fix that. > > I disagree, however, that we should condition it only on > message_level_is_interesting(). > > Actually, outside of printing initial values when the autovacuum worker > first starts (before vacuuming all tables), I don't think we should log > these values except when they are being updated. Autovacuum workers > could vacuum tons of small tables and having this print out at least > once per table (which I know is how it is on master) would be > distracting. Also, you could be reloading the config to update some > other GUCs and be oblivious to an ongoing autovacuum and get these > messages printed out, which I would also find distracting. > > You will have to stare very hard at the logs to tell if your changes to > vacuum cost delay and limit took effect when you reload config. I think > with our changes to update the values more often, we should take the > opportunity to make this logging more useful by making it happen only > when the values are changed. > > I would be open to elevating the log level to DEBUG1 for logging only > updates and, perhaps, having an option if you set log level to DEBUG2, > for example, to always log these values in VacuumUpdateCosts(). > > I'd even argue that, potentially, having the cost-delay related > parameters printed at the beginning of vacuuming could be interesting to > regular VACUUM as well (even though it doesn't benefit from config > reload while in progress). > > To fix the issue you mentioned and ensure the logging is printed when > autovacuum workers start up before vacuuming tables, we could either > initialize vacuum_cost_delay and vacuum_cost_limit to something invalid > that will always be different than what they are set to in > VacuumUpdateCosts() (not sure if this poses a problem for VACUUM using > these values since they are set to the defaults for VACUUM). Or, we > could duplicate this logging message in do_autovacuum(). > > Finally, one other point about message_level_is_interesting(). I liked > the idea of using it a lot, since log level DEBUG2 will not be the > common case. I thought of it but hesitated because all other users of > message_level_is_interesting() are avoiding some memory allocation or > string copying -- not avoiding take a lock. Making this conditioned on > log level made me a bit uncomfortable. I can't think of a situation when > it would be a problem, but it felt a bit off. > > > Also, while testing the autovacuum delay with relopt > > autovacuum_vacuum_cost_delay = 0, I realized that even if we set > > autovacuum_vacuum_cost_delay = 0 to a table, wi_dobalance is set to > > true. wi_dobalance comes from the following expression: > > > > /* > > * If any of the cost delay parameters has been set individually for > > * this table, disable the balancing algorithm. > > */ > > tab->at_dobalance = > > !(avopts && (avopts->vacuum_cost_limit > 0 || > > avopts->vacuum_cost_delay > 0)); > > > > The initial values of both avopts->vacuum_cost_limit and > > avopts->vacuum_cost_delay are -1. I think we should use ">= 0" instead > > of "> 0". Otherwise, we include the autovacuum worker working on a > > table whose autovacuum_vacuum_cost_delay is 0 to the balancing > > algorithm. Probably this behavior has existed also on back branches > > but I haven't checked it yet. > > Thank you for catching this. Indeed this exists in master since > 1021bd6a89b which was backpatched. I checked and it is true all the way > back through REL_11_STABLE. > > Definitely seems worth fixing as it kind of defeats the purpose of the > original commit. I wish I had noticed before! > > Your fix has: > !(avopts && (avopts->vacuum_cost_limit >= 0 || > avopts->vacuum_cost_delay >= 0)); > > And though delay is required to be >= 0 > avopts->vacuum_cost_delay >= 0 > > Limit does not. It can just be > 0. > > postgres=# create table foo (a int) with (autovacuum_vacuum_cost_limit = 0); > ERROR: value 0 out of bounds for option "autovacuum_vacuum_cost_limit" > DETAIL: Valid values are between "1" and "10000". > > Though >= is also fine, the rest of the code in all versions always > checks if limit > 0 and delay >= 0 since 0 is a valid value for delay > and not for limit. Probably best we keep it consistent (though the whole > thing is quite confusing). I have created an open item for each of these issues on the wiki (one for 16 and one under the section "affects stable branches"). - Melanie
pgsql-hackers by date: