Thread: allow changing autovacuum_max_workers without restarting
I frequently hear about scenarios where users with thousands upon thousands of tables realize that autovacuum is struggling to keep up. When they inevitably go to bump up autovacuum_max_workers, they discover that it requires a server restart (i.e., downtime) to take effect, causing further frustration. For this reason, I think $SUBJECT is a desirable improvement. I spent some time looking for past discussions about this, and I was surprised to not find any, so I thought I'd give it a try.

The attached proof-of-concept patch demonstrates what I have in mind. Instead of trying to dynamically change the global process table, etc., I'm proposing that we introduce a new GUC that sets the effective maximum number of autovacuum workers that can be started at any time. This means there would be two GUCs for the number of autovacuum workers: one for the number of slots reserved for autovacuum workers, and another that restricts the number of those slots that can be used. The former would continue to require a restart to change its value, and users would typically want to set it relatively high. The latter could be changed at any time and would allow for raising or lowering the maximum number of active autovacuum workers, up to the limit set by the other parameter.

The proof-of-concept patch keeps autovacuum_max_workers as the maximum number of slots to reserve for workers, but I think we should instead rename this parameter to something else and then reintroduce autovacuum_max_workers as the new parameter that can be adjusted without restarting. That way, autovacuum_max_workers continues to work much the same way as in previous versions.

There are a couple of weird cases with this approach. One is when the restart-only limit is set lower than the PGC_SIGHUP limit. In that case, I think we should just use the restart-only limit. The other is when there are already N active autovacuum workers and the PGC_SIGHUP parameter is changed to something less than N. For that case, I think we should just block starting additional workers until the number of workers drops below the new parameter's value. I don't think we should kill existing workers, or anything else like that.

TBH I've been sitting on this idea for a while now, only because I think it has a slim chance of acceptance, but IMHO this is a simple change that could help many users.

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
Attachment
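To make the shape of the proposal concrete, the relationship between the two GUCs amounts to something like the following rough sketch (the slots-GUC name, the variable names, and the helper are illustrative placeholders, not code from the attached patch):

    /* restart-only: how many worker slots are reserved in shared memory */
    int		autovacuum_worker_slots;	/* PGC_POSTMASTER */

    /* reloadable: how many of those slots may be in use at once */
    int		autovacuum_max_workers;		/* PGC_SIGHUP */

    static int
    av_effective_max_workers(void)
    {
    	/* if the reloadable cap exceeds the reserved slots, fall back to the slots */
    	return Min(autovacuum_max_workers, autovacuum_worker_slots);
    }

Under this scheme the launcher would consult the effective value before starting a worker; lowering the reloadable cap below the number of running workers simply stops new launches until attrition brings the count back under the cap, as described above.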
> I frequently hear about scenarios where users with thousands upon thousands
> of tables realize that autovacuum is struggling to keep up. When they
> inevitably go to bump up autovacuum_max_workers, they discover that it
> requires a server restart (i.e., downtime) to take effect, causing further
> frustration. For this reason, I think $SUBJECT is a desirable improvement.
> I spent some time looking for past discussions about this, and I was
> surprised to not find any, so I thought I'd give it a try.

I did not review the patch in detail yet, but +1 to the idea. It's not just thousands of tables that suffer from this. If a user has a few large tables hogging the autovac workers, then other tables don't get the autovac cycles they require. Users are then forced to run manual vacuums, which adds complexity to their operations.

> The attached proof-of-concept patch demonstrates what I have in mind.
> Instead of trying to dynamically change the global process table, etc., I'm
> proposing that we introduce a new GUC that sets the effective maximum
> number of autovacuum workers that can be started at any time.

max_worker_processes defines a pool of max # of background workers allowed. parallel workers and extensions that spin up background workers all utilize from this pool.

Should autovacuum_max_workers be able to utilize from max_worker_processes also?

This will allow autovacuum_max_workers to be dynamic while the user only has to deal with an already existing GUC. We may want to increase the default value for max_worker_processes as part of this.

Regards,

Sami
Amazon Web Services (AWS)
On Thu, Apr 11, 2024 at 02:24:18PM +0000, Imseih (AWS), Sami wrote:
> max_worker_processes defines a pool of max # of background workers allowed.
> parallel workers and extensions that spin up background workers all utilize from
> this pool.
>
> Should autovacuum_max_workers be able to utilize from max_worker_processes also?
>
> This will allow autovacuum_max_workers to be dynamic while the user only has
> to deal with an already existing GUC. We may want to increase the default value
> for max_worker_processes as part of this.

My concern with this approach is that other background workers could use up all the slots and prevent autovacuum workers from starting, unless of course we reserve autovacuum_max_workers slots for _only_ autovacuum workers. I'm not sure if we want to get these parameters tangled up like this, though...

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
On Thu, Apr 11, 2024 at 09:42:40AM -0500, Nathan Bossart wrote:
> On Thu, Apr 11, 2024 at 02:24:18PM +0000, Imseih (AWS), Sami wrote:
>> max_worker_processes defines a pool of max # of background workers allowed.
>> parallel workers and extensions that spin up background workers all utilize from
>> this pool.
>>
>> Should autovacuum_max_workers be able to utilize from max_worker_processes also?
>>
>> This will allow autovacuum_max_workers to be dynamic while the user only has
>> to deal with an already existing GUC. We may want to increase the default value
>> for max_worker_processes as part of this.
>
> My concern with this approach is that other background workers could use up
> all the slots and prevent autovacuum workers from starting, unless of
> course we reserve autovacuum_max_workers slots for _only_ autovacuum
> workers. I'm not sure if we want to get these parameters tangled up like
> this, though...

I see that the logical replication launcher process uses this pool, but we take special care to make sure it gets a slot:

	/*
	 * Register the apply launcher.  It's probably a good idea to call this
	 * before any modules had a chance to take the background worker slots.
	 */
	ApplyLauncherRegister();

I'm not sure there's another way to effectively reserve slots that would work for the autovacuum workers (which need to restart to connect to different databases), so that would need to be invented. We'd probably also want to fail startup if autovacuum_max_workers > max_worker_processes, which seems like it has the potential to cause problems when folks first upgrade to v18. Furthermore, we might have to convert autovacuum workers to background worker processes for this to work. I've admittedly wondered about whether we should do that eventually, anyway, but it'd expand the scope of this work quite a bit.

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
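For context, ApplyLauncherRegister() is essentially a thin wrapper around RegisterBackgroundWorker() called before shared_preload_libraries modules get a chance to register theirs; a simplified sketch from memory (not the exact launcher.c code; field values are illustrative) looks roughly like this:

    	BackgroundWorker bgw;

    	memset(&bgw, 0, sizeof(bgw));
    	bgw.bgw_flags = BGWORKER_SHMEM_ACCESS |
    		BGWORKER_BACKEND_DATABASE_CONNECTION;
    	bgw.bgw_start_time = BgWorkerStart_RecoveryFinished;
    	snprintf(bgw.bgw_library_name, sizeof(bgw.bgw_library_name), "postgres");
    	snprintf(bgw.bgw_function_name, sizeof(bgw.bgw_function_name),
    			 "ApplyLauncherMain");
    	snprintf(bgw.bgw_name, sizeof(bgw.bgw_name),
    			 "logical replication launcher");
    	bgw.bgw_restart_time = 5;

    	RegisterBackgroundWorker(&bgw);

IIRC such static registrations are checked against max_worker_processes at startup, which is why the slot is effectively guaranteed; autovacuum workers, which must exit and restart to switch databases, have no equivalent today, which is the gap the message above points out.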
> My concern with this approach is that other background workers could use up
> all the slots and prevent autovacuum workers from starting

That's a good point, the current settings do not guarantee that you get a worker for the purpose if none are available, i.e. max_parallel_workers_per_gather, you may have 2 workers planned and 0 launched.

> unless of
> course we reserve autovacuum_max_workers slots for _only_ autovacuum
> workers. I'm not sure if we want to get these parameters tangled up like
> this, though...

This will be confusing to describe and we will be reserving autovac workers implicitly, rather than explicitly with a new GUC.

Regards,

Sami
On Thu, Apr 11, 2024 at 03:37:23PM +0000, Imseih (AWS), Sami wrote:
>> My concern with this approach is that other background workers could use up
>> all the slots and prevent autovacuum workers from starting
>
> That's a good point, the current settings do not guarantee that you
> get a worker for the purpose if none are available,
> i.e. max_parallel_workers_per_gather, you may have 2 workers planned
> and 0 launched.
>
>> unless of
>> course we reserve autovacuum_max_workers slots for _only_ autovacuum
>> workers. I'm not sure if we want to get these parameters tangled up like
>> this, though...
>
> This will be confusing to describe and we will be reserving autovac workers
> implicitly, rather than explicitly with a new GUC.

Yeah, that's probably a good reason to give autovacuum its own worker pool.

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
I spent some time reviewing/testing the POC. It is relatively simple with a lot of obvious value.

I tested with 16 tables that constantly reach the autovac threshold, and the patch did the right thing. I observed concurrent autovacuum workers matching the setting as I was adjusting it dynamically. As you mention above, if there are more autovacs in progress and a new lower setting is applied, we should not take any special action on those autovacuums, and eventually the max number of autovacuum workers will match the setting. I also tested by allowing user connections to reach max_connections, and observed the expected number of autovacuums spinning up and adjusting correctly.

Having autovacuum tests like the above (unless I missed something) would be a good general improvement, but it should not be tied to this.

A few comments on the POC patch:

1/ We should emit a log when autovacuum_workers is set higher than the max.
2/ should the name of the restart limit be "reserved_autovacuum_workers"?

Regards,

Sami Imseih
AWS (Amazon Web Services)
On Fri, Apr 12, 2024 at 05:27:40PM +0000, Imseih (AWS), Sami wrote:
> A few comments on the POC patch:

Thanks for reviewing.

> 1/ We should emit a log when autovacuum_workers is set higher than the max.

Hm. Maybe the autovacuum launcher could do that.

> 2/ should the name of the restart limit be "reserved_autovacuum_workers"?

That's kind-of what I had in mind, although I think we might want to avoid the word "reserved" because it sounds a bit like reserved_connections and superuser_reserved_connections. "autovacuum_max_slots" or "autovacuum_max_worker_slots" might be worth considering, too.

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
>> 1/ We should emit a log when autovacuum_workers is set higher than the max.
>> Hm. Maybe the autovacuum launcher could do that.

Would it be better to use a GUC check_hook that compares the new value with the max allowed values and emits a WARNING ?

autovacuum_max_workers already has a check_autovacuum_max_workers check_hook, which can be repurposed for this.

In the POC patch, this check_hook is kept as-is, which will no longer make sense.

>> 2/ should the name of the restart limit be "reserved_autovacuum_workers"?
>> That's kind-of what I had in mind, although I think we might want to avoid
>> the word "reserved" because it sounds a bit like reserved_connections
>> and superuser_reserved_connections

Yes, I agree. This can be confusing.

>> "autovacuum_max_slots" or
>> "autovacuum_max_worker_slots" might be worth considering, too.

"autovacuum_max_worker_slots" is probably the best option because we should have "worker" in the name of the GUC.

Regards,

Sami
On Fri, Apr 12, 2024 at 10:17:44PM +0000, Imseih (AWS), Sami wrote:
>>> Hm. Maybe the autovacuum launcher could do that.
>
> Would it be better to use a GUC check_hook that compares the
> new value with the max allowed values and emits a WARNING ?
>
> autovacuum_max_workers already has a check_autovacuum_max_workers
> check_hook, which can be repurposed for this.
>
> In the POC patch, this check_hook is kept as-is, which will no longer make sense.

IIRC using GUC hooks to handle dependencies like this is generally frowned upon because it tends to not work very well [0]. We could probably get it to work for this particular case, but IMHO we should still try to avoid this approach.

I didn't find any similar warnings for other GUCs like max_parallel_workers_per_gather, so it might not be crucial to emit a WARNING here.

[0] https://postgr.es/m/27574.1581015893%40sss.pgh.pa.us

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
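For anyone not familiar with the hook in question, a GUC check hook for an integer parameter has roughly the following shape, and the pitfall is that other GUCs may not yet have their final values when it runs (a hypothetical sketch of the repurposed hook, not code from the POC patch):

    bool
    check_autovacuum_max_workers(int *newval, void **extra, GucSource source)
    {
    	/*
    	 * Comparing against another GUC here (e.g. the restart-only slots
    	 * setting) is unreliable: during startup the other GUC may not have
    	 * reached its final value yet, so a WARNING emitted from this hook
    	 * could fire spuriously or be missed entirely.
    	 */
    	return true;
    }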
> IIRC using GUC hooks to handle dependencies like this is generally frowned
> upon because it tends to not work very well [0]. We could probably get it
> to work for this particular case, but IMHO we should still try to avoid
> this approach.

Thanks for pointing this out. I agree; this could lead to spurious log messages being emitted.

> so it might not be crucial to emit a
> WARNING here.

As mentioned earlier in the thread, we can let the autovacuum launcher emit the log, but it will need to be careful not to flood the logs when this condition exists (i.e., log only the first time the condition is detected, or log every once in a while). The additional complexity is not worth it.

Regards,

Sami
Here is a first attempt at a proper patch set based on the discussion thus far. I've split it up into several small patches for ease of review, which is probably a bit excessive. If this ever makes it to commit, they could likely be combined. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Attachment
On Wed, Apr 10, 2024 at 04:23:44PM -0500, Nathan Bossart wrote:
> The attached proof-of-concept patch demonstrates what I have in mind.
> Instead of trying to dynamically change the global process table, etc., I'm
> proposing that we introduce a new GUC that sets the effective maximum
> number of autovacuum workers that can be started at any time. This means
> there would be two GUCs for the number of autovacuum workers: one for the
> number of slots reserved for autovacuum workers, and another that restricts
> the number of those slots that can be used. The former would continue to
> require a restart to change its value, and users would typically want to
> set it relatively high. The latter could be changed at any time and would
> allow for raising or lowering the maximum number of active autovacuum
> workers, up to the limit set by the other parameter.
>
> The proof-of-concept patch keeps autovacuum_max_workers as the maximum
> number of slots to reserve for workers, but I think we should instead
> rename this parameter to something else and then reintroduce
> autovacuum_max_workers as the new parameter that can be adjusted without
> restarting. That way, autovacuum_max_workers continues to work much the
> same way as in previous versions.

When I thought about this, I considered proposing to add a new GUC for "autovacuum_policy_workers".

autovacuum_max_workers would be the same as before, requiring a restart to change. The policy GUC would be the soft limit, changeable at runtime up to the hard limit of autovacuum_max_workers (or maybe any policy value exceeding autovacuum_max_workers would be ignored).

We'd probably change autovacuum_max_workers to default to a higher value (8, or 32 as in your patch), and have autovacuum_policy_workers default to 3, for consistency with historic behavior. Maybe autovacuum_policy_workers=-1 would mean to use all workers.

There's the existing idea to change autovacuum thresholds during the busy period of the day vs. off hours. This would allow something similar with nworkers rather than thresholds: if the goal were to reduce the resource use of vacuum, the admin could set max_workers=8, with policy_workers=2 during the busy period.

--
Justin
On Mon, Apr 15, 2024 at 08:33:33AM -0500, Justin Pryzby wrote:
> On Wed, Apr 10, 2024 at 04:23:44PM -0500, Nathan Bossart wrote:
>> The proof-of-concept patch keeps autovacuum_max_workers as the maximum
>> number of slots to reserve for workers, but I think we should instead
>> rename this parameter to something else and then reintroduce
>> autovacuum_max_workers as the new parameter that can be adjusted without
>> restarting. That way, autovacuum_max_workers continues to work much the
>> same way as in previous versions.
>
> When I thought about this, I considered proposing to add a new GUC for
> "autovacuum_policy_workers".
>
> autovacuum_max_workers would be the same as before, requiring a restart
> to change. The policy GUC would be the soft limit, changeable at runtime
> up to the hard limit of autovacuum_max_workers (or maybe any policy
> value exceeding autovacuum_max_workers would be ignored).
>
> We'd probably change autovacuum_max_workers to default to a higher value
> (8, or 32 as in your patch), and have autovacuum_policy_workers default to
> 3, for consistency with historic behavior. Maybe
> autovacuum_policy_workers=-1 would mean to use all workers.

This sounds like roughly the same idea, although it is backwards from what I'm proposing in the v1 patch set. My thinking is that by making a new restart-only GUC that would by default be set higher than the vast majority of systems should ever need, we could simplify migrating to these parameters. The autovacuum_max_workers parameter would effectively retain its original meaning, and existing settings would continue to work normally on v18, but users could now adjust it without restarting. If we did it the other way, users would need to bump up autovacuum_max_workers and restart prior to being able to raise autovacuum_policy_workers beyond what they previously had set for autovacuum_max_workers. That being said, I'm open to doing it this way if folks prefer this approach, as I think it is still an improvement.

> There's the existing idea to change autovacuum thresholds during the
> busy period of the day vs. off hours. This would allow something
> similar with nworkers rather than thresholds: if the goal were to reduce
> the resource use of vacuum, the admin could set max_workers=8, with
> policy_workers=2 during the busy period.

Precisely.

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
On Mon, Apr 15, 2024 at 11:28:33AM -0500, Nathan Bossart wrote:
> On Mon, Apr 15, 2024 at 08:33:33AM -0500, Justin Pryzby wrote:
>> On Wed, Apr 10, 2024 at 04:23:44PM -0500, Nathan Bossart wrote:
>>> The proof-of-concept patch keeps autovacuum_max_workers as the maximum
>>> number of slots to reserve for workers, but I think we should instead
>>> rename this parameter to something else and then reintroduce
>>> autovacuum_max_workers as the new parameter that can be adjusted without
>>> restarting. That way, autovacuum_max_workers continues to work much the
>>> same way as in previous versions.
>>
>> When I thought about this, I considered proposing to add a new GUC for
>> "autovacuum_policy_workers".
>>
>> autovacuum_max_workers would be the same as before, requiring a restart
>> to change. The policy GUC would be the soft limit, changeable at runtime
>> up to the hard limit of autovacuum_max_workers (or maybe any policy
>> value exceeding autovacuum_max_workers would be ignored).
>>
>> We'd probably change autovacuum_max_workers to default to a higher value
>> (8, or 32 as in your patch), and have autovacuum_policy_workers default to
>> 3, for consistency with historic behavior. Maybe
>> autovacuum_policy_workers=-1 would mean to use all workers.
>
> This sounds like roughly the same idea, although it is backwards from what
> I'm proposing in the v1 patch set. My thinking is that by making a new
> restart-only GUC that would by default be set higher than the vast majority
> of systems should ever need, we could simplify migrating to these
> parameters. The autovacuum_max_workers parameter would effectively retain
> its original meaning, and existing settings would continue to work
> normally on v18, but users could now adjust it without restarting. If we
> did it the other way, users would need to bump up autovacuum_max_workers
> and restart prior to being able to raise autovacuum_policy_workers beyond
> what they previously had set for autovacuum_max_workers. That being said,
> I'm open to doing it this way if folks prefer this approach, as I think it
> is still an improvement.

Another option could be to just remove the restart-only GUC and hard-code the upper limit of autovacuum_max_workers to 64 or 128 or something. While that would simplify matters, I suspect it would be hard to choose an appropriate limit that won't quickly become outdated.

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
> Another option could be to just remove the restart-only GUC and hard-code
> the upper limit of autovacuum_max_workers to 64 or 128 or something. While
> that would simplify matters, I suspect it would be hard to choose an
> appropriate limit that won't quickly become outdated.

Hardcoded values are usually hard to deal with because they are hidden either in code or in docs.

> When I thought about this, I considered proposing to add a new GUC for
> "autovacuum_policy_workers".
> autovacuum_max_workers would be the same as before, requiring a restart
> to change. The policy GUC would be the soft limit, changeable at runtime

I think autovacuum_max_workers should still be the GUC that controls the number of concurrent autovacuums. This parameter is already well established and changing the meaning now will be confusing.

I suspect most users will be glad it's now dynamic, but will probably be annoyed if it's no longer doing what it's supposed to.

Regards,

Sami
Agree, +1. From a DBA perspective, I would prefer that this parameter can be modified dynamically rather than adding a new parameter. What is more difficult is how to smoothly reach the target value when the setting is considered too large and needs to be lowered.
Regards
On Tue, 16 Apr 2024 at 01:41, Imseih (AWS), Sami <simseih@amazon.com> wrote:
>> Another option could be to just remove the restart-only GUC and hard-code
>> the upper limit of autovacuum_max_workers to 64 or 128 or something. While
>> that would simplify matters, I suspect it would be hard to choose an
>> appropriate limit that won't quickly become outdated.
>
> Hardcoded values are usually hard to deal with because they are hidden either
> in code or in docs.
>
>> When I thought about this, I considered proposing to add a new GUC for
>> "autovacuum_policy_workers".
>> autovacuum_max_workers would be the same as before, requiring a restart
>> to change. The policy GUC would be the soft limit, changeable at runtime
>
> I think autovacuum_max_workers should still be the GUC that controls
> the number of concurrent autovacuums. This parameter is already well
> established and changing the meaning now will be confusing.
>
> I suspect most users will be glad it's now dynamic, but will probably
> be annoyed if it's no longer doing what it's supposed to.
>
> Regards,
> Sami
> Here is a first attempt at a proper patch set based on the discussion thus
> far. I've split it up into several small patches for ease of review, which
> is probably a bit excessive. If this ever makes it to commit, they could
> likely be combined.

I looked at the patch set. With the help of DEBUG2 output, I tested to ensure that the autovacuum_cost_limit balance adjusts correctly when the autovacuum_max_workers value increases/decreases. I did not think the patch would break this behavior, but it's important to verify this.

Some comments on the patch:

1. A nit. There should be a tab here.

-	dlist_head av_freeWorkers;
+	dclist_head av_freeWorkers;

2. autovacuum_max_worker_slots documentation:

+       <para>
+        Note that the value of <xref linkend="guc-autovacuum-max-workers"/> is
+        silently capped to this value.
+       </para>

This comment looks redundant in the docs, since the entry for autovacuum_max_workers that follows mentions the same.

3. The docs for autovacuum_max_workers should mention that when the value changes, consider adjusting the autovacuum_cost_limit/cost_delay. This is not something new. Even in the current state, users should think about these settings. However, it seems even more important if this value is to be dynamically adjusted.

Regards,

Sami Imseih
AWS (Amazon Web Services)
On Thu, Apr 18, 2024 at 05:05:03AM +0000, Imseih (AWS), Sami wrote:
> I looked at the patch set. With the help of DEBUG2 output, I tested to ensure
> that the autovacuum_cost_limit balance adjusts correctly when the
> autovacuum_max_workers value increases/decreases. I did not think the
> patch would break this behavior, but it's important to verify this.

Great.

> 1. A nit. There should be a tab here.
>
> -	dlist_head av_freeWorkers;
> +	dclist_head av_freeWorkers;

I dare not argue with pgindent.

> 2. autovacuum_max_worker_slots documentation:
>
> +       <para>
> +        Note that the value of <xref linkend="guc-autovacuum-max-workers"/> is
> +        silently capped to this value.
> +       </para>
>
> This comment looks redundant in the docs, since the entry
> for autovacuum_max_workers that follows mentions the
> same.

Removed in v2. I also noticed that I forgot to update the part about when autovacuum_max_workers can be changed. *facepalm*

> 3. The docs for autovacuum_max_workers should mention that when
> the value changes, consider adjusting the autovacuum_cost_limit/cost_delay.
>
> This is not something new. Even in the current state, users should think about
> these settings. However, it seems even more important if this value is to be
> dynamically adjusted.

I don't necessarily disagree that it might be worth mentioning these parameters, but I would argue that this should be proposed in a separate thread.

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
Attachment
On Fri, Apr 19, 2024 at 11:43 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
> Removed in v2. I also noticed that I forgot to update the part about when
> autovacuum_max_workers can be changed. *facepalm*

I think this could help a bunch of users, but I'd still like to complain, not so much with the desire to kill this patch as with the desire to broaden the conversation.

Part of the underlying problem here is that, AFAIK, neither PostgreSQL as a piece of software nor we as human beings who operate PostgreSQL databases have much understanding of how autovacuum_max_workers should be set. It's relatively easy to hose yourself by raising autovacuum_max_workers to try to make things go faster, but produce the exact opposite effect due to how the cost balancing stuff works. But, even if you have the correct use case for autovacuum_max_workers, something like a few large tables that take a long time to vacuum plus a bunch of smaller ones that can't get starved just because the big tables are in the midst of being processed, you might well ask yourself why it's your job to figure out the correct number of workers.

Now, before this patch, there is a fairly good reason for that, which is that we need to reserve shared memory resources for each autovacuum worker that might potentially run, and the system can't know how much shared memory you'd like to reserve for that purpose. But if that were the only problem, then this patch would probably just be proposing to crank up the default value of that parameter rather than introducing a second one. I bet Nathan isn't proposing that because his intuition is that it will work out badly, and I think he's right. I bet that cranking up the number of allowed workers will often result in running more workers than we really should. One possible negative consequence is that we'll end up with multiple processes fighting over the disk in a situation where they should just take turns. I suspect there are also ways that we can be harmed - in broadly similar fashion - by cost balancing.

So I feel like what this proposal reveals is that we know that our algorithm for ramping up the number of running workers doesn't really work. And maybe that's just a consequence of the general problem that we have no global information about how much vacuuming work there is to be done at any given time, and therefore we cannot take any kind of sensible guess about whether 1 more worker will help or hurt. Or, maybe there's some way to do better than what we do today without a big rewrite. I'm not sure. I don't think this patch should be burdened with solving the general problem here. But I do think the general problem is worth some discussion.

--
Robert Haas
EDB: http://www.enterprisedb.com
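To make the cost-balancing hazard concrete: the vacuum cost budget is shared among the active autovacuum workers, so to a first approximation each worker's effective limit shrinks as more workers run. A deliberately simplified sketch (the variable names are hypothetical, and the real balancing in autovacuum.c also accounts for per-table cost settings):

    /*
     * Simplified: the cost budget (autovacuum_vacuum_cost_limit, or
     * vacuum_cost_limit when the former is -1) is divided among the workers
     * that are currently running, so adding workers adds no I/O budget.
     */
    int		per_worker_cost_limit = total_cost_limit / n_active_workers;

    /* e.g., with the default budget of 200: 3 workers get ~66 each, while
     * 10 workers get only 20 each, so each table is vacuumed more slowly. */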
On Fri, Apr 19, 2024 at 02:42:13PM -0400, Robert Haas wrote:
> I think this could help a bunch of users, but I'd still like to
> complain, not so much with the desire to kill this patch as with the
> desire to broaden the conversation.

I think I subconsciously hoped this would spark a bigger discussion...

> Now, before this patch, there is a fairly good reason for that, which
> is that we need to reserve shared memory resources for each autovacuum
> worker that might potentially run, and the system can't know how much
> shared memory you'd like to reserve for that purpose. But if that were
> the only problem, then this patch would probably just be proposing to
> crank up the default value of that parameter rather than introducing a
> second one. I bet Nathan isn't proposing that because his intuition is
> that it will work out badly, and I think he's right. I bet that
> cranking up the number of allowed workers will often result in running
> more workers than we really should. One possible negative consequence
> is that we'll end up with multiple processes fighting over the disk in
> a situation where they should just take turns. I suspect there are
> also ways that we can be harmed - in broadly similar fashion - by cost
> balancing.

Even if we were content to bump up the default value of autovacuum_max_workers and tell folks to just mess with the cost settings, there are still probably many cases where bumping up the number of workers further would be necessary. If you have a zillion tables, turning cost-based vacuuming off completely may be insufficient to keep up, at which point your options become limited. It can be difficult to tell whether you might end up in this situation over time as your workload evolves. In any case, it's not clear to me that bumping up the default value of autovacuum_max_workers would do more good than harm. I get the idea that the default of 3 is sufficient for a lot of clusters, so there'd really be little upside to changing it AFAICT. (I guess this proves your point about my intuition.)

> So I feel like what this proposal reveals is that we know that our
> algorithm for ramping up the number of running workers doesn't really
> work. And maybe that's just a consequence of the general problem that
> we have no global information about how much vacuuming work there is
> to be done at any given time, and therefore we cannot take any kind of
> sensible guess about whether 1 more worker will help or hurt. Or,
> maybe there's some way to do better than what we do today without a
> big rewrite. I'm not sure. I don't think this patch should be burdened
> with solving the general problem here. But I do think the general
> problem is worth some discussion.

I certainly don't want to hold up $SUBJECT for a larger rewrite of autovacuum scheduling, but I also don't want to shy away from a larger rewrite if it's an idea whose time has come. I'm looking forward to hearing your ideas in your pgconf.dev talk.

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
> Part of the underlying problem here is that, AFAIK, neither PostgreSQL
> as a piece of software nor we as human beings who operate PostgreSQL
> databases have much understanding of how autovacuum_max_workers should
> be set. It's relatively easy to hose yourself by raising
> autovacuum_max_workers to try to make things go faster, but produce
> the exact opposite effect due to how the cost balancing stuff works.

Yeah, this patch will not fix this problem. Anyone who raises av_max_workers should think about adjusting the av_cost_delay. This was discussed up the thread [4], and even without this patch, I think it's necessary to add more documentation on the relationship between workers and cost.

> So I feel like what this proposal reveals is that we know that our
> algorithm for ramping up the number of running workers doesn't really
> work. And maybe that's just a consequence of the general problem that
> we have no global information about how much vacuuming work there is
> to be done at any given time, and therefore we cannot take any kind of
> sensible guess about whether 1 more worker will help or hurt. Or,
> maybe there's some way to do better than what we do today without a
> big rewrite. I'm not sure. I don't think this patch should be burdened
> with solving the general problem here. But I do think the general
> problem is worth some discussion.

This patch is only solving the operational problem of adjusting autovacuum_max_workers, and it does so without introducing complexity. A proposal that relieves users of the burden of having to think about autovacuum_max_workers, cost_delay, and cost_limit settings would be great. This patch may be the basis for such dynamic "auto-tuning" of autovacuum workers.

Regards,

Sami

[4] https://www.postgresql.org/message-id/20240419154322.GA3988554%40nathanxps13
On Fri, Apr 19, 2024 at 4:29 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
> I certainly don't want to hold up $SUBJECT for a larger rewrite of
> autovacuum scheduling, but I also don't want to shy away from a larger
> rewrite if it's an idea whose time has come. I'm looking forward to
> hearing your ideas in your pgconf.dev talk.

Yeah, I suppose I was hoping you were going to tell me all the answers and thus make the talk a lot easier to write, but I guess life isn't that simple. :-)

--
Robert Haas
EDB: http://www.enterprisedb.com
On Mon, Apr 15, 2024 at 05:41:04PM +0000, Imseih (AWS), Sami wrote:
>> Another option could be to just remove the restart-only GUC and hard-code
>> the upper limit of autovacuum_max_workers to 64 or 128 or something. While
>> that would simplify matters, I suspect it would be hard to choose an
>> appropriate limit that won't quickly become outdated.
>
> Hardcoded values are usually hard to deal with because they are hidden either
> in code or in docs.

That's true, but using a hard-coded limit means we no longer need to add a new GUC. Always allocating, say, 256 slots might require a few additional kilobytes of shared memory, most of which will go unused, but that seems unlikely to be a problem for the systems that will run Postgres v18.

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
> That's true, but using a hard-coded limit means we no longer need to add a
> new GUC. Always allocating, say, 256 slots might require a few additional
> kilobytes of shared memory, most of which will go unused, but that seems
> unlikely to be a problem for the systems that will run Postgres v18.

I agree with this.

Regards,

Sami
On Fri, May 03, 2024 at 12:57:18PM +0000, Imseih (AWS), Sami wrote:
>> That's true, but using a hard-coded limit means we no longer need to add a
>> new GUC. Always allocating, say, 256 slots might require a few additional
>> kilobytes of shared memory, most of which will go unused, but that seems
>> unlikely to be a problem for the systems that will run Postgres v18.
>
> I agree with this.

Here's what this might look like. I chose an upper limit of 1024, which seems like it "ought to be enough for anybody," at least for now.

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
Attachment
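For a sense of what "always allocating" the slots means, the worker array in autovacuum shared memory would presumably be sized by the hard-coded constant rather than by the GUC, along the lines of this sketch (modeled on the existing AutoVacuumShmemSize(); the constant name follows the one discussed below, and the actual patch may differ):

    Size
    AutoVacuumShmemSize(void)
    {
    	Size		size;

    	size = sizeof(AutoVacuumShmemStruct);
    	size = MAXALIGN(size);
    	/* one WorkerInfoData per reserved slot, whether it is ever used or not */
    	size = add_size(size, mul_size(AUTOVAC_MAX_WORKER_SLOTS,
    								   sizeof(WorkerInfoData)));
    	return size;
    }

Each WorkerInfoData is small, which is why this piece only costs a few kilobytes; the larger concern raised later in the thread is that the same reservation also feeds into PGPROC, the lock table, and semaphore counts.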
>>> That's true, but using a hard-coded limit means we no longer need to add a
>>> new GUC. Always allocating, say, 256 slots might require a few additional
>>> kilobytes of shared memory, most of which will go unused, but that seems
>>> unlikely to be a problem for the systems that will run Postgres v18.
>>
>> I agree with this.

> Here's what this might look like. I chose an upper limit of 1024, which
> seems like it "ought to be enough for anybody," at least for now.

I thought 256 was a good enough limit. In practice, I doubt anyone will benefit from more than a few dozen autovacuum workers. I think 1024 is way too high to even allow.

Besides that, the overall patch looks good to me, but I have some comments on the documentation.

I don't think combining 1024 + 5 = 1029 is a good idea in docs. Breaking down the allotment and using the name of the constant is much more clear.

I suggest "max_connections + max_wal_senders + max_worker_processes + AUTOVAC_MAX_WORKER_SLOTS + 5", and in other places in the docs, we should mention the actual value of AUTOVAC_MAX_WORKER_SLOTS. Maybe in the below section?

Instead of:

-     (<xref linkend="guc-autovacuum-max-workers"/>) and allowed background
+     (1024) and allowed background

do something like:

-     (<xref linkend="guc-autovacuum-max-workers"/>) and allowed background
+     AUTOVAC_MAX_WORKER_SLOTS (1024) and allowed background

Also, replace the 1024 here with AUTOVAC_MAX_WORKER_SLOTS.

+    <varname>max_wal_senders</varname>,
+    plus <varname>max_worker_processes</varname>, plus 1024 for autovacuum
+    worker processes, plus one extra for each 16

Also, not sure if I am mistaken here, but the "+ 5" in the existing docs seems wrong. If it refers to NUM_AUXILIARY_PROCS defined in include/storage/proc.h, it should be a "6":

#define NUM_AUXILIARY_PROCS		6

This is not a consequence of this patch, and can be dealt with in a separate thread if my understanding is correct.

Regards,

Sami
On Thu, May 16, 2024 at 04:37:10PM +0000, Imseih (AWS), Sami wrote:
> I thought 256 was a good enough limit. In practice, I doubt anyone will
> benefit from more than a few dozen autovacuum workers.
> I think 1024 is way too high to even allow.

WFM

> I don't think combining 1024 + 5 = 1029 is a good idea in docs.
> Breaking down the allotment and using the name of the constant
> is much more clear.
>
> I suggest
> "max_connections + max_wal_senders + max_worker_processes + AUTOVAC_MAX_WORKER_SLOTS + 5"
>
> and in other places in the docs, we should mention the actual
> value of AUTOVAC_MAX_WORKER_SLOTS. Maybe in the
> below section?
>
> Instead of:
> -     (<xref linkend="guc-autovacuum-max-workers"/>) and allowed background
> +     (1024) and allowed background
>
> do something like:
> -     (<xref linkend="guc-autovacuum-max-workers"/>) and allowed background
> +     AUTOVAC_MAX_WORKER_SLOTS (1024) and allowed background
>
> Also, replace the 1024 here with AUTOVAC_MAX_WORKER_SLOTS.
>
> +    <varname>max_wal_senders</varname>,
> +    plus <varname>max_worker_processes</varname>, plus 1024 for autovacuum
> +    worker processes, plus one extra for each 16

Part of me wonders whether documenting the exact formula is worthwhile. This portion of the docs is rather complicated, and I can't recall ever having to do the arithmetic it describes. Plus, see below...

> Also, not sure if I am mistaken here, but the "+ 5" in the existing docs
> seems wrong.
>
> If it refers to NUM_AUXILIARY_PROCS defined in
> include/storage/proc.h, it should be a "6":
>
> #define NUM_AUXILIARY_PROCS		6
>
> This is not a consequence of this patch, and can be dealt with
> in a separate thread if my understanding is correct.

Ha, I think it should actually be "+ 7"! The value is calculated as

	MaxConnections + autovacuum_max_workers + 1 + max_worker_processes + max_wal_senders + 6

Looking at the history, this documentation tends to be wrong quite often. In v9.2, the checkpointer was introduced, and these formulas were not updated. In v9.3, background worker processes were introduced, and the formulas were still not updated. Finally, in v9.6, it was fixed in commit 597f7e3. Then, in v14, the archiver process was made an auxiliary process (commit d75288f), making the formulas out-of-date again. And in v17, the WAL summarizer was added.

On top of this, IIUC you actually need even more semaphores if your system doesn't support atomics, and from a quick skim this doesn't seem to be covered in this documentation.

--
nathan
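To put rough numbers on that formula: assuming the stock defaults (max_connections = 100, autovacuum_max_workers = 3, max_worker_processes = 8, max_wal_senders = 10), it works out to 100 + 3 + 1 + 8 + 10 + 6 = 128 semaphores, taking the "+ 7" accounting above at face value.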
On Thu, May 16, 2024 at 09:16:46PM -0500, Nathan Bossart wrote:
> On Thu, May 16, 2024 at 04:37:10PM +0000, Imseih (AWS), Sami wrote:
>> I thought 256 was a good enough limit. In practice, I doubt anyone will
>> benefit from more than a few dozen autovacuum workers.
>> I think 1024 is way too high to even allow.
>
> WFM

Here is an updated patch that uses 256 as the upper limit.

>> I don't think combining 1024 + 5 = 1029 is a good idea in docs.
>> Breaking down the allotment and using the name of the constant
>> is much more clear.

I plan to further improve this section of the documentation in v18, so I've left the constant unexplained for now.

--
nathan
Attachment
Hi,

On 2024-06-03 13:52:29 -0500, Nathan Bossart wrote:
> Here is an updated patch that uses 256 as the upper limit.

I don't have time to read through the entire thread right now - it'd be good for the commit message of a patch like this to include justification for why it's ok to make such a change. Even before actually committing it, so reviewers have an easier time catching up.

Why do we think that increasing the number of PGPROC slots, heavyweight locks etc by 256 isn't going to cause issues? That's not an insubstantial amount of memory to dedicate to something that will practically never be used.

ISTM that at the very least we ought to exclude the reserved slots from the computation of things like the number of locks resulting from max_locks_per_transaction. It's very common to increase max_locks_per_transaction substantially, adding ~250 to the multiplier can be a good amount of memory. And AV workers should never need a meaningful number.

Increasing e.g. the size of the heavyweight lock table has consequences besides the increase in memory usage, the size increase can make it less likely for the table to fit largely into L3, thus decreasing performance.

Greetings,

Andres Freund
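To spell out the lock-table point: the shared lock table is sized from MaxBackends, which includes the autovacuum worker slots, so the reservation multiplies through max_locks_per_transaction. A rough sketch, assuming the hard-coded slot count stands in for autovacuum_max_workers (this is not the exact lock.c/proc.c arithmetic):

    /* MaxBackends includes the reserved autovacuum slots */
    max_backends = MaxConnections + reserved_autovacuum_slots + 1 +
                   max_worker_processes + max_wal_senders;

    lock_table_entries = max_locks_per_transaction *
                         (max_backends + max_prepared_transactions);

    /* 256 reserved slots therefore add 256 * max_locks_per_transaction
     * entries, e.g. 256 * 64 = 16384 with the default setting. */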
On Mon, Jun 03, 2024 at 12:08:52PM -0700, Andres Freund wrote:
> I don't have time to read through the entire thread right now - it'd be good
> for the commit message of a patch like this to include justification for why
> it's ok to make such a change. Even before actually committing it, so
> reviewers have an easier time catching up.

Sorry about that. I think the main question (besides "should we do this?") is whether we ought to make the upper limit configurable. My initial idea was to split autovacuum_max_workers into two GUCs: one for the upper limit that can only be changed at server start, and another for the effective limit that can be changed up to the upper limit without restarting the server. If we can just set a sufficiently high upper limit and avoid the extra GUC without causing problems, that might be preferable, but I sense that you are about to tell me that it will indeed cause problems. :)

> Why do we think that increasing the number of PGPROC slots, heavyweight locks
> etc by 256 isn't going to cause issues? That's not an insubstantial amount of
> memory to dedicate to something that will practically never be used.

I personally have not observed problems with these kinds of bumps in resource usage, although I may be biased towards larger systems where it doesn't matter as much.

> ISTM that at the very least we ought to exclude the reserved slots from the
> computation of things like the number of locks resulting from
> max_locks_per_transaction. It's very common to increase
> max_locks_per_transaction substantially, adding ~250 to the multiplier can be
> a good amount of memory. And AV workers should never need a meaningful number.

This is an interesting idea.

> Increasing e.g. the size of the heavyweight lock table has consequences
> besides the increase in memory usage, the size increase can make it less
> likely for the table to fit largely into L3, thus decreasing performance.

IMHO this might be a good argument for making the upper limit configurable and setting it relatively low by default. That's not quite as nice from a user experience perspective, but weird, hard-to-diagnose performance issues are certainly not nice, either.

--
nathan
Hi,

On 2024-06-03 14:28:13 -0500, Nathan Bossart wrote:
> On Mon, Jun 03, 2024 at 12:08:52PM -0700, Andres Freund wrote:
>> Why do we think that increasing the number of PGPROC slots, heavyweight locks
>> etc by 256 isn't going to cause issues? That's not an insubstantial amount of
>> memory to dedicate to something that will practically never be used.
>
> I personally have not observed problems with these kinds of bumps in
> resource usage, although I may be biased towards larger systems where it
> doesn't matter as much.

IME it matters *more* on larger systems. Or at least used to, I haven't experimented with this in quite a while.

It's possible that we improved a bunch of things sufficiently for this to not matter anymore.

Greetings,

Andres Freund
On Mon, Jun 03, 2024 at 04:24:27PM -0700, Andres Freund wrote:
> On 2024-06-03 14:28:13 -0500, Nathan Bossart wrote:
>> On Mon, Jun 03, 2024 at 12:08:52PM -0700, Andres Freund wrote:
>>> Why do we think that increasing the number of PGPROC slots, heavyweight locks
>>> etc by 256 isn't going to cause issues? That's not an insubstantial amount of
>>> memory to dedicate to something that will practically never be used.
>>
>> I personally have not observed problems with these kinds of bumps in
>> resource usage, although I may be biased towards larger systems where it
>> doesn't matter as much.
>
> IME it matters *more* on larger systems. Or at least used to, I haven't
> experimented with this in quite a while.
>
> It's possible that we improved a bunch of things sufficiently for this to not
> matter anymore.

I'm curious if there is something specific you would look into to verify this. IIUC one concern is the lock table not fitting into L3. Is there anything else? Any particular workloads you have in mind?

--
nathan
Hi,

On 2024-06-18 14:00:00 -0500, Nathan Bossart wrote:
> On Mon, Jun 03, 2024 at 04:24:27PM -0700, Andres Freund wrote:
>> On 2024-06-03 14:28:13 -0500, Nathan Bossart wrote:
>>> On Mon, Jun 03, 2024 at 12:08:52PM -0700, Andres Freund wrote:
>>>> Why do we think that increasing the number of PGPROC slots, heavyweight locks
>>>> etc by 256 isn't going to cause issues? That's not an insubstantial amount of
>>>> memory to dedicate to something that will practically never be used.
>>>
>>> I personally have not observed problems with these kinds of bumps in
>>> resource usage, although I may be biased towards larger systems where it
>>> doesn't matter as much.
>>
>> IME it matters *more* on larger systems. Or at least used to, I haven't
>> experimented with this in quite a while.
>>
>> It's possible that we improved a bunch of things sufficiently for this to not
>> matter anymore.
>
> I'm curious if there is something specific you would look into to verify
> this. IIUC one concern is the lock table not fitting into L3. Is there
> anything else? Any particular workloads you have in mind?

That was the main thing I was thinking of.

But I think I just thought of one more: It's going to *substantially* increase the resource usage for tap tests. Right now Cluster.pm has

	# conservative settings to ensure we can run multiple postmasters:
	print $conf "shared_buffers = 1MB\n";
	print $conf "max_connections = 10\n";

for nodes that allow streaming. Adding 256 extra backend slots increases the shared memory usage from ~5MB to ~18MB.

I just don't see much point in reserving 256 worker "possibilities", tbh. I can't think of any practical system where it makes sense to use this much (nor do I think it's going to be reasonable in the next 10 years) and it's just going to waste memory and startup time for everyone. Nor does it make sense to me to have the max autovac workers be independent of max_connections.

Greetings,

Andres Freund
On Tue, Jun 18, 2024 at 01:43:34PM -0700, Andres Freund wrote:
> I just don't see much point in reserving 256 worker "possibilities", tbh. I
> can't think of any practical system where it makes sense to use this much (nor
> do I think it's going to be reasonable in the next 10 years) and it's just
> going to waste memory and startup time for everyone.

Given this, here are some options I see for moving this forward:

* lower the cap to, say, 64 or 32
* exclude autovacuum worker slots from computing number of locks, etc.
* make the cap configurable and default it to something low (e.g., 8)

My intent with a reserved set of 256 slots was to prevent users from needing to deal with two GUCs. For all practical purposes, it would be possible to change autovacuum_max_workers whenever you want. But if the extra resource requirements are too much of a tax, I'm content to change course.

--
nathan
Hi,

On 2024-06-18 16:09:09 -0500, Nathan Bossart wrote:
> On Tue, Jun 18, 2024 at 01:43:34PM -0700, Andres Freund wrote:
>> I just don't see much point in reserving 256 worker "possibilities", tbh. I
>> can't think of any practical system where it makes sense to use this much (nor
>> do I think it's going to be reasonable in the next 10 years) and it's just
>> going to waste memory and startup time for everyone.
>
> Given this, here are some options I see for moving this forward:
>
> * lower the cap to, say, 64 or 32
> * exclude autovacuum worker slots from computing number of locks, etc.

That seems good regardless

> * make the cap configurable and default it to something low (e.g., 8)

Another one:

Have a general cap of 64, but additionally limit it to something like

	max(1, min(WORKER_CAP, max_connections / 4))

so that cases like tap tests don't end up allocating vastly more worker slots than actual connection slots.

> My intent with a reserved set of 256 slots was to prevent users from
> needing to deal with two GUCs. For all practical purposes, it would be
> possible to change autovacuum_max_workers whenever you want. But if the
> extra resource requirements are too much of a tax, I'm content to change
> course.

Approximately tripling shared memory usage for tap test instances does seem too much to me.

Greetings,

Andres Freund
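In PostgreSQL's own macros, the suggested cap amounts to something like the following sketch (the constant and variable names here are hypothetical):

    /* hypothetical sizing of the reserved autovacuum worker slots */
    #define AV_WORKER_SLOT_CAP	64

    int		reserved_av_slots = Max(1, Min(AV_WORKER_SLOT_CAP,
    									   MaxConnections / 4));

With this, a TAP-test instance running at max_connections = 10 would reserve only a couple of slots, while an instance with max_connections of 400 or more would still get the full 64.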
On Tue, Jun 18, 2024 at 02:33:31PM -0700, Andres Freund wrote:
> Another one:
>
> Have a general cap of 64, but additionally limit it to something like
>
>	max(1, min(WORKER_CAP, max_connections / 4))
>
> so that cases like tap tests don't end up allocating vastly more worker slots
> than actual connection slots.

That's a clever idea. My only concern would be that we are tethering two parameters that aren't super closely related, but I'm unsure whether it would cause any problems in practice.

--
nathan
On Tue, Jun 18, 2024 at 07:43:36PM -0500, Nathan Bossart wrote:
> On Tue, Jun 18, 2024 at 02:33:31PM -0700, Andres Freund wrote:
>> Another one:
>>
>> Have a general cap of 64, but additionally limit it to something like
>>
>>	max(1, min(WORKER_CAP, max_connections / 4))
>>
>> so that cases like tap tests don't end up allocating vastly more worker slots
>> than actual connection slots.
>
> That's a clever idea. My only concern would be that we are tethering two
> parameters that aren't super closely related, but I'm unsure whether it
> would cause any problems in practice.

Here is an attempt at doing this. I've added 0001 [0] and 0002 [1] as prerequisite patches, which help simplify 0003 a bit. It probably doesn't work correctly for EXEC_BACKEND builds yet.

I'm still not sure about this approach. At the moment, I'm leaning towards something more like v2 [2] where the upper limit is a PGC_POSTMASTER GUC (that we would set very low for TAP tests).

[0] https://commitfest.postgresql.org/48/4998/
[1] https://commitfest.postgresql.org/48/5059/
[2] https://postgr.es/m/20240419154322.GA3988554%40nathanxps13

--
nathan
Attachment
On Fri, Jun 21, 2024 at 03:44:07PM -0500, Nathan Bossart wrote:
> I'm still not sure about this approach. At the moment, I'm leaning towards
> something more like v2 [2] where the upper limit is a PGC_POSTMASTER GUC
> (that we would set very low for TAP tests).

Like so.

--
nathan
Attachment
Here is a rebased patch. One thing that still bugs me is that there is no feedback sent to the user when autovacuum_max_workers is set higher than autovacuum_worker_slots. I think we should at least emit a WARNING, perhaps from the autovacuum launcher, i.e., once when the launcher starts and then again as needed via HandleAutoVacLauncherInterrupts(). Or we could fail to start in PostmasterMain() and then ignore later misconfigurations via a GUC check hook. I'm not too thrilled about adding more GUC check hooks that depend on the value of other GUCs, but I do like the idea of failing instead of silently proceeding with a different value than the user configured. Any thoughts? -- nathan
Attachment
On Mon, Jul 08, 2024 at 02:29:16PM -0500, Nathan Bossart wrote:
> One thing that still bugs me is that there is no feedback sent to the user
> when autovacuum_max_workers is set higher than autovacuum_worker_slots. I
> think we should at least emit a WARNING, perhaps from the autovacuum
> launcher, i.e., once when the launcher starts and then again as needed via
> HandleAutoVacLauncherInterrupts(). Or we could fail to start in
> PostmasterMain() and then ignore later misconfigurations via a GUC check
> hook. I'm not too thrilled about adding more GUC check hooks that depend
> on the value of other GUCs, but I do like the idea of failing instead of
> silently proceeding with a different value than the user configured. Any
> thoughts?

From recent discussions, it sounds like there isn't much appetite for GUC check hooks that depend on the values of other GUCs. Here is a new version of the patch that adds the WARNING described above.

--
nathan
Attachment
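For illustration, the WARNING in the launcher could be as simple as something like this (a hypothetical sketch, not the patch's exact wording or placement, and with the caveat from earlier in the thread that it should be emitted once at launcher start and on reload rather than on every loop iteration):

    	if (autovacuum_max_workers > autovacuum_worker_slots)
    		ereport(WARNING,
    				(errmsg("\"autovacuum_max_workers\" (%d) exceeds \"autovacuum_worker_slots\" (%d)",
    						autovacuum_max_workers, autovacuum_worker_slots),
    				 errhint("Only %d autovacuum workers can run concurrently; raise \"autovacuum_worker_slots\" (requires a restart) or lower \"autovacuum_max_workers\".",
    						 autovacuum_worker_slots)));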
rebased -- nathan
Attachment
If there are no remaining concerns, I'd like to move forward with committing v9 in September's commitfest. -- nathan
The following review has been posted through the commitfest application:

make installcheck-world:  tested, failed
Implements feature:       tested, failed
Spec compliant:           tested, failed
Documentation:            not tested

Hi,

- Tested patch with check-world.
- Verified CheckAutovacuumWorkerGUCs functionality and the correct WARNING was reported.
- For feature-specific testing, I created multiple tables and generated bloat. Expected behavior was witnessed. A lower autovacuum_worker_slots = 16 setting is better suited to start with.

Thanks
Yogesh