Re: allow changing autovacuum_max_workers without restarting - Mailing list pgsql-hackers
From: Robert Haas
Subject: Re: allow changing autovacuum_max_workers without restarting
Msg-id: CA+TgmoY03ZU_BYPUDq6JCbdH0w0okB8_-tnjgvk8_6oBbf9mgw@mail.gmail.com
In response to: Re: allow changing autovacuum_max_workers without restarting (Nathan Bossart <nathandbossart@gmail.com>)
List: pgsql-hackers
On Fri, Apr 19, 2024 at 11:43 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
> Removed in v2. I also noticed that I forgot to update the part about when
> autovacuum_max_workers can be changed. *facepalm*

I think this could help a bunch of users, but I'd still like to complain, not so much with the desire to kill this patch as with the desire to broaden the conversation.

Part of the underlying problem here is that, AFAIK, neither PostgreSQL as a piece of software nor we as human beings who operate PostgreSQL databases have much understanding of how autovacuum_max_workers should be set. It's relatively easy to hose yourself by raising autovacuum_max_workers in an attempt to make things go faster, only to produce the exact opposite effect because of how the cost balancing works.

But even if you have the correct use case for autovacuum_max_workers, something like a few large tables that take a long time to vacuum plus a bunch of smaller ones that shouldn't get starved just because the big tables are in the midst of being processed, you might well ask yourself why it's your job to figure out the correct number of workers.

Now, before this patch, there is a fairly good reason for that, which is that we need to reserve shared memory resources for each autovacuum worker that might potentially run, and the system can't know how much shared memory you'd like to reserve for that purpose. But if that were the only problem, then this patch would probably just be proposing to crank up the default value of that parameter rather than introducing a second one. I bet Nathan isn't proposing that because his intuition is that it will work out badly, and I think he's right. I bet that cranking up the number of allowed workers will often result in running more workers than we really should. One possible negative consequence is that we'll end up with multiple processes fighting over the disk in a situation where they should just take turns.
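[The cost-balancing effect in question can be sketched roughly like this: PostgreSQL distributes one global I/O budget (vacuum_cost_limit, 200 by default) across all running autovacuum workers, so adding workers shrinks each worker's share rather than increasing total vacuum throughput. The snippet below is a simplified illustration of that arithmetic, not PostgreSQL's actual code; the real balancing logic also accounts for per-table cost settings.]

```python
def per_worker_cost_limit(total_cost_limit: int, active_workers: int) -> int:
    """Simplified model: each active autovacuum worker gets an equal
    share of the fixed global cost budget, so more workers means a
    smaller per-worker I/O allowance, not more total work done."""
    if active_workers < 1:
        raise ValueError("need at least one active worker")
    return total_cost_limit // active_workers


if __name__ == "__main__":
    # With the default budget of 200, going from 1 worker to 10
    # cuts each worker's allowance from 200 down to 20.
    for n in (1, 4, 10):
        print(n, per_worker_cost_limit(200, n))
```
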
I suspect there are also ways that we can be harmed - in broadly similar fashion - by cost balancing. So I feel like what this proposal reveals is that we know that our algorithm for ramping up the number of running workers doesn't really work. And maybe that's just a consequence of the general problem that we have no global information about how much vacuuming work there is to be done at any given time, and therefore we cannot take any kind of sensible guess about whether one more worker will help or hurt. Or maybe there's some way to do better than what we do today without a big rewrite. I'm not sure.

I don't think this patch should be burdened with solving the general problem here. But I do think the general problem is worth some discussion.

--
Robert Haas
EDB: http://www.enterprisedb.com