Re: cost delay brainstorming - Mailing list pgsql-hackers
From: Andres Freund
Subject: Re: cost delay brainstorming
Date:
Msg-id: 20240617213719.xls5fdbgylgnjokg@awork3.anarazel.de
In response to: cost delay brainstorming (Robert Haas <robertmhaas@gmail.com>)
Responses: Re: cost delay brainstorming
List: pgsql-hackers
Hi,

On 2024-06-17 15:39:27 -0400, Robert Haas wrote:
> As I mentioned in my talk at 2024.pgconf.dev, I think that the biggest
> problem with autovacuum as it exists today is that the cost delay is
> sometimes too low to keep up with the amount of vacuuming that needs
> to be done.

I agree it's a big problem, not sure it's *the* problem. But I'm happy to
see it improved anyway, so it doesn't really matter.

One issue around all of this is that we pretty much don't have the tools
to analyze autovacuum behaviour across a larger number of systems in a
realistic way :/. I find my own view of what precisely the problem is
being heavily swayed by the last few problematic cases I've looked at.

> I think we might be able to get fairly far by observing that if the
> number of running autovacuum workers is equal to the maximum allowable
> number of running autovacuum workers, that may be a sign of trouble,
> and the longer that situation persists, the more likely it is that
> we're in trouble. So, a very simple algorithm would be: If the maximum
> number of workers have been running continuously for more than, say,
> 10 minutes, assume we're falling behind and exempt all workers from
> the cost limit for as long as the situation persists. One could
> criticize this approach on the grounds that it causes a very sudden
> behavior change instead of, say, allowing the rate of vacuuming to
> gradually increase. I'm curious to know whether other people think
> that would be a problem.

Another issue is that it's easy to fall behind due to cost limits on
systems where autovacuum_max_workers is smaller than the number of busy
tables. IME one common situation is to have a single table that's being
vacuumed too slowly due to cost limits, with everything else keeping up
easily.

> I think it might be OK, for a couple of reasons:
>
> 1. I'm unconvinced that the vacuum_cost_delay system actually prevents
> very many problems. I've fixed a lot of problems by telling users to
> raise the cost limit, but virtually never by lowering it. When we
> lowered the delay by an order of magnitude a few releases ago -
> equivalent to increasing the cost limit by an order of magnitude - I
> didn't personally hear any complaints about that causing problems. So
> disabling the delay completely some of the time might just be fine.

I have seen disabling cost limits cause replication setups to fall over
because the amount of WAL increases beyond what can be
replicated/archived/replayed. It's very easy to hit that issue when
syncrep is enabled.

> 1a. Incidentally, when I have seen problems because of vacuum running
> "too fast", it's not been because it was using up too much I/O
> bandwidth, but because it's pushed too much data out of cache too
> quickly. A long overnight vacuum can evict a lot of pages from the
> system page cache by morning - the ring buffer only protects our
> shared_buffers, not the OS cache. I don't think this can be fixed by
> rate-limiting vacuum, though: to keep the cache eviction at a level
> low enough that you could be certain of not causing trouble, you'd
> have to limit it to an extremely low rate which would just cause
> vacuuming not to keep up. The cure would be worse than the disease at
> that point.

I've seen versions of this too. Ironically it's often made way worse by
ring buffers, because even if there is space in shared buffers, we'll not
move buffers there, instead putting a lot of pressure on the OS page
cache.
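To make the saturation heuristic quoted above a bit more concrete, here is
a rough, untested sketch of the decision logic. Every name in it
(WorkerSlotInfo, cost_limit_should_be_ignored, the caller-maintained
saturation timestamp) is invented for illustration and does not correspond
to anything in the actual autovacuum code:

#include <stdbool.h>
#include <time.h>

#define AV_SATURATION_GRACE_SECS (10 * 60)	/* "say, 10 minutes" */

typedef struct WorkerSlotInfo
{
	bool		in_use;			/* is this slot currently running a worker? */
} WorkerSlotInfo;

/*
 * Return true once every worker slot has been continuously occupied for
 * the grace period, i.e. once we assume autovacuum is falling behind and
 * workers should ignore the cost limit.
 *
 * saturated_since is a caller-maintained timestamp of when all slots first
 * became busy; 0 means "not currently saturated".
 */
bool
cost_limit_should_be_ignored(const WorkerSlotInfo *slots, int av_max_workers,
							 time_t *saturated_since, time_t now)
{
	int			nbusy = 0;

	for (int i = 0; i < av_max_workers; i++)
	{
		if (slots[i].in_use)
			nbusy++;
	}

	if (nbusy < av_max_workers)
	{
		/* not saturated: reset the clock and keep throttling */
		*saturated_since = 0;
		return false;
	}

	if (*saturated_since == 0)
		*saturated_since = now;	/* saturation just started */

	/* saturated for long enough -> exempt workers from the cost limit */
	return (now - *saturated_since) >= AV_SATURATION_GRACE_SECS;
}

The grace period and the reset-whenever-a-slot-frees-up behaviour are the
two knobs that decide how abruptly the cost limit would get switched off.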
> If you like this idea, I'd like to know that, and hear any further
> thoughts you have about how to improve or refine it. If you don't, I'd
> like to know that, too, and any alternatives you can propose,
> especially alternatives that don't require crazy amounts of new
> infrastructure to implement.

I unfortunately don't know what to do about all of this with just the
existing set of metrics :/.

One reason the cost limit can be important is that we often schedule
autovacuums on tables in useless ways, over and over. Without cost limits
providing *some* protection against using up all IO bandwidth / WAL
volume, we could end up doing even worse.

Common causes of such useless vacuums I've seen:

- A long-running transaction prevents increasing relfrozenxid, so we run
  autovacuum over and over on the same relation, using up the whole cost
  budget. This is particularly bad because often we'll not autovacuum
  anything else, building up a larger and larger backlog of actual work.

- Tables where on-access pruning works very well end up being vacuumed far
  too frequently, because our autovacuum scheduling doesn't know about
  tuples having been cleaned up by on-access pruning.

- Larger tables with occasional lock conflicts cause autovacuum to be
  cancelled and restarted from scratch over and over. If that happens
  before the second table scan, this can easily eat up the whole cost
  budget without making forward progress.

Greetings,

Andres Freund