Re: autovacuum prioritization - Mailing list pgsql-hackers
From: Robert Haas
Subject: Re: autovacuum prioritization
Date:
Msg-id: CA+TgmoaHFPtZgVSF3RxUzQHz69aAU1w6ekenCaM57pjmP0EMRw@mail.gmail.com
In response to: Re: autovacuum prioritization (Dilip Kumar <dilipbalaut@gmail.com>)
Responses: Re: autovacuum prioritization; Re: autovacuum prioritization
List: pgsql-hackers
On Mon, Jan 24, 2022 at 11:14 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> I think we need some more parameters to compare bloat vs wraparound.
> I mean in one of your examples in the 2nd paragraph we can say that
> the need-to-start of table A is earlier than table B so it's kind of
> simple. But when it comes to wraparound vs bloat we need to add some
> weightage to compute how much bloat is considered as bad as
> wraparound. I think the amount of bloat can not be an absolute number
> but it should be relative w.r.t the total database size or so. I
> don't think it can be computed w.r.t to the table size because if the
> table is e.g. just 1 GB size and it is 5 times bloated then it is not
> as bad as another 1 TB table which is just 2 times bloated.

Thanks for writing back. I don't think that I believe the last part of this argument, because it seems to suppose that the big problem with bloat is that it might use up disk space, whereas in my experience the big problem with bloat is that it slows down access to your data. Yet the dead space in some other table will not have much impact on the speed of access to the current table. In fact, if most accesses to the table are index scans, even dead space in the current table may not have much effect, but sequential scans are bound to notice. It's true that, on a cluster-wide basis, every dead page is one more page that can potentially take up space in cache, so in that sense the performance consequences are global to the whole cluster. However, that effect is more indirect and takes a long time to become a big problem. The direct effect of having to read more pages to execute the same query plan causes problems a lot sooner.

But your broader point that we need to consider how much bloat represents a problem is a really good one. In the past, one rule that I've thought about is: if we're vacuuming a table and we're not going to finish before it needs to be vacuumed again, then we should vacuum faster (i.e. in effect, increase the cost limit on the fly). That might still not result in good behavior, but it would at least result in behavior that is less bad. However, it doesn't really answer the question of how we decide when to start the very first VACUUM. I don't really know the answer to that question. The current heuristics result in estimates of acceptable bloat that are too high in some cases and too low in others. I've seen tables that got bloated vastly beyond what autovacuum is configured to tolerate before they caused any real difficulty, and I know there are other cases where users start to suffer long before those thresholds are reached. At the moment, the best idea I have is to use something like the current algorithm, but treat it as a deadline (keep bloat below this amount) rather than an initiation criterion (start when you reach this amount). But I think that idea is a bit weak; maybe there's something better out there.
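To make the deadline idea slightly more concrete, here's a toy sketch - all invented names and units, not anything that exists in the tree - of how a need-to-start time and an on-the-fly cost limit adjustment might be computed:

#include <stdio.h>

typedef struct TableStats
{
    const char *name;
    double      deadline;       /* when bloat/XID age becomes unacceptable */
    double      est_duration;   /* estimated time a vacuum of it will take */
} TableStats;

/* The latest moment a vacuum can begin and still meet the deadline. */
static double
need_to_start(const TableStats *t)
{
    return t->deadline - t->est_duration;
}

/*
 * If the projected finish time overshoots the deadline, scale the cost
 * limit up proportionally so the in-progress vacuum speeds up.
 */
static int
adjust_cost_limit(int cost_limit, double now, const TableStats *t)
{
    double      time_left = t->deadline - now;

    if (time_left <= 0)
        return cost_limit * 10; /* already late: go as fast as we dare */
    if (t->est_duration > time_left)
        return (int) (cost_limit * t->est_duration / time_left);
    return cost_limit;          /* on schedule: leave it alone */
}

int
main(void)
{
    TableStats  t = {"pgbench_accounts", 100.0, 30.0};

    printf("need-to-start = %.1f\n", need_to_start(&t));
    printf("cost limit at t=80: %d\n", adjust_cost_limit(200, 80.0, &t));
    return 0;
}

All of the hard policy questions are hidden inside how deadline and est_duration get estimated, of course; the sketch only shows how the pieces would fit together once you have those numbers.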
> I think we should be thinking of dynamically adjusting priority as
> well. Because it is possible that when autovacuum started we
> prioritize the table based on some statistics and estimation but
> vacuuming process can take long time and during that some priority
> might change so during the start of the autovacuum if we push all
> table to some priority queue and simply vacuum in that order then we
> might go wrong somewhere.

Yep. I think we should reassess what to do next after each table, possibly making some exception for really small tables - e.g. if we last recomputed priorities less than 1 minute ago, don't do it again.

> I think we need to make different priority
> queues based on different factors, for example 1 queue for wraparound
> risk and another for bloat risk.

I don't see why we want multiple queues. We have to answer the question "what should we do next?" which requires us, in some way, to funnel everything into a single prioritization.
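As a sketch of what I mean - again with made-up names and numbers rather than real code - the single prioritization could be as simple as reducing each hazard to a need-to-start time and ordering by the minimum:

#include <stdio.h>
#include <stdlib.h>

typedef struct VacCandidate
{
    const char *name;
    double      wrap_need_to_start;     /* driven by XID age */
    double      bloat_need_to_start;    /* driven by dead space */
} VacCandidate;

/* Fold both hazards into one comparable key: the earlier deadline wins. */
static double
urgency(const VacCandidate *c)
{
    return c->wrap_need_to_start < c->bloat_need_to_start
        ? c->wrap_need_to_start
        : c->bloat_need_to_start;
}

/* qsort comparator: smallest need-to-start (most urgent) first. */
static int
cmp_urgency(const void *a, const void *b)
{
    double      ua = urgency((const VacCandidate *) a);
    double      ub = urgency((const VacCandidate *) b);

    return (ua > ub) - (ua < ub);
}

int
main(void)
{
    VacCandidate tabs[] = {
        {"orders", 500.0, 120.0},       /* bloat is the pressing risk */
        {"queue", 60.0, 900.0},         /* wraparound is the pressing risk */
        {"archive", 800.0, 700.0},
    };
    int         ntabs = sizeof(tabs) / sizeof(tabs[0]);

    /*
     * Re-sort after each table finishes, skipping the recomputation if we
     * did it less than a minute ago, per the exception mentioned above.
     */
    qsort(tabs, ntabs, sizeof(VacCandidate), cmp_urgency);

    for (int i = 0; i < ntabs; i++)
        printf("%d. %s (need-to-start %.0f)\n",
               i + 1, tabs[i].name, urgency(&tabs[i]));
    return 0;
}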
--
Robert Haas
EDB: http://www.enterprisedb.com