Re: autovacuum prioritization - Mailing list pgsql-hackers
| From | Robert Haas |
|---|---|
| Subject | Re: autovacuum prioritization |
| Date | |
| Msg-id | CA+TgmoadggMo0_xEHpjOcBqJgurTNOxKNoqmuMP77PrFdaX3Tw@mail.gmail.com |
| In response to | Re: autovacuum prioritization (Peter Geoghegan <pg@bowt.ie>) |
| Responses | Re: autovacuum prioritization |
| List | pgsql-hackers |
On Tue, Jan 25, 2022 at 3:32 PM Peter Geoghegan <pg@bowt.ie> wrote:
> For example, a page that has 5 dead heap-only tuples is vastly
> different to a similar page that has 5 LP_DEAD items instead -- and
> yet our current approach makes no distinction. Chances are very high
> that if the only dead tuples are heap-only tuples, then things are
> going just fine on that page -- opportunistic pruning is actually
> keeping up.

Hmm, I think that's a really good insight. Perhaps we ought to forget
about counting updates and deletes and instead count dead line
pointers. Or maybe we can estimate the number of dead line pointers by
knowing how many updates and deletes there were, as long as we can
distinguish HOT updates from non-HOT updates, which I think we can.

> if successive ANALYZE operations notice a consistent pattern where
> pages that had a non-zero number of LP_DEAD items last time now have
> a significantly higher number, then it's a good idea to err in the
> direction of more aggressive vacuuming. *Growing* concentrations of
> LP_DEAD items signal chaos. I think that placing a particular
> emphasis on pages with non-zero LP_DEAD items as a qualitatively
> distinct category of page might well make sense -- relatively few
> blocks with a growing number of LP_DEAD items seems like it should
> be enough to make autovacuum run aggressively.

I think measuring the change over time here might be fraught with
peril. If vacuum makes a single pass over the indexes, it can retire
as many dead line pointers as we have, or as will fit in memory, and
the effort doesn't really depend too much on exactly how many dead
line pointers we're trying to find. (I hear that it does depend more
than you'd think ... but I still don't think that should be the
dominant consideration here.) So to be efficient, we want to do that
pass over the indexes when we have a suitably large batch of dead line
pointers. I don't think it really depends on how long it took the
batch to get to that size.

I don't want to vacuum a terabyte of indexes with a
much-smaller-than-normal batch of dead TIDs just because the number of
dead TIDs seems to be increasing quickly at the moment: it's hard to
imagine that the results will be worth the resources I'll have to
expend to get there. On the other hand, I also don't think I want to
postpone vacuuming the indexes just because the number, while really
big, isn't growing that fast. I feel like my threshold for the number
of dead TIDs that ought to trigger a vacuum grows as the table gets
bigger, capped by how much memory I've got. But I don't feel like the
rate at which it's changing necessarily matters. Like if I create a
million dead line pointers really quickly, wait a month, and then
create another million dead line pointers, I feel like I want the
system to respond just as aggressively as if the month-long delay were
omitted. Maybe my feelings are wrong here. I'm just saying that, to
me, it doesn't feel like the rate of change is all that relevant.

> Even busy production DBs should usually only be vacuuming one large
> table at a time. Also might make sense to strategically align the
> work with the beginning of a new checkpoint.

I'm not sure that either of those statements is correct. But on the
other hand, I am also not sure that either of those statements is
incorrect.

--
Robert Haas
EDB: http://www.enterprisedb.com
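As a rough illustration of the estimate discussed in the message
above, here is a minimal sketch in C. The parameter names are
hypothetical stand-ins rather than the actual pgstat counters; the
assumption is simply that each delete and each non-HOT update
eventually leaves behind one dead line pointer once its page is
pruned, while the dead tuples produced by HOT updates can be reclaimed
entirely by opportunistic pruning.

    /*
     * Illustrative only: estimate how many LP_DEAD line pointers a table
     * may have accumulated, given activity counts since the last vacuum.
     * The parameter names are hypothetical, not the actual pgstat fields.
     * Assumption: each delete and each non-HOT update eventually leaves
     * one dead line pointer behind after pruning, whereas dead tuples
     * from HOT updates are cleaned up entirely by opportunistic pruning.
     */
    double
    estimate_dead_line_pointers(double tuples_deleted,
                                double tuples_updated,
                                double tuples_hot_updated)
    {
        double  non_hot_updates = tuples_updated - tuples_hot_updated;

        return tuples_deleted + non_hot_updates;
    }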
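And a similarly hedged sketch of the kind of trigger threshold
described above: one that grows with the size of the table but is
capped by how many dead TIDs fit in memory. The 10% scaling factor is
an arbitrary placeholder, mem_limit_bytes stands in for whatever
memory budget applies (for example maintenance_work_mem), and the 6
bytes per TID corresponds to sizeof(ItemPointerData).

    #include <stdint.h>

    /*
     * Illustrative only: a vacuum trigger threshold, expressed as a
     * number of dead TIDs, that grows in proportion to the table size
     * but is capped by the number of TIDs that fit in the memory budget.
     * The 10% factor is an arbitrary placeholder; 6 bytes per TID
     * matches sizeof(ItemPointerData).
     */
    int64_t
    dead_tid_vacuum_threshold(int64_t reltuples, int64_t mem_limit_bytes)
    {
        int64_t size_based = reltuples / 10;       /* ~10% of the table */
        int64_t memory_cap = mem_limit_bytes / 6;  /* TIDs that fit in memory */

        return (size_based < memory_cap) ? size_based : memory_cap;
    }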