Re: autovacuum prioritization - Mailing list pgsql-hackers
From: Robert Haas
Subject: Re: autovacuum prioritization
Date:
Msg-id: CA+Tgmobq6Obh+Va_yT8qY=n72AV7vwYeuHw+dce44NA-xnzCHA@mail.gmail.com
In response to: Re: autovacuum prioritization (Peter Geoghegan <pg@bowt.ie>)
Responses: Re: autovacuum prioritization
List: pgsql-hackers
On Thu, Jan 20, 2022 at 6:54 PM Peter Geoghegan <pg@bowt.ie> wrote:
> I agree that it doesn't follow that table A should be more of a
> priority than table B, either because it has a greater age, or because
> its age happens to exceed some actually-arbitrary threshold. But I
> will point out that my ongoing work on freezing does make something
> along these lines much more plausible. As I said over on that thread,
> there is now a kind of "natural variation" among tables, in terms of
> relfrozenxid, as a result of tracking the actual oldest XID, and using
> that (plus the emphasis on advancing relfrozenxid wherever possible).
> And so we'll have a much better idea of what's going on with each
> table -- it's typically a precise XID value from the table, from the
> recent past.

I agree.

> Since we now have the failsafe, the scheduling algorithm can afford to
> not give too much special attention to table age until we're maybe
> over the 1 billion age mark -- or even 1.5 billion+. But once the
> scheduling stuff starts to give table age special attention, it should
> probably become the dominant consideration, by far, completely
> drowning out any signals about bloat. It's kinda never really supposed
> to get that high, so when we do end up there it is reasonable to fully
> freak out. Unlike the bloat criteria, the wraparound safety criteria
> doesn't seem to have much recognizable space between not worrying at
> all, and freaking out.

I do not agree with all of this. First, on general principle, I think sharp edges are bad. If a table had priority 0 for autovacuum 10 minutes ago, it can't now have priority one million bazillion. If you're saying that the priority of wraparound needs to, in the limit, become higher than any bloat-based priority, that is reasonable. Bloat never causes a hard stop in the way that wraparound does, even if the practical effects are not much different. However, if you're saying that the priority should shoot up to the maximum all at once, I don't agree with that at all.

Second, I think it is good and appropriate to leave a lot of slop in the mechanism. As you point out later, we don't really know whether any of our estimates for how long things will take are accurate, and therefore we don't know whether the time we've budgeted will be sufficient. We need to leave lots of slop so that even if we turn out to be quite wrong, we don't hit a wall.

Also, it's worth keeping in mind that waiting longer to freak out is not necessarily an advantage. It may well be that the only way the problem will ever get resolved is by human intervention - going in and fixing whatever dumb thing somebody did - e.g. resolving the pending prepared transaction. In that sense, we might be best off freaking out after a relatively small number of transactions, because that might get some human being's attention. In a very real sense, if old prepared transactions shut down the system after 100 million transactions, users would probably be better off on average, because the problems would get fixed before so much damage is done. I'm not seriously proposing that as a design, but I think it's a mistake to think that pushing off the day of reckoning is necessarily better.
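Just to illustrate the kind of shape I mean by "no sharp edges" -- this is a rough sketch only, every constant is a placeholder, and the bloat urgency is assumed to come from whatever bloat model we end up with, scaled to 0..1:

#include <stdint.h>

/*
 * Illustrative only: wraparound urgency ramps up continuously with table
 * age and eventually exceeds any bloat-based urgency, rather than jumping
 * from zero to "one million bazillion" when some threshold is crossed.
 */
static double
vacuum_priority(double bloat_urgency, uint32_t table_age)
{
	double		age_frac = (double) table_age / 2000000000.0;
	double		age_urgency = 10.0 * age_frac * age_frac * age_frac;

	/* far from wraparound, bloat decides; close to it, age dominates */
	return (age_urgency > bloat_urgency) ? age_urgency : bloat_urgency;
}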
All that being said, I do agree that trying to keep the table age below 300 million is too conservative. I think we need to be conservative now because we don't take the time that the table will take to vacuum into account, and I think if we start thinking about it as a target to finish vacuuming rather than a target to start vacuuming, it can go significantly higher. But I would be disinclined to go to, say, 1.5 billion. If the user hasn't taken any action when we hit the 1 billion transaction mark, or really probably a lot sooner, they're unlikely to wake up any time soon. I don't think there are many systems out there where vacuum ages >1b are the result of the system trying frantically to keep up and not having enough juice. There are probably some, but most such cases are the result of misconfiguration, user error, software failure, etc.

> There is a related problem that you didn't mention:
> autovacuum_max_workers controls how many autovacuum workers can run at
> once, but there is no particular concern for whether or not running
> that many workers actually makes sense, in any given scenario. As a
> general rule, the system should probably be *capable* of running a
> large number of autovacuums at the same time, but never actually do
> that (because it just doesn't ever prove necessary). Better to have
> the option and never use it than need it and not have it.

I agree. And related to that, the more workers we have, the slower each one goes, which I think is often counterintuitive for people, and also often counterproductive. I'm sure there are cases where table A is really big and needs to be vacuumed but not terribly urgently, and table B is really small but needs to be vacuumed right now, and I/O bandwidth is really tight. In that case, slowing down the vacuum on table A so that the vacuum on table B can do its thing is the right call. But what I think is more common is that we get more workers because the first one is not getting the job done. And if they all get slower then we're still not getting the job done, but at greater expense.

> > In the meantime, I think a sensible place to start would be to figure
> > out some system that makes sensible estimates of how soon we need to
> > address bloat, XID wraparound, and MXID wraparound for each table, and
> > some system that estimates how long each one will take to vacuum.
>
> I think that it's going to be hard to model how long index vacuuming
> will take accurately. And harder still to model which indexes will
> adversely impact the user in some way if we delay vacuuming some more.

Those are fair concerns. I assumed that if we knew the number of pages in the index, which we do, it wouldn't be too hard to make an estimate like this ... but you know more about this than I do, so tell me why you think that won't work. It's perhaps worth noting that even a somewhat poor estimate could be a big improvement over what we have now.

> Might be more useful to start off by addressing how to spread out the
> burden of vacuuming over time. The needs of queries matters, but
> controlling costs matters too.
>
> One of the most effective techniques is to manually VACUUM when the
> system is naturally idle, like at night time. If that could be
> quasi-automated, or if the criteria used by autovacuum scheduling gave
> just a little weight to how busy the system is right now, then we
> would have more slack when the system becomes very busy.

I have thought about this approach but I'm not very hopeful about it as a development direction.
One problem is that we don't necessarily know when the quiet times are, and another is that there might not even be any quiet times. Still, neither of those problems by itself would discourage me from attempting something in this area. The thing that does discourage me is: if you have a quiet period, you can take advantage of that to do vacuuming without any code changes at all. You can just crontab a vacuum that runs with a reduced setting for vacuum_freeze_table_age and vacuum_freeze_min_age during your nightly quiet period and call it good.
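For example, something along these lines, where the schedule and the settings are purely illustrative and the connection details are assumed to be handled by the usual libpq environment variables or .pgpass:

# illustrative crontab entry: aggressive freezing during a nightly quiet window
30 2 * * *  PGOPTIONS='-c vacuum_freeze_table_age=50000000 -c vacuum_freeze_min_age=0' vacuumdb --all --quiet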
The problem that I'm principally concerned about here is the case where somebody had a system that was basically OK and then at some point, bad things started to happen. At some point they realize they're in trouble and try to get back on track. Very often, autovacuum is actually the enemy in that situation: it insists on consuming resources to vacuum the wrong stuff. Whatever we can do to avoid such disastrous situations is all to the good, but since we can't realistically expect to avoid them entirely, we need to improve the behavior in the cases where they do happen.

--
Robert Haas
EDB: http://www.enterprisedb.com