Decoupling antiwraparound autovacuum from special rules around auto cancellation - Mailing list pgsql-hackers
From | Peter Geoghegan |
---|---|
Subject | Decoupling antiwraparound autovacuum from special rules around auto cancellation |
Date | |
Msg-id | CAH2-Wz=S-R_2rO49Hm94Nuvhu9_twRGbTm6uwDRmRu-Sqn_t3w@mail.gmail.com |
Responses | Re: Decoupling antiwraparound autovacuum from special rules around auto cancellation |
List | pgsql-hackers |
I think that we should decouple the PROC_VACUUM_FOR_WRAPAROUND autocancellation behavior in ProcSleep() from antiwraparound autovacuum itself. In other words I think that it should be possible to cancel an autovacuum that happens to be an antiwraparound autovacuum, just as if it were any other autovacuum -- because it usually is no different in any real practical sense. Or at least it shouldn't be seen as fundamentally different to other autovacuums at first, before relfrozenxid attains an appreciably greater age (definition of "appreciably greater" is TBD).

Why should the PROC_VACUUM_FOR_WRAPAROUND behavior happen on *exactly* the same timeline as the one used to launch an antiwraparound autovacuum, though? There is no inherent reason why we have to do both things at exactly the same XID-age-wise time. But there is reason to think that doing so could make matters worse rather than better [1].

More generally I think that it'll be useful to perform "aggressive behaviors" on their own timeline, with no two distinct aggressive behaviors applied at exactly the same time. In general we ought to give a less aggressive approach some room to succeed before escalating to a more aggressive approach -- we should see if a less aggressive approach will work on its own. The failsafe is the most aggressive intervention of all. The PROC_VACUUM_FOR_WRAPAROUND behavior is almost as aggressive, and should happen sooner. Antiwraparound autovacuum itself (which is really a separate thing to PROC_VACUUM_FOR_WRAPAROUND) is less aggressive still. Then you have things like the cutoffs in vacuumlazy.c that control things like freezing.

In short, I'm arguing for an "escalatory" approach that applies each behavior at a different time. The exact timelines we'd want are of course debatable, but the value of having multiple distinct timelines (one per aggressive behavior) is far less debatable. We should give problems a chance to "resolve themselves", at least up to a point.
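The escalation idea can be made concrete with a minimal sketch. It is not code from PostgreSQL or from the patch series: the struct names, the function escalation_for_age(), and the numeric cutoffs are all hypothetical, and the cutoffs are placeholders chosen only to keep the three thresholds distinct. The point is that each aggressive behavior gets its own XID-age cutoff, so a table crosses them one at a time rather than all at once.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-behavior cutoffs, expressed as relfrozenxid age in XIDs. */
typedef struct EscalationCutoffs
{
    uint32_t launch_age;        /* launch an antiwraparound autovacuum */
    uint32_t no_autocancel_age; /* stop yielding to conflicting lock requests */
    uint32_t failsafe_age;      /* last resort: the failsafe */
} EscalationCutoffs;

/* Which behaviors apply at a given age (all names are illustrative). */
typedef struct EscalationState
{
    bool launch_antiwraparound;
    bool disable_autocancel;
    bool trigger_failsafe;
} EscalationState;

/* Placeholder values only; the real timelines are left open in the text. */
static const EscalationCutoffs example_cutoffs = {
    200000000u, 400000000u, 1600000000u
};

static EscalationState
escalation_for_age(uint32_t relfrozenxid_age, const EscalationCutoffs *cutoffs)
{
    EscalationState state;

    /* Each behavior must have its own, strictly later, cutoff. */
    assert(cutoffs->launch_age < cutoffs->no_autocancel_age);
    assert(cutoffs->no_autocancel_age < cutoffs->failsafe_age);

    state.launch_antiwraparound = relfrozenxid_age >= cutoffs->launch_age;
    state.disable_autocancel = relfrozenxid_age >= cutoffs->no_autocancel_age;
    state.trigger_failsafe = relfrozenxid_age >= cutoffs->failsafe_age;
    return state;
}
```

With these placeholder cutoffs, a table whose relfrozenxid age sits between the first and second thresholds would get an antiwraparound autovacuum that can still be cancelled like any other autovacuum. Only a later threshold would make it non-cancellable.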
The latest version of my in-progress VACUUM patch series [2] completely removes the concept of aggressive VACUUM as a discrete mode of operation inside vacuumlazy.c. Every existing "aggressive-ish behavior" will be retained in some form or other, but they'll be applied on separate timelines, in proportion to the problem at hand. For example, we'll have a separate XID cutoff for waiting for a cleanup lock the hard way -- we will no longer use FreezeLimit for that, since that doesn't give freezing a chance to happen in the next VACUUM. The first VACUUM operation that is capable of freezing should ideally not *also* be the first one that has to wait for a cleanup lock. We should be willing to put off waiting for a cleanup lock for much longer than we're willing to put off freezing. Reusing the same cutoff just makes life harder.

Clearly the idea of decoupling the PROC_VACUUM_FOR_WRAPAROUND behavior from antiwraparound autovacuum is conceptually related to my patch series, but it can be treated as separate work. That's why I'm starting another thread now.

There is another idea in that patch series that also seems worth mentioning as relevant (but not essential) to the discussion on this thread: it would be better if antiwraparound autovacuum was simply another way to launch an autovacuum, which isn't fundamentally different to any other. I believe that users will find this conceptual model a lot easier, especially in a world where antiwraparound autovacuums naturally become rare (which is the world that the big patch series seeks to bring about). It'll make antiwraparound autovacuum "the threshold of last resort", only needed when conventional tuple-based thresholds don't trigger at all for an extended period of time (e.g., for static tables).

Perhaps it won't be trivial to fix autovacuum.c in the way I have in mind (which is to split PROC_VACUUM_FOR_WRAPAROUND into two flags that serve two separate purposes). I haven't considered if we're accidentally relying on the coupling to avoid confusion within autovacuum.c. That doesn't seem important right now, though.

[1] https://www.tritondatacenter.com/blog/manta-postmortem-7-27-2015
[2] https://postgr.es/m/CAH2-WzkU42GzrsHhL2BiC1QMhaVGmVdb5HR0_qczz0Gu2aSn=A@mail.gmail.com
--
Peter Geoghegan
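As an illustration of the flag split mentioned in the message, here is a hypothetical sketch. It is not taken from autovacuum.c, proc.h, or the patch series: the EX_ flag names, their bit values, and both helper functions are assumptions made up for this example. One flag records why the autovacuum was launched, and a second, independently set flag controls whether a conflicting lock request may cancel it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical status bits; names and values are illustrative only. */
#define EX_PROC_IS_AUTOVACUUM     0x01 /* process is an autovacuum worker */
#define EX_PROC_AV_ANTIWRAPAROUND 0x02 /* launched because of relfrozenxid age */
#define EX_PROC_AV_NO_AUTOCANCEL  0x04 /* must not be auto-cancelled */

/*
 * Compute the flags for an autovacuum worker. The two age cutoffs are
 * separate parameters, so the "why was it launched" flag and the
 * "may it be cancelled" flag are set on different timelines.
 */
static uint8_t
ex_autovacuum_flags(uint32_t relfrozenxid_age,
                    uint32_t launch_age,
                    uint32_t no_autocancel_age)
{
    uint8_t flags = EX_PROC_IS_AUTOVACUUM;

    if (relfrozenxid_age >= launch_age)
        flags |= EX_PROC_AV_ANTIWRAPAROUND;
    if (relfrozenxid_age >= no_autocancel_age)
        flags |= EX_PROC_AV_NO_AUTOCANCEL;
    return flags;
}

/*
 * A lock waiter may cancel the blocking process only if it is an
 * autovacuum worker that has not yet reached the no-autocancel cutoff.
 */
static bool
ex_can_autocancel(uint8_t flags)
{
    return (flags & EX_PROC_IS_AUTOVACUUM) != 0 &&
           (flags & EX_PROC_AV_NO_AUTOCANCEL) == 0;
}
```

Under this sketch the cancellation check consults only the second flag, so an antiwraparound autovacuum stays cancellable until the later cutoff is reached.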