Re: New IndexAM API controlling index vacuum strategies - Mailing list pgsql-hackers
From: Peter Geoghegan
Subject: Re: New IndexAM API controlling index vacuum strategies
Msg-id: CAH2-WznrGu+Xz9QqZBjrsoHNnYu62YT2GobEnO9k0JeGozZWjA@mail.gmail.com
In response to: Re: New IndexAM API controlling index vacuum strategies (Andres Freund <andres@anarazel.de>)
List: pgsql-hackers
On Wed, Apr 14, 2021 at 6:53 PM Andres Freund <andres@anarazel.de> wrote:
> > To a large degree the failsafe is something that is written in the
> > hope that it will never be needed. This is unlike most other things,
> > and has its own unique risks.
>
> Among them that the code is not covered by tests and is unlikely to be
> meaningfully exercised within the beta timeframe due to the timeframes
> for hitting it (hard to actually hit below a 1/2 day running extreme
> workloads, weeks for more realistic ones). Which means that this code
> has to be extra vigorously reviewed, not the opposite.

There is test coverage for the optimization to bypass index vacuuming
with very few dead tuples. Plus we can expect it to kick in very
frequently. That's not as good as testing this other mechanism directly,
which I agree ought to happen too. But the difference between those two
cases is pretty much entirely around when and how they kick in. I have
normalized the idea that index vacuuming is optional in principle, so in
an important sense it is tested all the time.

> Or at least tests for it should be added (pg_resetwal +
> autovacuum_naptime=1s or such should make that doable, or even just
> running a small test with lower thresholds).

You know what else doesn't have test coverage? Any kind of aggressive
VACUUM. There is a problem with our culture around testing. I would like
to address that within the scope of this project, but you know how it is.
Can I take it that I'll have your support with adding those tests?

> This line of argumentation scares me. Not explained arguments, about
> running in conditions that we otherwise don't run in, when in
> exceptional circumstances. This code has a history of super subtle
> interactions, with quite a few data loss causing bugs due to us not
> foreseeing some combination of circumstances.

I'll say it again: I was wrong not to make that clearer prior to
committing the fixup. I regret that error, which probably had a lot to do
with being fatigued.

> I think there are good arguments for having logic for an "emergency
> vacuum" mode (and also some good ones against). I'm not convinced that
> the current set of things that are [not] skipped in failsafe mode is
> the "obviously right set of things"™ but am convinced that there wasn't
> enough consensus building on what that set of things is. This all also
> would be different if it were the start of the development window,
> rather than the end.

I all but begged you to review the patches. Same with Robert. While the
earlier patches (where almost all of the complexity is) did get review
from both you and Robert (which I was grateful to receive), for whatever
reason neither of you looked at the later patches in detail. (Robert said
at one point that the failsafe ought to cover single-pass/no-indexes
VACUUM, which did influence its design, but for the most part his input
on the later stuff was minimal and expressed in general terms.)

Of course, nothing stops us from improving the mechanism in the future.
Though I maintain that the fundamental approach of finishing as quickly
as possible is basically sound (short of fixing the problem directly, for
example by obviating the need for freezing).
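(To make the testing point above a little more concrete, here is a rough
sketch of what a test with lowered thresholds could look like. This is
not an actual committed test; the GUCs named exist in PostgreSQL 14, but
the table name is made up, and the table's relfrozenxid still has to be
made old enough to cross the threshold, e.g. with pg_resetwal -x as you
suggest, or by consuming XIDs.)

    -- Rough sketch only, not an actual committed test.  In postgresql.conf
    -- for the test cluster (autovacuum_freeze_max_age can only be set at
    -- server start, so it cannot simply be reloaded):
    --
    --   autovacuum_freeze_max_age = 100000   # minimum allowed value
    --   vacuum_failsafe_age       = 100000
    --   autovacuum_naptime        = 1s
    --
    -- After making the table's relfrozenxid old enough to cross the
    -- threshold (pg_resetwal -x, or just consuming XIDs), a manual VACUUM
    -- should then exercise the failsafe path:
    VACUUM (VERBOSE) failsafe_test_table;   -- hypothetical table name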
> In my experience the big problem with vacuums in a wraparound situation
> isn't actually things like truncation or even the index scans (although
> they certainly can cause bad problems), but that VACUUM modifies
> (prune/vacuum and WAL log or just setting hint bits) a crapton of pages
> that don't actually need to be modified just to be able to get out of
> the wraparound situation.

Things like UUID indexes are very popular, and are likely to have an
outsized impact on dirtying pages (which I agree is the real problem).
Plus some people just have a ridiculous number of indexes (e.g., the
Discourse table that they pointed out as a particularly good target for
deduplication had a total of 13 indexes). There is an excellent chance
that stuff like that is involved in installations that actually have huge
problems. The visibility map works pretty well these days -- but not for
indexes.

> And that the overhead of writing out all those dirty pages + WAL
> logging causes the VACUUM to take unacceptably long. E.g. because your
> storage is cloud storage with a few ms of latency, and the ringbuffer +
> wal_buffer sizes cause so many synchronous writes that you end up with
> < 10MB/s of data being processed.

This is a false dichotomy. There probably is an argument for making the
failsafe not do pruning that isn't strictly necessary (or something like
that) in a future release. I don't see what particular significance that
has for the failsafe mechanism now. The sooner we can advance
relfrozenxid when it's dangerously far in the past, the better. It's true
that the mechanism doesn't exploit every possible opportunity to do so.
But it mostly manages to do that.

> I think there's also a clear danger in having "cliffs" where the
> behaviour changes abruptly once a certain threshold is reached. It's
> not unlikely for systems to fall over entirely when
>
> a) autovacuum cost limiting is disabled. E.g. reaching your disk
>    iops/throughput quota and barely being able to log into postgres
>    anymore to kill the stuck connection causing the wraparound issue.

Let me get this straight: You're concerned that hurrying up vacuuming
when we have 500 million XIDs left to burn will overwhelm the system,
which would presumably have finished in time otherwise? Even though it
would have to do way more work in absolute terms in the absence of the
failsafe? And even though the 1.6 billion XID age that we got to before
the failsafe kicked in was clearly not enough? You'd want to "play it
safe", and stick with the original plan at that point?

> b) No index cleanup happens anymore. E.g. a workload with a lot of
>    bitmap index scans (which do not support killtuples) could end up a
>    lot worse because index pointers to dead tuples aren't being cleaned
>    up. In cases where an old transaction or leftover replication slot
>    is causing the problem (together a significant percentage of
>    wraparound situations) this situation will persist across repeated
>    (explicit or automatic) vacuums for a table, because relfrozenxid
>    won't actually be advanced. And this in turn might actually end up
>    slowing resolution of the wraparound issue more than doing all the
>    index scans.

If it's intrinsically impossible to advance relfrozenxid, then surely all
bets are off. But even in this scenario it's very unlikely that we
wouldn't at least do index vacuuming for those index tuples that are dead
and safe to delete according to the OldestXmin cutoff.
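(For reference, the arithmetic behind "500 million XIDs left": the XID
space is about 2^31, roughly 2.1 billion, and vacuum_failsafe_age
defaults to 1.6 billion, which leaves roughly half a billion XIDs of
headroom. Where a given table stands relative to these thresholds can be
checked with a query along the following lines; this assumes PostgreSQL
14, where vacuum_failsafe_age exists.)

    SELECT c.oid::regclass                                   AS table_name,
           age(c.relfrozenxid)                               AS xid_age,
           current_setting('autovacuum_freeze_max_age')::int AS freeze_max_age,
           current_setting('vacuum_failsafe_age')::int       AS failsafe_age
    FROM pg_class c
    WHERE c.relkind IN ('r', 'm', 't')   -- plain tables, matviews, TOAST tables
    ORDER BY age(c.relfrozenxid) DESC
    LIMIT 10;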
You still have 1.6 billion XIDs before the failsafe first kicks in,
regardless of the issue of the OldestXmin/FreezeLimit being excessively
far in the past. You're also not acknowledging the benefit of avoiding
uselessly scanning the indexes again and again, which is mostly what
would be happening in this scenario. Maybe VACUUM shouldn't spin like
this at all, but that's not a new problem.

> Because this is a hard cliff rather than something phasing in, it's not
> really possible for a user to see this slowly getting worse and address
> the issue. Especially for a) this could be addressed by not turning off
> cost limiting at once, but instead have it decrease the closer you get
> to some limit.

There is a lot to be said for keeping the behavior as simple as possible.
You said so yourself. In any case I think that the perfect should not be
the enemy of the good (or the better, at least).

--
Peter Geoghegan