Thread: Re: nbtree VACUUM's REDO routine doesn't clear page's VACUUM cycle ID
Re: nbtree VACUUM's REDO routine doesn't clear page's VACUUM cycle ID
From
"Andrey M. Borodin"
Date:
> On 15 Nov 2024, at 21:33, Peter Geoghegan <pg@bowt.ie> wrote: > > Attached patch teaches btree_xlog_vacuum, nbtree VACUUM's REDO > routine, to reset the target page's opaque->btpo_cycleid to 0. This > makes the REDO routine match original execution, which seems like a > good idea on consistency grounds. > > I propose this for the master branch only. The change seems correct to me: anyway cycle must be less than cycle of any future vacuum after promotion. I cannot say anythingabout beauty of resetting or not resetting the field. I'd suggest renaming the field into something like "btpo_split_vaccycleid". I was aware of index vacuum backtracking, butit took me a while to build context again. Best regards, Andrey Borodin.
On Wed, Nov 20, 2024 at 4:41 AM Andrey M. Borodin <x4mmm@yandex-team.ru> wrote: > > On 15 Nov 2024, at 21:33, Peter Geoghegan <pg@bowt.ie> wrote: > > I propose this for the master branch only. > > The change seems correct to me: anyway cycle must be less than cycle of any future vacuum after promotion. The cycles set in the page special area during page splits that happen to run while a VACUUM also runs must use that same VACUUM's cycle ID (which is stored in shared memory for the currently running VACUUM). That way the VACUUM will know when it must backtrack later on, to avoid missing index tuples that it is expected to remove. It doesn't matter if the cycle_id that VACUUM sees is less than or greater than its own one -- only that it matches its own one when it needs to match to get correct behavior from VACUUM. (Though it's also possible to get a false positive, in rare cases where we get unlucky and there's a collision. This might waste cycles within VACUUM, but it shouldn't lead to truly incorrect behavior.) > I cannot say anything about beauty of resetting or not resetting the field. The main reason to do this is simple: we should make REDO routines match original execution, unless there is a very good reason not to do so -- original execution is "authoritative". There's no good reason not to do so here. There is a secondary reason to do things this way, though: resetting the cycle ID within the REDO routine can save future work from within btvacuumpage, after a standby gets promoted, when nbtree VACUUM runs against an affected index for the first time. Note that nbtree VACUUM goes out of its way to clear an old cycle ID in passing, even when it has nothing else to do on the page. And so by taking care of that on the REDO side, right from the start, the newly promoted standby won't need to dirty so many pages during the first nbtree VACUUM after it is promoted. (Actually, it's worse than that -- we're not just dirtying pages these days. It's quite likely that we'll need to write a WAL record to clear the cycle ID, since page-level checksums are enabled by default.) > I'd suggest renaming the field into something like "btpo_split_vaccycleid". I was aware of index vacuum backtracking, butit took me a while to build context again. I'm not inclined to change it now. It's been like this for many years now. Pushed this patch just now. Thanks for the review! -- Peter Geoghegan