Re: Fixes for two separate bugs in nbtree VACUUM's page deletion - Mailing list pgsql-hackers
From | Masahiko Sawada |
---|---|
Subject | Re: Fixes for two separate bugs in nbtree VACUUM's page deletion |
Date | |
Msg-id | CA+fd4k6wCqk-YihAdXk-a6d+o8q8WJ=XiY7yVfOBEooYMENc2g@mail.gmail.com Whole thread Raw |
In response to | Re: Fixes for two separate bugs in nbtree VACUUM's page deletion (Peter Geoghegan <pg@bowt.ie>) |
Responses |
Re: Fixes for two separate bugs in nbtree VACUUM's page deletion
|
List | pgsql-hackers |
On Thu, 30 Apr 2020 at 07:29, Peter Geoghegan <pg@bowt.ie> wrote: > > On Tue, Apr 28, 2020 at 12:21 AM Masahiko Sawada > <masahiko.sawada@2ndquadrant.com> wrote: > > For the first fix it seems better to push down the logic to the page > > deletion code as your 0001 patch does so. The following change changes > > the page deletion code so that it emits LOG message indicating the > > index corruption when a deleted page is passed. But we used to ignore > > in this case. > > Attached is v2 of the patch set, which includes a new patch that I > want to target the master branch with. This patch (v2-0003-*) > refactors btvacuumscan(). > > This revision also changed the first bugfix patch. I have simplified > some of the details in nbtree.c that were added by commit 857f9c36cda. > Can't we call _bt_update_meta_cleanup_info() at a lower level, like in > the patch? AFAICT it makes more sense to just call it in > btvacuumscan(). Please let me know what you think of those changes. Yes, I agree. > > The big change in the new master-only refactoring patch (v2-0003-*) is > that I now treat a call to btvacuumpage() that has to > "backtrack/recurse" and doesn't find a page that looks like the > expected right half of a page split (a page split that occurred since > the VACUUM operation began) as indicating corruption. This corruption > is logged. I believe that we should never see this happen, for reasons > that are similar to the reasons why _bt_pagedel() never finds a > deleted page when moving right, as I went into in my recent e-mail to > this thread. I think that this is worth tightening up for Postgres 13. > > I will hold off on committing v2-0003-*, since I need to nail down the > reasoning for treating this condition as corruption. Plus it's not > urgent. I think that the general direction taken in v2-0003-* is the > right one, in any case. The "recursion" in btvacuumpage() doesn't make > much sense -- it obscures what's really going on IMV. Also, using the > variable name "scanblkno" at every level -- btvacuumscan(), > btvacuumpage(), and _bt_pagedel() -- makes it clear that it's the same > block at all levels of the code. And that the "backtrack/recursion" > case doesn't kill tuples from the "scanblkno" block (and doesn't > attempt to call _bt_pagedel(), which is designed only to handle blocks > passed by btvacuumpage() when they are the current "scanblkno"). It > seems unlikely that the VACUUM VERBOSE pages deleted accounting bug > would have happened if the code was already structured this way. +1 for refactoring. I often get confused that btvacuumpage() takes two block numbers (blkno and orig_blkno) in spite of the same value. Your change also makes it clear. For the part of treating that case as an index corruption I will need some time to review because of lacking knowledge of btree indexes. So I'll review it later. Regards, -- Masahiko Sawada http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
pgsql-hackers by date: