Re: Confine vacuum skip logic to lazy_scan_skip - Mailing list pgsql-hackers
From | Melanie Plageman |
---|---|
Subject | Re: Confine vacuum skip logic to lazy_scan_skip |
Date | |
Msg-id | 20240308004614.xn7x3k6qz7jw7ljx@liskov Whole thread Raw |
In response to | Re: Confine vacuum skip logic to lazy_scan_skip (Melanie Plageman <melanieplageman@gmail.com>) |
Responses |
Re: Confine vacuum skip logic to lazy_scan_skip
Re: Confine vacuum skip logic to lazy_scan_skip |
List | pgsql-hackers |
On Wed, Mar 06, 2024 at 10:00:23PM -0500, Melanie Plageman wrote: > On Wed, Mar 06, 2024 at 09:55:21PM +0200, Heikki Linnakangas wrote: > > I made some further changes. I kept them as separate commits for easier > > review, see the commit messages for details. Any thoughts on those changes? > > I've given some inline feedback on most of the extra patches you added. > Short answer is they all seem fine to me except I have a reservations > about 0008 because of the number of blkno variables flying around. I > didn't have a chance to rebase these into my existing changes today, so > either I will do it tomorrow or, if you are feeling like you're on a > roll and want to do it, that also works! Attached v7 contains all of the changes that you suggested plus some additional cleanups here and there. > > I feel heap_vac_scan_get_next_block() function could use some love. Maybe > > just some rewording of the comments, or maybe some other refactoring; not > > sure. But I'm pretty happy with the function signature and how it's called. I've cleaned up the comments on heap_vac_scan_next_block() in the first couple patches (not so much in the streaming read user). Let me know if it addresses your feelings or if I should look for other things I could change. I will say that now all of the variable names are *very* long. I didn't want to remove the "state" from LVRelState->next_block_state. (In fact, I kind of miss the "get". But I had to draw the line somewhere.) I think without "state" in the name, next_block sounds too much like a function. Any ideas for shortening the names of next_block_state and its members or are you fine with them? > I was wondering if we should remove the "get" and just go with > heap_vac_scan_next_block(). I didn't do that originally because I didn't > want to imply that the next block was literally the sequentially next > block, but I think maybe I was overthinking it. > > Another idea is to call it heap_scan_vac_next_block() and then the order > of the words is more like the table AM functions that get the next block > (e.g. heapam_scan_bitmap_next_block()). Though maybe we don't want it to > be too similar to those since this isn't a table AM callback. I've done a version of this. > > From 27e431e8dc69bbf09d831cb1cf2903d16f177d74 Mon Sep 17 00:00:00 2001 > > From: Heikki Linnakangas <heikki.linnakangas@iki.fi> > > Date: Wed, 6 Mar 2024 20:58:57 +0200 > > Subject: [PATCH v6 6/9] Move vmbuffer back to a local varible in > > lazy_scan_heap() > > > > It felt confusing that we passed around the current block, 'blkno', as > > an argument to lazy_scan_new_or_empty() and lazy_scan_prune(), but > > 'vmbuffer' was accessed directly in the 'scan_state'. > > > > It was also a bit vague, when exactly 'vmbuffer' was valid. Calling > > heap_vac_scan_get_next_block() set it, sometimes, to a buffer that > > might or might not contain the VM bit for 'blkno'. But other > > functions, like lazy_scan_prune(), assumed it to contain the correct > > buffer. That was fixed up visibilitymap_pin(). But clearly it was not > > "owned" by heap_vac_scan_get_next_block(), like the other 'scan_state' > > fields. > > > > I moved it back to a local variable, like it was. Maybe there would be > > even better ways to handle it, but at least this is not worse than > > what we have in master currently. > > I'm fine with this. I did it the way I did (grouping it with the > "next_unskippable_block" in the skip struct), because I think that this > vmbuffer is always the buffer containing the VM bit for the next > unskippable block -- which sometimes is the block returned by > heap_vac_scan_get_next_block() and sometimes isn't. > > I agree it might be best as a local variable but perhaps we could retain > the comment about it being the block of the VM containing the bit for the > next unskippable block. (Honestly, the whole thing is very confusing). In 0001-0004 I've stuck with only having the local variable vmbuffer in lazy_scan_heap(). In 0006 (introducing pass 1 vacuum streaming read user) I added a vmbuffer back to the next_block_state (while also keeping the local variable vmbuffer in lazy_scan_heap()). The vmbuffer in lazy_scan_heap() contains the block of the VM containing visi information for the next unskippable block or for the current block if its visi information happens to be in the same block of the VM as either 1) the next unskippable block or 2) the most recently processed heap block. Streaming read vacuum separates this visibility check in heap_vac_scan_next_block() from the main loop of lazy_scan_heap(), so we can't just use a local variable anymore. Now the local variable vmbuffer in lazy_scan_heap() will only already contain the block with the visi information for the to-be-processed block if it happens to be in the same VM block as the most recently processed heap block. That means potentially more VM fetches. However, by adding a vmbuffer to next_block_state, the callback may be able to avoid extra VM fetches from one invocation to the next. Note that next_block->current_block in the streaming read vacuum context is actually the prefetch block. - Melanie
Attachment
- v7-0001-lazy_scan_skip-remove-unneeded-local-var-nskippab.patch
- v7-0002-Add-lazy_scan_skip-next-block-state-to-LVRelState.patch
- v7-0003-Confine-vacuum-skip-logic-to-lazy_scan_skip.patch
- v7-0004-Remove-unneeded-vacuum_delay_point-from-heap_vac_.patch
- v7-0005-Streaming-Read-API.patch
- v7-0006-Vacuum-first-pass-uses-Streaming-Read-interface.patch
- v7-0007-Vacuum-second-pass-uses-Streaming-Read-interface.patch
pgsql-hackers by date: