On 30/03/2024 07:57, Melanie Plageman wrote:
> On Fri, Mar 29, 2024 at 12:32:21PM -0400, Melanie Plageman wrote:
>> On Fri, Mar 29, 2024 at 12:00 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>>> Here's another idea: In the first loop through the offsets, where we
>>> gather the HTSV status of each item, also collect the offsets of all HOT
>>> and non-HOT items to two separate arrays. Call heap_prune_chain() for
>>> all the non-HOT items first, and then process any remaining HOT tuples
>>> that haven't been marked yet.
>>
>> That's an interesting idea. I'll try it out and see how it works.
>
> Attached v10 implements this method of dividing tuples into HOT and
> non-HOT and processing the potential HOT chains first then processing
> tuples not marked by calling heap_prune_chain().
>
> I have applied the refactoring of heap_prune_chain() to master and then
> built the other patches on top of that.
Committed some of the changes. Continuing to reviewing the rest.
> I discovered while writing this that LP_DEAD item offsets must be in
> order in the deadoffsets array (the one that is used to populate
> LVRelState->dead_items).
>
> When I changed heap_page_prune_and_freeze() to partition the offsets
> into HOT and non-HOT during the first loop through the item pointers
> array (where we get tuple visibility information), we add dead item
> offsets as they are encountered. So, they are no longer in order. I've
> added a quicksort of the deadoffsets array to satisfy vacuum.
Good catch.
> I think that we are actually successfully removing more RECENTLY_DEAD
> HOT tuples than in master with heap_page_prune()'s new approach, and I
> think it is correct; but let me know if I am missing something.
Yep. In the case of a two-item chain, RECENTLY_DEAD -> DEAD, the new
code can always remove both items. On 'master', it depends on which item
it happens to process first. If it processes the RECENTLY_DEAD item
first, then it follows the chain and removes both. But if it processes
the DEAD item first, the RECENTLY_DEAD item is left behind. It will be
removed by the next VACUUM, so it's not a correctness issue, and
probably doesn't make any practical performance difference either as
it's a rare corner case, but I feel better that it's more deterministic now.
> The early patches in the set include some additional comment cleanup as
> well. 0001 is fairly polished. 0004 could use some variable renaming
> (this patch partitions the tuples into HOT and not HOT and then
> processes them). I was struggling with some of the names here
> (chainmembers and chaincandidates is confusing).
I didn't understand why you wanted to juggle both partitions in the same
array. So I separated them into two arrays, and called them 'root_items'
and 'heaponly_items'.
In some micro-benchmarks, the order that the items were processed made a
measurable difference. So I'm processing the items in the reverse order.
That roughly matches the order the items are processed in master, as it
iterates the offsets from high-to-low in the first loop, and low-to-high
in the second loop.
> The bulk of the combining of pruning and freezing is lumped into 0010.
>
> I had planned to separate 0010 into 4 separate patches: 1 to execute
> freezing in heap_prune_chain(), 1 for the freeze heuristic approximating
> what is on master, and 1 for emitting a single record containing both
> the pruning and freezing page modifications.
>
> I ended up not doing this because I felt like the grouping of changes in
> 0007-0009 is off. As long as I still execute freezing in
> lazy_scan_prune(), I have to share lots of state between
> lazy_scan_prune() and heap_page_prune(). This meant I added a lot of
> parameters to heap_page_prune() that later commits removed -- making the
> later patches noisy and not so easy to understand.
>
> I'm actually not sure what should go in what commit (either for review
> clarity or for the actual final version).
>
> But, I think we should probably focus on review of the code and not as
> much how it is split up yet.
Yeah, that's fine, 0010 is manageable-sized now.
> The final state of the code could definitely use more cleanup. I've been
> staring at it for awhile, so I could use some thoughts/ideas about what
> part to focus on improving.
Committed some of the changes. I plan to commit at least the first of
these remaining patches later today. I'm happy with it now, but I'll
give it a final glance over after dinner.
I'll continue to review the rest after that, but attached is what I have
now.
--
Heikki Linnakangas
Neon (https://neon.tech)