Re: Combine Prune and Freeze records emitted by vacuum - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: Combine Prune and Freeze records emitted by vacuum
Date
Msg-id e7b3116c-79b0-4864-a8d2-5b59b498f724@iki.fi
Whole thread Raw
In response to Re: Combine Prune and Freeze records emitted by vacuum  (Melanie Plageman <melanieplageman@gmail.com>)
Responses Re: Combine Prune and Freeze records emitted by vacuum
Re: Combine Prune and Freeze records emitted by vacuum
List pgsql-hackers
On 30/03/2024 07:57, Melanie Plageman wrote:
> On Fri, Mar 29, 2024 at 12:32:21PM -0400, Melanie Plageman wrote:
>> On Fri, Mar 29, 2024 at 12:00 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>>> Here's another idea: In the first loop through the offsets, where we
>>> gather the HTSV status of each item, also collect the offsets of all HOT
>>> and non-HOT items to two separate arrays. Call heap_prune_chain() for
>>> all the non-HOT items first, and then process any remaining HOT tuples
>>> that haven't been marked yet.
>>
>> That's an interesting idea. I'll try it out and see how it works.
> 
> Attached v10 implements this method of dividing tuples into HOT and
> non-HOT and processing the potential HOT chains first then processing
> tuples not marked by calling heap_prune_chain().
> 
> I have applied the refactoring of heap_prune_chain() to master and then
> built the other patches on top of that.

Committed some of the changes. Continuing to reviewing the rest.

> I discovered while writing this that LP_DEAD item offsets must be in
> order in the deadoffsets array (the one that is used to populate
> LVRelState->dead_items).
> 
> When I changed heap_page_prune_and_freeze() to partition the offsets
> into HOT and non-HOT during the first loop through the item pointers
> array (where we get tuple visibility information), we add dead item
> offsets as they are encountered. So, they are no longer in order. I've
> added a quicksort of the deadoffsets array to satisfy vacuum.

Good catch.

> I think that we are actually successfully removing more RECENTLY_DEAD
> HOT tuples than in master with heap_page_prune()'s new approach, and I
> think it is correct; but let me know if I am missing something.

Yep. In the case of a two-item chain, RECENTLY_DEAD -> DEAD, the new 
code can always remove both items. On 'master', it depends on which item 
it happens to process first. If it processes the RECENTLY_DEAD item 
first, then it follows the chain and removes both. But if it processes 
the DEAD item first, the RECENTLY_DEAD item is left behind. It will be 
removed by the next VACUUM, so it's not a correctness issue, and 
probably doesn't make any practical performance difference either as 
it's a rare corner case, but I feel better that it's more deterministic now.

> The early patches in the set include some additional comment cleanup as
> well. 0001 is fairly polished. 0004 could use some variable renaming
> (this patch partitions the tuples into HOT and not HOT and then
> processes them). I was struggling with some of the names here
> (chainmembers and chaincandidates is confusing).

I didn't understand why you wanted to juggle both partitions in the same 
array. So I separated them into two arrays, and called them 'root_items' 
and 'heaponly_items'.

In some micro-benchmarks, the order that the items were processed made a 
measurable difference. So I'm processing the items in the reverse order. 
That roughly matches the order the items are processed in master, as it 
iterates the offsets from high-to-low in the first loop, and low-to-high 
in the second loop.

> The bulk of the combining of pruning and freezing is lumped into 0010.
> 
> I had planned to separate 0010 into 4 separate patches: 1 to execute
> freezing in heap_prune_chain(), 1 for the freeze heuristic approximating
> what is on master, and 1 for emitting a single record containing both
> the pruning and freezing page modifications.
> 
> I ended up not doing this because I felt like the grouping of changes in
> 0007-0009 is off. As long as I still execute freezing in
> lazy_scan_prune(), I have to share lots of state between
> lazy_scan_prune() and heap_page_prune(). This meant I added a lot of
> parameters to heap_page_prune() that later commits removed -- making the
> later patches noisy and not so easy to understand.
> 
> I'm actually not sure what should go in what commit (either for review
> clarity or for the actual final version).
> 
> But, I think we should probably focus on review of the code and not as
> much how it is split up yet.

Yeah, that's fine, 0010 is manageable-sized now.

> The final state of the code could definitely use more cleanup. I've been
> staring at it for awhile, so I could use some thoughts/ideas about what
> part to focus on improving.

Committed some of the changes. I plan to commit at least the first of 
these remaining patches later today. I'm happy with it now, but I'll 
give it a final glance over after dinner.

I'll continue to review the rest after that, but attached is what I have 
now.

-- 
Heikki Linnakangas
Neon (https://neon.tech)

Attachment

pgsql-hackers by date:

Previous
From: "David E. Wheeler"
Date:
Subject: Re: Security lessons from liblzma
Next
From: "Leung, Anthony"
Date:
Subject: Re: Allow non-superuser to cancel superuser tasks.