On Fri, Mar 29, 2024 at 12:32:21PM -0400, Melanie Plageman wrote:
> On Fri, Mar 29, 2024 at 12:00 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> >
> > On 29/03/2024 07:04, Melanie Plageman wrote:
> > > On Thu, Mar 28, 2024 at 11:07:10AM -0400, Melanie Plageman wrote:
> > >> These comments could use another pass. I had added some extra
> > >> (probably redundant) content when I thought I was refactoring it a
> > >> certain way and then changed my mind.
> > >>
> > >> Attached is a diff with some ideas I had for a bit of code simplification.
> > >>
> > >> Are you working on cleaning this patch up or should I pick it up?
> > >
> > > Attached v9 is rebased over master. But, more importantly, I took
> > > another pass at heap_prune_chain() and am pretty happy with what I came
> > > up with. See 0021. I simplified the traversal logic and then grouped the
> > > chain processing into three branches at the end. I find it much easier
> > > to understand what we are doing for different types of HOT chains.
> >
> > Ah yes, agreed, that's nicer.
> >
> > The 'survivor' variable is a little confusing, especially here:
> >
> > if (!survivor)
> > {
> > int i;
> >
> > /*
> > * If no DEAD tuple was found, and the root is redirected, mark it as
> > * such.
> > */
> > if ((i = ItemIdIsRedirected(rootlp)))
> > heap_prune_record_unchanged_lp_redirect(prstate, rootoffnum);
> >
> > /* the rest of tuples in the chain are normal, unchanged tuples */
> > for (; i < nchain; i++)
> > heap_prune_record_unchanged(dp, prstate, chainitems[i]);
> > }
> >
> > You would think that "if(!survivor)", it means that there is no live
> > tuples on the chain, i.e. no survivors. But in fact it's the opposite;
> > it means that the are all live. Maybe call it 'ndeadchain' instead,
> > meaning the number of dead items in the chain.
>
> Makes sense.
I've done this in attached v10.
> > Here's another idea: In the first loop through the offsets, where we
> > gather the HTSV status of each item, also collect the offsets of all HOT
> > and non-HOT items to two separate arrays. Call heap_prune_chain() for
> > all the non-HOT items first, and then process any remaining HOT tuples
> > that haven't been marked yet.
>
> That's an interesting idea. I'll try it out and see how it works.
Attached v10 implements this method of dividing tuples into HOT and
non-HOT and processing the potential HOT chains first then processing
tuples not marked by calling heap_prune_chain().
I have applied the refactoring of heap_prune_chain() to master and then
built the other patches on top of that.
I discovered while writing this that LP_DEAD item offsets must be in
order in the deadoffsets array (the one that is used to populate
LVRelState->dead_items).
When I changed heap_page_prune_and_freeze() to partition the offsets
into HOT and non-HOT during the first loop through the item pointers
array (where we get tuple visibility information), we add dead item
offsets as they are encountered. So, they are no longer in order. I've
added a quicksort of the deadoffsets array to satisfy vacuum.
I think that we are actually successfully removing more RECENTLY_DEAD
HOT tuples than in master with heap_page_prune()'s new approach, and I
think it is correct; but let me know if I am missing something.
The early patches in the set include some additional comment cleanup as
well. 0001 is fairly polished. 0004 could use some variable renaming
(this patch partitions the tuples into HOT and not HOT and then
processes them). I was struggling with some of the names here
(chainmembers and chaincandidates is confusing).
The bulk of the combining of pruning and freezing is lumped into 0010.
I had planned to separate 0010 into 4 separate patches: 1 to execute
freezing in heap_prune_chain(), 1 for the freeze heuristic approximating
what is on master, and 1 for emitting a single record containing both
the pruning and freezing page modifications.
I ended up not doing this because I felt like the grouping of changes in
0007-0009 is off. As long as I still execute freezing in
lazy_scan_prune(), I have to share lots of state between
lazy_scan_prune() and heap_page_prune(). This meant I added a lot of
parameters to heap_page_prune() that later commits removed -- making the
later patches noisy and not so easy to understand.
I'm actually not sure what should go in what commit (either for review
clarity or for the actual final version).
But, I think we should probably focus on review of the code and not as
much how it is split up yet.
The final state of the code could definitely use more cleanup. I've been
staring at it for awhile, so I could use some thoughts/ideas about what
part to focus on improving.
- Melanie