Re: Combine Prune and Freeze records emitted by vacuum - Mailing list pgsql-hackers

From Melanie Plageman
Subject Re: Combine Prune and Freeze records emitted by vacuum
Date
Msg-id CAAKRu_YFOyMF=p+cV7DzZ=+X_tsREO8vVucEVrV6uMrdSRW8AQ@mail.gmail.com
In response to Re: Combine Prune and Freeze records emitted by vacuum  (Heikki Linnakangas <hlinnaka@iki.fi>)
Responses Re: Combine Prune and Freeze records emitted by vacuum
Re: Combine Prune and Freeze records emitted by vacuum
List pgsql-hackers
On Wed, Mar 27, 2024 at 12:18 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>
> On 27/03/2024 17:18, Melanie Plageman wrote:
> > I need some way to modify the control flow or accounting such that I
> > know which HEAPTUPLE_RECENTLY_DEAD tuples will not be marked LP_DEAD.
> > And a way to consider freezing and do live tuple accounting for these
> > and HEAPTUPLE_LIVE tuples exactly once.
>
> Just a quick update: I've been massaging this some more today, and I
> think I'm onto something palatable. I'll send an updated patch later
> today, but the key is to note that for each item on the page, there is
> one point where we determine the fate of the item, whether it's pruned
> or not. That can happen at different points in heap_page_prune().
> That's also when we set marked[offnum] = true. Whenever that happens, we
> call one of the heap_page_prune_record_*() subroutines. We already
> have those subroutines for when a tuple is marked as dead or unused, but
> let's add similar subroutines for the case that we're leaving the tuple
> unchanged. If we move all the bookkeeping logic to those subroutines, we
> can ensure that it gets done exactly once for each tuple, and at that
> point we know what we are going to do to the tuple, so we can count it
> correctly. So heap_prune_chain() decides what to do with each tuple, and
> ensures that each tuple is marked only once, and the subroutines update
> all the variables, add the item to the correct arrays etc. depending on
> what we're doing with it.

Yes, this would be ideal.
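
To make the exactly-once idea concrete, here is a minimal standalone
sketch. It is not PostgreSQL code: ToyPruneState, MAX_ITEMS and the
toy_*() functions are hypothetical names made up for illustration, and
the counters only stand in for the real bookkeeping. The point is that
every "record" routine goes through one helper that sets
marked[offnum], so counting an item twice trips an assertion.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_ITEMS 16			/* toy page size, an assumption */

/* Hypothetical stand-in for the prune state; not the real struct. */
typedef struct ToyPruneState
{
	bool		marked[MAX_ITEMS + 1];	/* fate decided? 1-based offsets */
	int			ndead;			/* items that would become LP_DEAD */
	int			nunused;		/* items that would become LP_UNUSED */
	int			nunchanged;		/* items left as they are */
} ToyPruneState;

static void
toy_init(ToyPruneState *st)
{
	memset(st, 0, sizeof(*st));
}

/* Single choke point: an item's fate may be recorded only once. */
static void
toy_mark(ToyPruneState *st, int offnum)
{
	assert(offnum >= 1 && offnum <= MAX_ITEMS);
	assert(!st->marked[offnum]);
	st->marked[offnum] = true;
}

static void
toy_record_dead(ToyPruneState *st, int offnum)
{
	toy_mark(st, offnum);
	st->ndead++;
}

static void
toy_record_unused(ToyPruneState *st, int offnum)
{
	toy_mark(st, offnum);
	st->nunused++;
}

/* The new case: the item is left unchanged, but is still accounted for. */
static void
toy_record_unchanged(ToyPruneState *st, int offnum)
{
	toy_mark(st, offnum);
	st->nunchanged++;
}

/* True if every item in 1..nitems has had its fate recorded. */
static bool
toy_all_decided(const ToyPruneState *st, int nitems)
{
	for (int off = 1; off <= nitems; off++)
	{
		if (!st->marked[off])
			return false;
	}
	return true;
}
```

In this toy, the caller decides what happens to each offset and calls
exactly one toy_record_*() function for it; toy_all_decided() can then
serve as a sanity check that no item was skipped.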

I was doing some experimentation with pageinspect today (trying to
find that single place where live tuples' fates are decided), and it
seems like a heap-only tuple that is not HOT-updated will usually be
the one at the end of the chain. That case seems like it would be
covered by adding a record_live()-type function call in the loop of
heap_prune_chain():

        /*
         * If the tuple is not HOT-updated, then we are at the end of this
         * HOT-update chain.
         */
        if (!HeapTupleHeaderIsHotUpdated(htup))
        {
            heap_prune_record_live_or_recently_dead(dp, prstate,
                                                    offnum, presult);
            break;
        }

but that doesn't end up producing the same results as

        if (HeapTupleHeaderIsHeapOnly(htup) &&
            !HeapTupleHeaderIsHotUpdated(htup) &&
            presult->htsv[rootoffnum] == HEAPTUPLE_DEAD)
            heap_prune_record_live_or_recently_dead(dp, prstate,
                                                    offnum, presult);

at the top of heap_prune_chain().

- Melanie


