Re: Emit fewer vacuum records by reaping removable tuples during pruning - Mailing list pgsql-hackers

From Melanie Plageman
Subject Re: Emit fewer vacuum records by reaping removable tuples during pruning
Date
Msg-id CAAKRu_Z0yGVT-sc-_1Pm9n7oTECLvg1SoJ4iqYQCdU5sY0g7TQ@mail.gmail.com
Whole thread Raw
In response to Re: Emit fewer vacuum records by reaping removable tuples during pruning  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Emit fewer vacuum records by reaping removable tuples during pruning
List pgsql-hackers
On Wed, Jan 24, 2024 at 2:59 PM Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Thu, Jan 18, 2024 at 9:23 PM Melanie Plageman
> <melanieplageman@gmail.com> wrote:
> > I have attached a rebased version of the former 0004 as v11-0001.
>
> This looks correct to me, although I wouldn't mind some more eyes on
> it. However, I think that the comments still need more work.
>
> Specifically:
>
>          /*
>           * Prune, freeze, and count tuples.
>           *
>           * Accumulates details of remaining LP_DEAD line pointers on page in
>           * dead_items array.  This includes LP_DEAD line pointers that we
>           * pruned ourselves, as well as existing LP_DEAD line pointers that
>           * were pruned some time earlier.  Also considers freezing XIDs in the
>           * tuple headers of remaining items with storage. It also determines
>           * if truncating this block is safe.
>           */
> -        lazy_scan_prune(vacrel, buf, blkno, page,
> -                        vmbuffer, all_visible_according_to_vm,
> -                        &has_lpdead_items);
> +        if (got_cleanup_lock)
> +            lazy_scan_prune(vacrel, buf, blkno, page,
> +                            vmbuffer, all_visible_according_to_vm,
> +                            &has_lpdead_items);
>
> I think this comment needs adjusting. Possibly, the preceding calls to
> lazy_scan_noprune() and lazy_scan_new_or_empty() could even use a bit
> better comments, but in those cases, you're basically keeping the same
> code with the same comment, so it's kinda defensible. Here, though,
> you're making the call conditional without any comment update.
>
>          /*
>           * Final steps for block: drop cleanup lock, record free space in the
>           * FSM.
>           *
>           * If we will likely do index vacuuming, wait until
>           * lazy_vacuum_heap_rel() to save free space. This doesn't just save
>           * us some cycles; it also allows us to record any additional free
>           * space that lazy_vacuum_heap_page() will make available in cases
>           * where it's possible to truncate the page's line pointer array.
>           *
> +         * Our goal is to update the freespace map the last time we touch the
> +         * page. If the relation has no indexes, or if index vacuuming is
> +         * disabled, there will be no second heap pass; if this particular
> +         * page has no dead items, the second heap pass will not touch this
> +         * page. So, in those cases, update the FSM now.
> +         *
>           * Note: It's not in fact 100% certain that we really will call
>           * lazy_vacuum_heap_rel() -- lazy_vacuum() might yet opt to skip index
>           * vacuuming (and so must skip heap vacuuming).  This is deemed okay
>           * because it only happens in emergencies, or when there is very
>           * little free space anyway. (Besides, we start recording free space
>           * in the FSM once index vacuuming has been abandoned.)
>           */
>
> I think this comment needs a rewrite, not just sticking the other
> comment in the middle of it. There's some duplication between these
> two comments, and merging it all together should iron that out.
> Personally, I think my comment (which was there before, this commit
> only moves it here) is clearer than what's already here about the
> intent, but it's lacking some details that are captured in the other
> two paragraphs, and we probably don't want to lose those details.
>
> If you'd like, I can try rewriting these comments to my satisfaction
> and you can reverse-review the result. Or you can rewrite them and
> I'll re-review the result. But I think this needs to be a little less
> mechanical. It's not just about shuffling all the comments around so
> that all the text ends up somewhere -- we also need to consider the
> degree to which the meaning becomes duplicated when it all gets merged
> together.

I will take a stab at rewriting the comments myself first. Usually, I
try to avoid changing comments if the code isn't functionally
different because I know it adds additional review overhead and I try
to reduce that to an absolute minimum. However, I see what you are
saying and agree that it would be better to have actually good
comments instead of frankenstein comments made up of parts that were
previously considered acceptable. I'll have a new version ready by
tomorrow.

- Melanie



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Patch: Improve Boolean Predicate JSON Path Docs
Next
From: Sergey Prokhorenko
Date:
Subject: Re: UUID v7