
From Melanie Plageman
Subject Re: Emit fewer vacuum records by reaping removable tuples during pruning
Date
Msg-id CAAKRu_ZgWXsMrv7RWgVOKesPdLWBoFy=LYSMh3PDGkbURM6dMw@mail.gmail.com
In response to Re: Emit fewer vacuum records by reaping removable tuples during pruning  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Fri, Jan 5, 2024 at 8:59 AM Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Thu, Jan 4, 2024 at 6:03 PM Melanie Plageman
> <melanieplageman@gmail.com> wrote:
> > When a single page is being processed, page pruning happens in
> > heap_page_prune(). Freezing, dead items recording, and visibility
> > checks happen in lazy_scan_prune(). Visibility map updates and
> > freespace map updates happen back in lazy_scan_heap(). Except, if the
> > table has no indexes, in which case, lazy_scan_heap() also invokes
> > lazy_vacuum_heap_page() to set dead line pointers unused and do
> > another separate visibility check and VM update. I maintain that all
> > page-level processing should be done in the page-level processing
> > functions (like lazy_scan_prune()). And lazy_scan_heap() shouldn't be
> > directly responsible for special case page-level processing.
>
> But you can just as easily turn this argument on its head, can't you?
> In general, except for HOT tuples, line pointers are marked dead by
> pruning and unused by vacuum. Here you want to turn it on its head and
> make pruning do what would normally be vacuum's responsibility.

I actually think we are going to want to stop referring to these steps
as pruning and vacuuming. It is confusing because vacuuming refers to
the whole process of doing garbage collection on the table and also to
the specific step of setting dead line pointers unused. If we called
these steps, say, pruning and reaping, that might be clearer.

Vacuuming consists of three phases -- the first pass, index vacuuming,
and the second pass. I don't think we should dictate what happens in
each pass. That is, we shouldn't expect only pruning to happen in the
first pass and only reaping to happen in the second pass. For example,
I think Andres has previously proposed doing another round of pruning
after index vacuuming. The second pass/third phase is distinguished
primarily by being after index vacuuming.

If we think about it this way, that frees us up to set dead line
pointers unused in the first pass when the table has no indexes. For
clarity, I could add a block comment explaining that doing this is an
optimization and not a logical requirement. One way to make this even
clearer would be to set the dead line pointers unused in a separate
loop after heap_prune_chain(), as I proposed upthread.
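
To illustrate the separate-loop idea, here is a minimal standalone
sketch. It is a toy model and not the actual patch or PostgreSQL code:
the ToyItem struct, the TOY_LP_* states, the "removable" flag, and all
toy_* functions are hypothetical names invented for this example. It
only shows pruning marking line pointers dead, followed by an optional
reaping loop that sets them unused when the table has no indexes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy line pointer states. The names echo PostgreSQL's LP_* flags, but
 * this is a standalone model, not the real ItemIdData. */
typedef enum
{
    TOY_LP_UNUSED,
    TOY_LP_NORMAL,
    TOY_LP_DEAD
} ToyLpState;

typedef struct
{
    ToyLpState state;
    bool       removable;   /* assumption: tuple is dead to everyone */
} ToyItem;

/* Pruning step: mark removable tuples' line pointers dead.
 * Returns the number of dead line pointers on the page afterwards. */
size_t
toy_prune_page(ToyItem *items, size_t nitems)
{
    size_t ndead = 0;

    for (size_t i = 0; i < nitems; i++)
    {
        if (items[i].state == TOY_LP_NORMAL && items[i].removable)
            items[i].state = TOY_LP_DEAD;
        if (items[i].state == TOY_LP_DEAD)
            ndead++;
    }
    return ndead;
}

/* Reaping step, kept as a separate loop: set dead line pointers unused.
 * Returns the number of line pointers reaped. */
size_t
toy_reap_page(ToyItem *items, size_t nitems)
{
    size_t nreaped = 0;

    for (size_t i = 0; i < nitems; i++)
    {
        if (items[i].state == TOY_LP_DEAD)
        {
            items[i].state = TOY_LP_UNUSED;
            items[i].removable = false;
            nreaped++;
        }
    }
    return nreaped;
}

/* First pass over one page. Reaping here is an optimization that is only
 * safe in this model when no index can still point at the dead items.
 * Returns the number of dead items left for a later pass. */
size_t
toy_first_pass_page(ToyItem *items, size_t nitems, bool has_indexes)
{
    size_t ndead = toy_prune_page(items, nitems);

    if (!has_indexes)
    {
        size_t nreaped = toy_reap_page(items, nitems);

        assert(nreaped == ndead);
        (void) nreaped;     /* silence unused warning under NDEBUG */
        return 0;
    }
    return ndead;
}
```

Calling toy_first_pass_page() with has_indexes = false leaves no dead
line pointers behind on the page, while has_indexes = true leaves them
dead so that a later pass, after index vacuuming, can set them unused.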

- Melanie


