Re: Emit fewer vacuum records by reaping removable tuples during pruning - Mailing list pgsql-hackers

From Melanie Plageman
Subject Re: Emit fewer vacuum records by reaping removable tuples during pruning
Date
Msg-id CAAKRu_aDrqMCmjQCLyvCYtOrf=ohr7+8fAVhVpCSGPHouH9FRQ@mail.gmail.com
In response to Re: Emit fewer vacuum records by reaping removable tuples during pruning  (Peter Geoghegan <pg@bowt.ie>)
List pgsql-hackers
On Fri, Jan 12, 2024 at 1:07 PM Peter Geoghegan <pg@bowt.ie> wrote:
>
> On Fri, Jan 12, 2024 at 12:33 PM Melanie Plageman
> <melanieplageman@gmail.com> wrote:
> > So, I think this is the logic in master:
> >
> > Prune case, first pass
> >
> > ...
> > - indexes > 0 && (!space_freed || !index_vacuuming) -> update FSM
>
> What is "space_freed"? Isn't that something from your uncommitted patch?
>
> As I said, the aim is to call PageGetHeapFreeSpace() (*not*
> PageGetFreeSpace(), which is only used for index pages) exactly once
> per heap page scanned. This is supposed to happen independently of
> whatever specific work was/will be required for the heap page. In
> general, we don't ever trust that the FSM is already up-to-date.
> Presumably because the FSM isn't crash safe.
>
> On master, prunestate.has_lpdead_items may be set true when our VACUUM
> wasn't actually the thing that performed pruning that freed tuple
> storage -- typically when some other backend was the one that did all
> required pruning at some earlier point in time, often via
> opportunistic pruning. For better or worse, the only thing that VACUUM
> aims to do is make sure that PageGetHeapFreeSpace() gets called
> exactly once per scanned page.
...
> > I still don't understand why vacuum is responsible for updating the
> > FSM per page when no line pointers have been set unused. That is how
> > PageGetFreeSpace() figures out if there is free space, right?
>
> You mean PageGetHeapFreeSpace? Not really. (Though even pruning can
> set line pointers unused, or heap-only tuples.)
>
> Even if pruning doesn't happen in VACUUM, that doesn't mean that the
> FSM is up-to-date.
>
> In short, we do these things with the free space map because it is a
> map of free space (which isn't crash safe) -- nothing more. I happen
> to agree that that general design has a lot of problems, but those
> seem out of scope here.

So, there are 3 issues I am trying to understand:

1) How often should vacuum update the FSM (not vacuum as in the second
pass but vacuum as in the whole thing that is happening in
lazy_scan_heap())?
2) What is the exact logic in master that ensures that vacuum
implements the cadence in 1)?
3) How can the logic in 2) be replicated exactly in my patch that sets
would-be dead items LP_UNUSED during pruning?

From what Peter is saying, I think 1) is decided: once per scanned heap
page (across all passes).
For 2), see my previous email. 3) is TBD until 2) is agreed upon.
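
To make the cadence in 1) concrete, here is a toy model of the invariant
as I understand it. It is not PostgreSQL code: every name (ToyPage,
toy_first_pass, and so on) is hypothetical, and it assumes the second
pass only visits pages that still have LP_DEAD items when index
vacuuming is enabled. It does not model "space_freed" from my patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: these names are hypothetical, not PostgreSQL's. */
typedef struct ToyPage
{
    bool has_lpdead_items;  /* LP_DEAD items remain after pruning */
    int  fsm_updates;       /* times free space was recorded for this page */
} ToyPage;

/* Assumption: the second pass visits a page only under these conditions. */
static bool
toy_second_pass_will_visit(const ToyPage *page, int nindexes,
                           bool do_index_vacuuming)
{
    return nindexes > 0 && do_index_vacuuming && page->has_lpdead_items;
}

/* First pass: record free space now unless the second pass will do it. */
static void
toy_first_pass(ToyPage *page, int nindexes, bool do_index_vacuuming)
{
    if (!toy_second_pass_will_visit(page, nindexes, do_index_vacuuming))
        page->fsm_updates++;
}

/* Second pass: record free space after the LP_DEAD items are reaped. */
static void
toy_second_pass(ToyPage *page, int nindexes, bool do_index_vacuuming)
{
    if (toy_second_pass_will_visit(page, nindexes, do_index_vacuuming))
        page->fsm_updates++;
}

/* Run both passes over one page and return how often the FSM was updated. */
static int
toy_fsm_updates_for_page(bool has_lpdead_items, int nindexes,
                         bool do_index_vacuuming)
{
    ToyPage page = {has_lpdead_items, 0};

    toy_first_pass(&page, nindexes, do_index_vacuuming);
    toy_second_pass(&page, nindexes, do_index_vacuuming);
    return page.fsm_updates;
}

/* Check the invariant for every combination of the three inputs. */
static void
toy_check_all_cases(void)
{
    for (int lpdead = 0; lpdead <= 1; lpdead++)
        for (int nindexes = 0; nindexes <= 1; nindexes++)
            for (int idxvac = 0; idxvac <= 1; idxvac++)
                assert(toy_fsm_updates_for_page(lpdead != 0, nindexes,
                                                idxvac != 0) == 1);
}
```

In this sketch, every combination of inputs gives exactly one update per
page. That is the property I want 3) to preserve, whatever the real
conditions in 2) turn out to be.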

- Melanie


