Re: Emit fewer vacuum records by reaping removable tuples during pruning - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: Emit fewer vacuum records by reaping removable tuples during pruning
Msg-id CAH2-Wzma8oAWF6DmzTJCr-fMgH4K=V0GWce97zO3tnwCauT1AA@mail.gmail.com
In response to Re: Emit fewer vacuum records by reaping removable tuples during pruning  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Thu, Jan 18, 2024 at 10:43 AM Robert Haas <robertmhaas@gmail.com> wrote:
> I think we're agreeing but I want to be sure. If we only set LP_DEAD
> items to LP_UNUSED, that frees no space. But if doing so allows us to
> truncate the line pointer array, then that frees a little bit of
> space. Right?

That's part of it, yes.

> One problem with using this as a justification for the status quo is
> that truncating the line pointer array is a relatively recent
> behavior. It's certainly much newer than the choice to have VACUUM
> touch the FSM in the second pass rather than the first pass.

True. But the way that PageGetHeapFreeSpace() returns 0 for a page
with 291 LP_DEAD stubs is a much older behavior. When that happens it
is literally true that the page has lots of free space. And yet it's
not free space we can actually use. Not until those LP_DEAD items are
marked LP_UNUSED.
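
To make the 291 figure concrete, here is a minimal sketch of the rule being described. It is a toy model, not the code in bufpage.c: the function names are hypothetical, and the constants assume a default build (8192-byte pages, a 24-byte page header, a 24-byte aligned heap tuple header, and 4-byte line pointers). Under those assumptions the MaxHeapTuplesPerPage formula works out to 291, and a page whose line pointer array is that full, with no LP_UNUSED slot to reuse, is reported as having no usable space at all.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed constants for a default build (8 kB pages, 8-byte MAXALIGN). */
#define PAGE_HEADER_SIZE   24   /* SizeOfPageHeaderData */
#define TUPLE_HEADER_SIZE  24   /* MAXALIGN(SizeofHeapTupleHeader) */
#define LINE_POINTER_SIZE  4    /* sizeof(ItemIdData) */

/* Mirrors the MaxHeapTuplesPerPage formula; hypothetical helper name. */
static size_t
max_heap_tuples_per_page(size_t block_size)
{
    assert(block_size > PAGE_HEADER_SIZE);
    return (block_size - PAGE_HEADER_SIZE) /
        (TUPLE_HEADER_SIZE + LINE_POINTER_SIZE);
}

/*
 * Toy model of the usable-space rule: the raw free space is reported
 * as 0 when the line pointer array is already at its maximum length
 * and none of its slots is LP_UNUSED (so no new tuple can be added).
 */
static size_t
usable_heap_free_space(size_t block_size, size_t raw_free_space,
                       size_t n_line_pointers, size_t n_unused)
{
    if (raw_free_space > 0 &&
        n_line_pointers >= max_heap_tuples_per_page(block_size) &&
        n_unused == 0)
        return 0;
    return raw_free_space;
}
```

In this model, a page with 291 LP_DEAD stubs and roughly 7000 bytes of raw free space reports 0. Once the same 291 slots count as unused, the full 7000 bytes are reported again.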

> Another problem is that the amount of space that we're freeing up in
> the second pass is really quite minimal even when it's >0. Any tuple
> that actually contains any data at all is at least 32 bytes, and most
> of them are quite a bit larger. Item pointers are 2 bytes. To save
> enough space to fit even one additional tuple, we'd have to free *at
> least* 16 line pointers. That's going to be really rare.

I basically agree with this. I would still worry about the "291
LP_DEAD stubs makes PageGetHeapFreeSpace return 0" thing specifically,
though. It's sort of a special case.

> And even if it happens, is it even useful to advertise that free
> space? Do we want to cram one more tuple into a page that has a
> history of extremely heavy updates? Could it be that it's smarter to
> just forget about that free space?

I think so, yes.

Another big source of inaccuracies here is that we don't credit
RECENTLY_DEAD tuple space with being free space. Maybe that isn't a
huge problem, but it makes it even harder to believe that precision in
FSM accounting is an intrinsic good.

> > You'd likely prefer a simpler argument for doing this -- an argument
> > that doesn't require abandoning/discrediting the idea that a high
> > degree of FSM_CATEGORIES-wise precision is a valuable thing. Not sure
> > that that's possible -- the current design is at least correct on its
> > own terms. And what you propose to do will probably be less correct on
> > those same terms, silly though they are.
>
> I've never really understood why you think that the number of
> FSM_CATEGORIES is the problem. I believe I recall you endorsing a
> system where pages are open or closed, to try to achieve temporal
> locality of data.

My remarks about "FSM_CATEGORIES-wise precision" were basically
remarks about the fundamental problem with the free space map, which
is really that it's just a map of free space, one that gives exactly zero
thought to various high-level things that *obviously* matter. I wasn't
particularly planning on getting into the specifics of that with you
now, on this thread.

A brief recap might be useful: other systems with a heap table AM free
space management structure typically represent the free space
available on each page using a far more coarse grained counter.
Usually one with fewer than 10 distinct increments. The immediate
problem with FSM_CATEGORIES having such a fine granularity is that it
increases contention/competition among backends that need to find some
free space for a new tuple. They'll all diligently try to find the
page with the least free space that still satisfies their immediate
needs -- there is no thought for the second-order effects, which are
really important in practice.
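
The effect of granularity can be sketched with a simplified model of mapping free space to categories. This is an illustration only, not the code in freespace.c: the function names are hypothetical, the page size is assumed to be the default 8192 bytes, and the special handling of the top category is left out. Available space is rounded down and requested space is rounded up, so that any page in a matching category is guaranteed to fit the request.

```c
#include <assert.h>
#include <stddef.h>

#define BLOCK_SIZE 8192     /* default BLCKSZ, assumed */

/* Round available space DOWN to a category, so a page is never oversold. */
static unsigned
space_avail_to_cat(size_t avail, unsigned n_categories)
{
    size_t step;
    size_t cat;

    assert(n_categories > 0 && n_categories <= BLOCK_SIZE);
    step = BLOCK_SIZE / n_categories;
    cat = avail / step;
    return (unsigned) (cat < n_categories ? cat : n_categories - 1);
}

/* Round a request UP, so every page in that category can satisfy it. */
static unsigned
space_needed_to_cat(size_t needed, unsigned n_categories)
{
    size_t step;
    size_t cat;

    assert(n_categories > 0 && n_categories <= BLOCK_SIZE);
    step = BLOCK_SIZE / n_categories;
    cat = (needed + step - 1) / step;
    return (unsigned) (cat < n_categories ? cat : n_categories - 1);
}
```

With 256 categories (32-byte steps), pages with 2100 and 3000 bytes free land in categories 65 and 93, so a search for a 2000-byte request (category 63) can tell them apart and every backend can converge on the tighter fit. With 8 categories (1024-byte steps), both pages and the request all fall into category 2, so the two pages look equally suitable.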

> But all of that is just an argument that reducing the number of
> FSM_CATEGORIES is *acceptable*; it doesn't amount to an argument that
> it's better. My current belief is that it isn't better, just a vehicle
> to do something else that maybe is better, like squeezing open/closed
> tracking or similar into the existing bit space. My understanding is
> that you think it would be better on its own terms, but I have not yet
> been able to grasp why that would be so.

I'm not really arguing that reducing FSM_CATEGORIES and changing
nothing else would be better on its own (it might be, but that's not
what I meant to convey).

What I really wanted to convey is this: if you're going to go the
route of ignoring LP_DEAD free space during vacuuming, you're
conceding that having a high degree of precision about available free
space isn't actually useful (or wouldn't be useful even if it were
actually possible). Which is something that I generally agree with. I'd
just like it to be clear that you/Melanie are in fact taking one small
step in that direction. We don't need to discuss possible later steps
beyond that first step. Not right now.

--
Peter Geoghegan


