Re: Emit fewer vacuum records by reaping removable tuples during pruning - Mailing list pgsql-hackers

From	Andres Freund
Subject	Re: Emit fewer vacuum records by reaping removable tuples during pruning
Date	January 5, 2024 20:57:30
Msg-id	20240105205730.bk26eet2uayk4flv@awork3.anarazel.de
In response to	Re: Emit fewer vacuum records by reaping removable tuples during pruning (Robert Haas <robertmhaas@gmail.com>)
Responses	Re: Emit fewer vacuum records by reaping removable tuples during pruning Re: Emit fewer vacuum records by reaping removable tuples during pruning
List	pgsql-hackers

Hi,

On 2024-01-05 15:23:12 -0500, Robert Haas wrote:
> On Fri, Jan 5, 2024 at 3:05 PM Andres Freund <andres@anarazel.de> wrote:
> > An aside:
> >
> > As I think we chatted about before, I eventually would like the option to
> > remove index entries for a tuple during on-access pruning, for OLTP
> > workloads. I.e. before removing the tuple, construct the corresponding index
> > tuple, use it to look up index entries pointing to the tuple. If all the index
> > entries were found (they might not be, if they already were marked dead during
> > a lookup, or if an expression wasn't actually immutable), we can prune without
> > the full index scan.  Obviously this would only be suitable for some
> > workloads, but it could be quite beneficial when you have huge indexes.  The
> > reason I mention this is that then we'd have another source of marking items
> > unused during pruning.
>
> I will be astonished if you can make this work well enough to avoid
> huge regressions in plausible cases. There are plenty of cases where
> we do a very thorough job opportunistically removing index tuples.

These days the AM is often involved with that, via
table_index_delete_tuples()/heap_index_delete_tuples(). IIRC that has to
happen before physically removing the already-marked-killed index entries. We
can't rely on being able to actually prune the heap page at that point, because
other backends might have it pinned, but often we will be able to. If we were
to prune below heap_index_delete_tuples(), we wouldn't need to recheck that
index during "individual tuple pruning" if the to-be-marked-unused heap
tuple is one of the tuples passed to heap_index_delete_tuples(), which
presumably will very commonly be the case.

At least for nbtree, we are much more aggressive about marking index entries
as killed than about actually removing them. "Individual tuple pruning" would
have to look for killed-but-still-present index entries, not just for "live"
entries.
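
To illustrate that last point, here is a toy model in Python. It is only a
sketch, not PostgreSQL code: ToyIndex, IndexEntry, try_individual_prune() and
the include_killed flag are all made up for illustration, and an index is
reduced to a dict from key to entries. The idea it models is that a heap tuple
may only be marked unused if an entry pointing to it was found, and removed,
in every index; otherwise it has to be left for the regular index vacuum.

```python
from dataclasses import dataclass, field


@dataclass
class IndexEntry:
    tid: tuple            # (block, offset) of the heap tuple pointed to
    killed: bool = False  # hinted dead by a lookup, but still physically present


@dataclass
class ToyIndex:
    # Hypothetical stand-in for an index: key -> list of IndexEntry
    entries: dict = field(default_factory=dict)

    def insert(self, key, tid, killed=False):
        self.entries.setdefault(key, []).append(IndexEntry(tid, killed))

    def find(self, key, tid, include_killed):
        # Return the entry for (key, tid), optionally ignoring killed entries.
        for entry in self.entries.get(key, []):
            if entry.tid == tid and (include_killed or not entry.killed):
                return entry
        return None

    def remove(self, key, entry):
        self.entries[key].remove(entry)


def try_individual_prune(indexes, keys, tid, include_killed=True):
    """Return True if every index had an entry for tid and all were removed.

    On False nothing is removed, and the heap tuple cannot be marked unused.
    """
    found = []
    for index, key in zip(indexes, keys):
        entry = index.find(key, tid, include_killed)
        if entry is None:
            # Some index might still point at the tuple; leave it to vacuum.
            return False
        found.append((index, key, entry))
    for index, key, entry in found:
        index.remove(key, entry)
    return True


pk, by_name = ToyIndex(), ToyIndex()
tid = (7, 3)
pk.insert(42, tid)
by_name.insert("bob", tid, killed=True)  # killed, but not yet removed

# Looking only for "live" entries misses the killed one, so pruning fails.
print(try_individual_prune([pk, by_name], [42, "bob"], tid, include_killed=False))  # False
# Also matching killed-but-still-present entries lets the prune go through.
print(try_individual_prune([pk, by_name], [42, "bob"], tid))  # True
```

In this model, a lookup that skips killed entries can never confirm that all
index entries were found, which is why the search would need to consider them.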

Greetings,

Andres Freund
