Re: decoupling table and index vacuum - Mailing list pgsql-hackers

From Andres Freund
Subject Re: decoupling table and index vacuum
Msg-id 20210422185610.35gjmmxtan2ooyrg@alap3.anarazel.de
In response to Re: decoupling table and index vacuum  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
Hi,

On 2021-04-22 14:47:14 -0400, Robert Haas wrote:
> On Thu, Apr 22, 2021 at 10:28 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > Right. Given decoupling index vacuuming, I think the index’s garbage
> > statistics are important which preferably need to be fetchable without
> > accessing indexes. It would be not hard to estimate how many index
> > tuples might be able to be deleted by looking at the dead TID fork but
> > it doesn’t necessarily match the actual number.
> 
> Right, and to appeal (I think) to Peter's quantitative vs. qualitative
> principle, it could be way off. Like, we could have a billion dead
> TIDs and in one index the number of index entries that need to be
> cleaned out could be 1 billion and in another index it could be zero
> (0). We know how much data we will need to scan because we can fstat()
> the index, but there seems to be no easy way to estimate how many of
> those pages we'll need to dirty, because we don't know how successful
> previous opportunistic cleanup has been.

That aspect seems reasonably easy to fix: We can start to report the
number of opportunistically deleted index entries to pgstat. At least
nbtree already performs the actual deletion in bulk and we already have
(currently unused) space in the pgstat entries for it, so I don't think
it'd meaningfully increase overhead. And it'd significantly improve insight
into how indexes operate, even leaving autovacuum etc. aside.
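
To make the idea concrete, here is a rough sketch of the kind of counter this would involve. Everything in it is hypothetical: the struct, its field, and both function names are made up for illustration and are not the actual pgstat layout or API. The estimate is deliberately crude. It assumes the counter was reset when the current set of dead TIDs started accumulating, and that every opportunistically deleted entry pointed at one of those TIDs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-index counters; NOT PostgreSQL's real pgstat entry. */
typedef struct IndexGarbageStats
{
    /* index entries removed by opportunistic (non-VACUUM) cleanup */
    uint64_t opportunistic_deleted;
} IndexGarbageStats;

/*
 * Hypothetical reporting hook: called once per bulk deletion, so the
 * cost is one addition per batch rather than one per index entry.
 */
static void
report_opportunistic_deletion(IndexGarbageStats *stats, uint64_t ndeleted)
{
    assert(stats != NULL);
    stats->opportunistic_deleted += ndeleted;
}

/*
 * Crude upper bound on the index entries a vacuum pass would still have
 * to remove, given the number of dead TIDs recorded for the table.
 * Assumption: each opportunistic deletion removed an entry for one of
 * those dead TIDs, and the counter was reset along with the TID set.
 */
static uint64_t
estimate_remaining_dead_entries(const IndexGarbageStats *stats,
                                uint64_t dead_tids)
{
    assert(stats != NULL);
    if (stats->opportunistic_deleted >= dead_tids)
        return 0;
    return dead_tids - stats->opportunistic_deleted;
}
```

With a counter like this, an index whose opportunistic cleanup kept pace with the dead TIDs would show an estimate near zero, while one that never did any cleanup would show the full dead TID count, which is the distinction that cannot be made from the dead TID fork alone.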

Greetings,

Andres Freund


