Re: Remaining case where reltuples can become distorted across multiple VACUUM operations - Mailing list pgsql-hackers

From Matthias van de Meent
Subject Re: Remaining case where reltuples can become distorted across multiple VACUUM operations
Msg-id CAEze2Whac7c5cqo5pzRVrZn1j9t_L-Bz2meD3e+tFPkzx2prJg@mail.gmail.com
In response to Re: Remaining case where reltuples can become distorted across multiple VACUUM operations  (Peter Geoghegan <pg@bowt.ie>)
Responses Re: Remaining case where reltuples can become distorted across multiple VACUUM operations
List pgsql-hackers
On Mon, 8 Aug 2022 at 17:26, Peter Geoghegan <pg@bowt.ie> wrote:
>
> On Mon, Aug 8, 2022 at 8:14 AM Matthias van de Meent
> <boekewurm+postgres@gmail.com> wrote:
> > I do not have intimate knowledge of this code, but shouldn't we also
> > add some safety guarantees like the following in these blocks? Right
> > now, we'll keep underestimating the table size even when we know that
> > the count is incorrect.
> >
> > if (scanned_tuples > old_rel_tuples)
> >     return some_weighted_scanned_tuples;
>
> Not sure what you mean -- we do something very much like that already.
>
> We take the existing tuple density, and assume that that hasn't
> changed for any unscanned pages -- that is used to build a total
> number of tuples for the unscanned pages. Then we add the number of
> live tuples/scanned_tuples that the vacuumlazy.c caller just
> encountered on scanned_pages. That's often where the final reltuples
> value comes from.

Indeed, we often apply this, but not always. It is the default case,
but it is never applied in the special cases, which return the old
reltuples value unchanged.

For example, if the measured 2% of the pages currently contains more
than 100% of the previous tuple count (or, with your patch, if the last
page alone contains more than 100% of the previous tuple count), the
new count is ignored. That seems silly, considering that the vacuum
count is supposed to be authoritative.
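
To make the scenario concrete, here is a minimal, self-contained sketch
of the logic as described in this thread. It is not the PostgreSQL
source: the function name, the with_guard flag, and the rounding helper
are hypothetical, and it assumes the old density is known
(old_rel_pages > 0). Because some_weighted_scanned_tuples is left
unspecified above, the guard simply falls through to the default
density-based estimate, which is only one possible reading.

```c
/* Round a non-negative value to the nearest whole number. */
static double
round_half_up(double x)
{
    return (double) (long long) (x + 0.5);
}

/*
 * Hypothetical sketch of the estimate discussed in this thread.
 * Assumes old_rel_pages > 0, i.e. the old tuple density is known.
 * with_guard != 0 enables the proposed safety check (an assumption).
 */
static double
estimate_reltuples_sketch(double old_rel_pages, double old_rel_tuples,
                          double total_pages, double scanned_pages,
                          double scanned_tuples, int with_guard)
{
    double old_density;
    double unscanned_pages;
    int    guard_fires;

    /* The whole table was scanned, so the count is exact. */
    if (scanned_pages >= total_pages)
        return scanned_tuples;

    /*
     * Proposed guard: the scanned pages alone hold more tuples than the
     * previous estimate, so that estimate is known to be too low.
     */
    guard_fires = with_guard && scanned_tuples > old_rel_tuples;

    if (!guard_fires)
    {
        /* Special case: table size unchanged, under 2% of pages scanned. */
        if (old_rel_pages == total_pages &&
            scanned_pages < total_pages * 0.02)
            return old_rel_tuples;

        /* Special case from the patch: at most one page scanned. */
        if (scanned_pages <= 1)
            return old_rel_tuples;
    }

    /* Default case: old density for unscanned pages, plus the new count. */
    old_density = old_rel_tuples / old_rel_pages;
    unscanned_pages = total_pages - scanned_pages;
    return round_half_up(old_density * unscanned_pages + scanned_tuples);
}
```

With a 1000-page table previously estimated at 500 tuples, scanning 10
pages that hold 600 live tuples returns 500 when with_guard is 0 and
1095 when it is 1.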

Kind regards,

Matthias van de Meent


