Re: Decoupling antiwraparound autovacuum from special rules around auto cancellation - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: Decoupling antiwraparound autovacuum from special rules around auto cancellation
Msg-id CAH2-Wz=Z5W6YpZU5=m-Qh-DaYV9k9qK=dbxMSRBhbc75T1ZZQA@mail.gmail.com
In response to Re: Decoupling antiwraparound autovacuum from special rules around auto cancellation  (Andres Freund <andres@anarazel.de>)
Responses Re: Decoupling antiwraparound autovacuum from special rules around auto cancellation
Re: Decoupling antiwraparound autovacuum from special rules around auto cancellation
List pgsql-hackers
On Wed, Jan 18, 2023 at 2:22 PM Andres Freund <andres@anarazel.de> wrote:
> The problem with the change is here:
>
>         /*
>          * Okay, we've covered the corner cases.  The normal calculation is to
>          * convert the old measurement to a density (tuples per page), then
>          * estimate the number of tuples in the unscanned pages using that figure,
>          * and finally add on the number of tuples in the scanned pages.
>          */
>         old_density = old_rel_tuples / old_rel_pages;
>         unscanned_pages = (double) total_pages - (double) scanned_pages;
>         total_tuples = old_density * unscanned_pages + scanned_tuples;
>         return floor(total_tuples + 0.5);

My assumption has always been that vac_estimate_reltuples() is prone
to issues like this because it just doesn't have access to very much
information each time it runs. It can only see the delta between what
VACUUM just saw, and what the last VACUUM (or possibly the last
ANALYZE) saw according to pg_class. You're always going to find
weaknesses in such a model if you go looking for them. With the
right (or wrong) test case, one that runs VACUUM in a way that lets
whatever bias there is accumulate, you can always salami-slice your
way from good information to total nonsense.
It's sort of like the way floating point values can become very
inaccurate through a process that allows many small inaccuracies to
accumulate over time.
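
To make the accumulation concrete, here is a minimal sketch that feeds
the quoted calculation its own output over and over. It is not the real
vac_estimate_reltuples(): the names estimate_reltuples_sketch() and
repeat_vacuum_sketch() are hypothetical, the corner cases handled earlier
in the real function are left out, and the scenario (every VACUUM scans
the same single page, which holds no live tuples) is an assumption chosen
only to expose the bias.

```c
#include <assert.h>

/*
 * Hypothetical stand-in for the quoted tail of vac_estimate_reltuples().
 * Models only the "normal calculation"; the corner cases that the real
 * function checks first are deliberately omitted.
 */
static double
estimate_reltuples_sketch(double old_rel_pages, double old_rel_tuples,
                          double total_pages, double scanned_pages,
                          double scanned_tuples)
{
    double old_density;
    double unscanned_pages;
    double total_tuples;

    assert(old_rel_pages > 0);
    assert(scanned_pages <= total_pages);

    old_density = old_rel_tuples / old_rel_pages;
    unscanned_pages = total_pages - scanned_pages;
    total_tuples = old_density * unscanned_pages + scanned_tuples;

    /*
     * Round to nearest.  The quoted code uses floor(total_tuples + 0.5);
     * the cast gives the same result for the non-negative values used
     * here and keeps the sketch free of a libm dependency.
     */
    return (double) (long long) (total_tuples + 0.5);
}

/*
 * Feed each estimate back in as the next "old" value, the way successive
 * VACUUMs read and then overwrite the tuple count stored in pg_class.
 * Assumption: every run scans one page containing no live tuples, and
 * the table's page count never changes.
 */
static double
repeat_vacuum_sketch(double total_pages, double start_tuples, int nruns)
{
    double reltuples = start_tuples;

    for (int i = 0; i < nruns; i++)
        reltuples = estimate_reltuples_sketch(total_pages, reltuples,
                                              total_pages, 1.0, 0.0);
    return reltuples;
}
```

Under these assumptions each run multiplies the stored estimate by
(total_pages - 1) / total_pages. Starting from 10000 tuples on 100 pages,
one run yields 9900, and a hundred runs leave the estimate at a bit over
a third of its starting value, even though nothing in the unscanned pages
has changed.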

Maybe you're right to be concerned to the degree that you're concerned
-- I'm not sure. I'm just adding what I see as important context.

-- 
Peter Geoghegan


