Hi,
On 2023-01-18 13:42:40 -0800, Andres Freund wrote:
> The real point of change appears to be 10->11.
>
> There's a relevant looking difference in the vac_estimate_reltuples call:
> 10:
> /* now we can compute the new value for pg_class.reltuples */
> vacrelstats->new_rel_tuples = vac_estimate_reltuples(onerel, false,
> nblocks,
> vacrelstats->tupcount_pages,
> num_tuples);
>
> 11:
> /* now we can compute the new value for pg_class.reltuples */
> vacrelstats->new_live_tuples = vac_estimate_reltuples(onerel,
> nblocks,
> vacrelstats->tupcount_pages,
> live_tuples);
> which points to:
>
> commit 7c91a0364fcf5d739a09cc87e7adb1d4a33ed112
> Author: Tom Lane <tgl@sss.pgh.pa.us>
> Date: 2018-03-22 15:47:29 -0400
>
> Sync up our various ways of estimating pg_class.reltuples.
The problem with the change is here:
/*
* Okay, we've covered the corner cases. The normal calculation is to
* convert the old measurement to a density (tuples per page), then
* estimate the number of tuples in the unscanned pages using that figure,
* and finally add on the number of tuples in the scanned pages.
*/
old_density = old_rel_tuples / old_rel_pages;
unscanned_pages = (double) total_pages - (double) scanned_pages;
total_tuples = old_density * unscanned_pages + scanned_tuples;
return floor(total_tuples + 0.5);
Because we'll re-scan the pages for not-yet-removable rows in subsequent
vacuums, the next vacuum will process the same pages again. By using
scanned_tuples = live_tuples, we basically remove not-yet-removable tuples
from reltuples, each time.
The commit *did* try to account for that to some degree:
+ /* also compute total number of surviving heap entries */
+ vacrelstats->new_rel_tuples =
+ vacrelstats->new_live_tuples + vacrelstats->new_dead_tuples;
but new_rel_tuples isn't used for pg_class.reltuples or pgstat.
This is pretty nasty. We use reltuples for a lot of things. And while analyze
might fix it sometimes, that won't reliably be the case, particularly when
there are repeated autovacuums due to a longrunning transaction - there's no
cause for auto-analyze to trigger again soon, while autovacuum will go at it
again and again.
Greetings,
Andres Freund