On Tue, May 13, 2025 at 05:01:02PM -0700, Hari Krishna Sunder wrote:
> We found a minor issue when testing statistics import with upgrading from
> versions older than v14. (We have VACUUM and ANALYZE disabled)
> 3d351d916b20534f973eda760cde17d96545d4c4
> <https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=3d351d916b20534f973eda760cde17d96545d4c4>
> changed the default value for reltuples from 0 to -1. So when such tables
> are imported they get the pg13 default of 0 which in pg18 is treated
> as "vacuumed and seen to be empty" instead of "never yet vacuumed". The
> planner then proceeds to pick seq scans even if there are indexes for these
> tables.
> This is a very narrow edge case and the next VACUUM or ANALYZE will fix it
> but the perf of these tables immediately after the upgrade is considerably
> affected.
There was a similar report for vacuumdb's new --missing-stats-only option.
We fixed that in commit 9879105 by removing the check for reltuples != 0,
which means that --missing-stats-only will process empty tables.
> Can we instead use -1 if the version is older than 14, and reltuples is 0?
> This will have the unintended consequence of treating a truly empty table
> as "never yet vacuumed", but that should be fine as empty tables are going
> to be fast regardless of the plan picked.
I'm inclined to agree that we should do this. Even if it's much more
likely that 0 means empty versus not-yet-processed, the one-time cost of
processing some empty tables doesn't sound too bad. In any case, since
this only applies to upgrades from <v14, that trade-off should dissipate
over time.
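
For illustration, the proposed rule could be sketched roughly as below.
This is not the patch itself: the function name and the constant are
made up for this example, and the source version is assumed to be in
server_version_num form (140000 for v14).

```c
/*
 * Hypothetical sketch, not the actual patch.  Names are illustrative.
 * Version numbers are assumed to be in server_version_num form.
 */
#define FIRST_VERSION_WITH_MINUS_ONE_DEFAULT 140000	/* v14 */

/*
 * Pick the reltuples value to import into the new cluster.
 *
 * Before v14, reltuples = 0 was the default and therefore ambiguous:
 * it could mean "never yet vacuumed" or "vacuumed and seen to be empty".
 * For such sources, map 0 to -1 so the table is treated as never yet
 * vacuumed.  All other values pass through unchanged.
 */
static float
reltuples_for_import(int source_version_num, float reltuples)
{
	if (source_version_num < FIRST_VERSION_WITH_MINUS_ONE_DEFAULT &&
		reltuples == 0)
		return -1;

	return reltuples;
}
```

With this rule, a table dumped from v13 with reltuples = 0 would be
imported as -1, while the same value coming from v14 or later would
be kept as 0.
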
> PS: This is my first patch, so apologies for any issues with the patch.
It needs a comment, but otherwise it looks generally reasonable to me after
a quick glance.
--
nathan