Re: Why does pg_class.reltuples count only live tuples in indexes (after VACUUM runs)? - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: Why does pg_class.reltuples count only live tuples in indexes (after VACUUM runs)?
Date
Msg-id CAH2-WzmfddQ18Su7Zp6nnBnb3CMRT8Stv8Zg=ic1OmZcRt7Q=A@mail.gmail.com
Whole thread Raw
In response to Re: Why does pg_class.reltuples count only live tuples in indexes (after VACUUM runs)?  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Why does pg_class.reltuples count only live tuples in indexes (after VACUUM runs)?  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Mon, Apr 18, 2022 at 12:41 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> If the planner looks at index reltuples at all, it's doing so
> for cost estimation purposes, where the count including dead
> entries is probably the right thing to use.

Then why does heapam_index_build_range_scan do it the other way around?

I think that it probably doesn't matter that much in practice. The
inconsistency should be noted in update_relstats_all_indexes, though.

> If you want to make this cleaner, maybe there's a case for
> splitting reltuples into two columns.  But then index AMs
> would be on the hook to determine how many of their entries
> are live, which is not really an index's concern.

The main concern behind this is that we're using
vacrel->new_rel_tuples for the IndexVacuumInfo.num_heap_tuples value
in amvacuumcleanup (but not in ambulkdelete), which is calculated
towards the end of lazy_scan_heap, like so:

    /*
     * Also compute the total number of surviving heap entries.  In the
     * (unlikely) scenario that new_live_tuples is -1, take it as zero.
     */
    vacrel->new_rel_tuples =
        Max(vacrel->new_live_tuples, 0) + vacrel->recently_dead_tuples +
        vacrel->missed_dead_tuples;

I think that this doesn't really belong here; new_rel_tuples should
only be used for VACUUM VERBOSE/server log output, once we return to
heap_vacuum_rel from lazy_scan_heap. We should use
vacrel->new_live_tuples as our IndexVacuumInfo.num_heap_tuples value
in the amvacuumcleanup path (instead of new_rel_tuples). That way the
rule about IndexVacuumInfo.num_heap_tuples is simple: it's always
taken from pg_class.reltuples (for the heap rel). Either the existing
value, or the new value.

-- 
Peter Geoghegan



pgsql-hackers by date:

Previous
From: Mark Dilger
Date:
Subject: Re: Postgres perl module namespace
Next
From: Simon Riggs
Date:
Subject: Dump/Restore of non-default PKs