Re: Surprising dead_tuple_count from pgstattuple - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Surprising dead_tuple_count from pgstattuple
Date
Msg-id 22447.1281485967@sss.pgh.pa.us
Whole thread Raw
In response to Re: Surprising dead_tuple_count from pgstattuple  (Gordon Shannon <gordo169@gmail.com>)
List pgsql-hackers
Gordon Shannon <gordo169@gmail.com> writes:
> - Because my table has no indexes, lazy_scan_heap calls lazy_vacuum_page
> directly for each block, and reports the variable tups_vacuumed ("removed
> 200 row versions in 2 pages").  However, tups_vacuumed is computed without
> counting the 100 LP_DEAD tuples, because per the code comment, it thinks to
> do so would be double-counting. Perhaps the output should say something
> like:

>     removed 300 row versions (200 were recently deleted and 100 were previously
> deleted).

Well, the problem is to tell which LP_DEAD rows are "recently" deleted.
I agree the output leaves something to be desired, but I don't see how
to improve it without tracking a lot more state than we do now.  It's
not clear that it's really worth much effort; how many people look at
VACUUM VERBOSE output at all, let alone closely?

> Whatever the phrasing, I don't know why 200 is the most significant number
> here, and 300 isn't mentioned at all.

IIRC, the reason for choosing to do it that way is that 200, and not
300, is the best indication of the amount of space reclaimed.  LP_DEAD
tuple headers don't take a lot of space.  We're trying to give an idea
of how many "real" tuples got reclaimed during vacuum.

> I would like to respectfully suggest that the vacuum output line "removed xx
> row versions in yy pages" should show the same counts regardless of whether
> or not there's an index on the table.

But in fact the presence of an index does affect the behavior, above and
beyond vacuum's counting or failure to count, because it changes what HOT
updating and HOT pruning will do.  So the initial state that vacuum is
dealing with could well be different.
        regards, tom lane


pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: 8.3 to 8.4 Upgrade issues
Next
From: Mark Kirkwood
Date:
Subject: Re: Measuring execution time