Surprising dead_tuple_count from pgstattuple - Mailing list pgsql-hackers

From Gordon Shannon
Subject Surprising dead_tuple_count from pgstattuple
Date
Msg-id 1281119470961-2266955.post@n5.nabble.com
Whole thread Raw
Responses Re: Surprising dead_tuple_count from pgstattuple
List pgsql-hackers
This is an expansion of the question I posed in this thread:

http://postgresql.1045698.n5.nabble.com/Need-help-understanding-vacuum-verbose-output-tp2265895p2266912.html

I am framing the question here in relation to pgstattuple.  Running 8.4.4 on
Centos.

I have a table T with 5,063,463 rows.   It was just restored from a backup,
and there is no other activity in this database.  I ran a vacuum.

pg_stat_user_tables.n_dead_tup (which is really
pg_stat_get_dead_tuples('T'::regclass::oid)) says 0
pgstattuple says dead_tuple_count=0, free_space=1,355,152

1. I delete 10,000 rows.

pg_stat_user_tables.n_live_tup -> 5053463
pg_stat_user_tables.n_dead_tup -> 10000
pgstattuple.dead_tuple_count -> 10000
pgstattuple.free_space -> 1355152

So far, so good. pgstattuple is counting the dead tuples, and not including
those tuples in the free space count.

2. I delete 15,000 more rows.

pg_stat_user_tables.n_live_tup -> 5038463
pg_stat_user_tables.n_dead_tup -> 25000
pgstattuple.dead_tuple_count -> 15000 ??
pgstattuple.free_space -> 1996904 ??

pgstattuple now appears to count the earlier 10K deleted tuples as no longer
dead, but free space.

3. I delete 50,000 more rows.

pg_stat_user_tables.n_live_tup -> 4988463
pg_stat_user_tables.n_dead_tup -> 75000
pgstattuple.dead_tuple_count -> 50022 ??
pgstattuple.free_space -> 2966628 ??

Same thing, pgstattuple appears to "see" only the most recent delete
transaction (but off by 22), and count the prior ones as free.

4. vacuum verbose

vacuum verbose t;
INFO:  vacuuming "public.t"
INFO:  scanned index "t_pkey" to remove 75000 row versions
DETAIL:  CPU 0.01s/0.38u sec elapsed 0.40 sec.
INFO:  "t": removed 75000 row versions in 637 pages
DETAIL:  CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  index "t_pkey" now contains 4988463 row versions in 13886 pages
DETAIL:  75000 index row versions were removed.
204 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  "t": found 50022 removable, 3696 nonremovable row versions in 668 out
of 51958 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.01s/0.39u sec elapsed 0.40 sec.
VACUUM
Time: 482.771 ms

It seems relevant that vacuum reports the same incorrect number -- 50022 --
as part of its output.  That makes me think that pgstattuple may be using
similar logic to get its dead tuple count.
I wonder if the key to this is that pgstattuple uses
HeapTupleSatisfiesVisibility() to test for deadness.  If so, why would this
call return apparently false positives?  

I know that pgstattuple is meant to be used for debugging only. I have found
pgstatindex to be very helpful in identifying bloat in my indexes.  
Per Tom in the other thread, I now understand that the "found 50022
removable, 3696 nonremovable...." line is referring to the subset of pages
that it scanned looking for dead tuples.

I keep coming back to this, though -- 50,022 seems to be just wrong, or
perhaps simply misleading -- i.e. way too low. 
It's present in the output of vacuum, and the output of pgstattuple.
I'd like to understand what meaning this number has, and, ideally, how I can
use to to detect things like bloat or fragmentation.

Thanks!

-- 
View this message in context:
http://postgresql.1045698.n5.nabble.com/Surprising-dead-tuple-count-from-pgstattuple-tp2266955p2266955.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.


pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: including backend ID in relpath of temp rels - updated patch
Next
From: Tom Lane
Date:
Subject: Re: Initial review of xslt with no limits patch